Five Source Capital partner Meng Xing recently published a Silicon Valley field report, advancing a judgment that made even him change how he takes notes: Silicon Valley is entering a stage where even people who know how to ride waves are getting drowned by them. AI’s iteration speed has moved from “monthly” to “weekly”; even Silicon Valley can’t keep up with itself.
When AI amplifies a team’s productivity fivefold, you can cut 80% of headcount and maintain the same output, or keep everyone and do five times the work. Meng Xing’s observations are essentially a draft answer given on the ground: when a 100x efficiency gain doesn’t translate into 100x revenue, when token budgets approach labor costs, and when the steam engine can’t yet outrun the horse-drawn carriage but no one dares to stop, Silicon Valley is choosing to push speed first. But whether this path ultimately leads to expanded capability or compressed costs has not been decided.
YC shifting from leading indicators to lagging indicators
In March this year, Meng Xing sat in the audience at YC’s W26 Demo Day and put down his notebook by the fifth company’s pitch. The reason: of the 100-plus companies in the batch, roughly 80% were building vertical agents, for example helping lawyers organize documents, routing customer-service tickets, or screening resumes for HR.
Had he seen these ideas last October, he would have thought, “Pretty interesting.” But after Claude Code turned from a developer tool into an interface almost anyone can use, and after Opus 4.6 dropped the threshold for vibe coding to the floor, an ordinary engineer can replicate these vertical agents in a single weekend, long before any real business moat is established.
YC’s batch system, from application and screening through the program, polish, and Demo Day, was designed for a slower world. At AI’s current iteration speed, five months is enough for several paradigm shifts. As Meng Xing puts it: YC is gradually turning from a leading indicator into a lagging indicator.
Meta writes code with a competitor’s product
The biggest shock of Meng Xing’s Silicon Valley trip: all of Meta is using Claude Code. For a company with a market capitalization in the trillions, having tens of thousands of engineers run their own codebase through a competitor’s API would have been unthinkable half a year ago.
Meta had earlier rolled out a tool called myclaw to address code-security concerns, but “it wasn’t good, and nobody used it.” In the end the company simply loosened policy: as long as no customer data is involved, employees may use Claude Code freely, and Meta now holds internal meetings and trainings on “how to become an AI-native organization.”
Google, citing security, largely bars employees from using competitors’ tools, but DeepMind is the exception: several teams responsible for Gemini and internal applications use Claude Code. Google’s own internal coding tool, Antigravity, claims that about 50% of new code is now written by AI, yet even that hasn’t overridden DeepMind’s preference.
One key factor: Anthropic offers private deployment, and a large share of Anthropic’s inference and training already runs on Google Cloud TPUs, so an underlying foundation of trust exists that other big companies lack. They are genuinely setting code security aside for now and pushing speed first.
When an engineer’s tokens cost more than the engineer
At several AI-native startups Meng Xing visited in Palo Alto, the annual token budget for a single engineer runs to a few hundred thousand USD, close to an engineer’s annual salary. It looks as if companies are cutting headcount with AI to save money, but total costs may not drop at all; they are simply swapping labor costs for token costs.
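The claim that costs are being swapped rather than saved is easy to see with illustrative numbers (the $300k figures below are assumptions for the sketch, not from the report):

```python
salary = 300_000               # assumed annual salary, USD
tokens_per_engineer = 300_000  # assumed: "close to one engineer's annual salary"

# Before: 10 engineers, no token budget
cost_before = 10 * salary                        # $3.0M, all labor

# After: headcount halved, but each remaining engineer
# burns roughly a salary's worth of tokens per year
cost_after = 5 * (salary + tokens_per_engineer)  # $3.0M again

print(cost_before == cost_after)  # prints True
```

Under these assumptions, halving the team changes the cost structure, not the cost.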
Meta takes this to an extreme: an internal token-consumption leaderboard, where heavy users rank high and those at the bottom risk layoffs. Employees compete for an unofficial title, “token legend.” Yet over the same period Meta ran two successive rounds of layoffs totaling tens of thousands of people. Burning tokens while cutting staff isn’t a contradiction; it’s two sides of the same trade.
Meng Xing also visited a Series C company in person. The technical lead opened Slack to show him: everything was agents running. Behind the scenes, a dozen-plus Cursor agents ran in parallel, with another Claude Code window open for orchestration. The most common anxiety among engineers there: going to bed not knowing what your dozen-plus agents will do overnight.
100x efficiency, without 100x revenue
Many CTOs excitedly tell Meng Xing, “What once took 60 people a year can now be finished by 2 people plus Claude Code in a week”: the so-called “100x engineer,” the “10x efficiency gain.”
But once the excitement faded, Meng Xing asked one question: fine, if efficiency rose 100x, did the company’s revenue grow 100x? Did product lines expand 100x? He got no positive answer. The fact is, when a 100x productivity gain lands on revenue, it often shows up as only 50% or 1x growth. Where the gap goes, no one can yet explain.
“With this many tokens being burned, the company should be mutating into a different kind of company, but into what, I don’t know,” one founder told him. Even Anthropic has scenarios it can’t keep up with. Meng Xing asked a friend at Anthropic, “What’s the most painful scenario when you use agents yourselves?” The answer: real-time on-call response.
When API responses slow down, inference nodes go down, or users report abnormal outputs, on-call engineers must quickly determine whether the issue is a code bug, a compute-allocation problem, or the model itself. Anthropic is the strongest coding-agent company in the world, and this scenario sits as close to its core capability as any could, yet its internal on-call agents are still not good enough to be usable.
The steam engine can’t yet outrun the horse-drawn carriage, yet no one dares to stop
Meng Xing describes the current state: the steam engine has been invented, but sometimes it still doesn’t run faster than the horse-drawn carriage. The key is that everyone knows the steam engine will eventually win, so code security stops mattering, token budgets explode, and the leaderboard race heats up. When the steam engine will truly overtake the carriage, nobody knows, but no one dares to stop and wait, because the cost of stopping might exceed the cost of burning tokens in the wrong places.
And token consumption is very likely not growing linearly. Meng Xing cites data from the research institute METR, which tracks the length of tasks, measured by how long they take human experts, that an AI agent can complete with a 50% success rate. In March 2025, Claude 3.7 Sonnet stood at about 50 minutes; by the end of 2025, Claude Opus 4.6 had reached 14.5 hours.
Over the past two years, the doubling cycle of this metric has compressed from 7 months to 4. Once agent reliability climbs another level, token consumption stops being a “grow 50% a year” problem and jumps an order of magnitude overnight. Meng Xing also mentioned a consensus prediction circulating in his WeChat Moments: by the end of this year, many companies, tech giants included, will actually need only 20% of their current headcount.
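Taking the article’s two data points at face value, the implied doubling time can be checked directly. This is a back-of-the-envelope sketch; the roughly nine-month gap between the two measurements is an assumption:

```python
import math

# Two data points quoted for METR's 50%-success task-horizon metric
t0 = 50 / 60   # hours: Claude 3.7 Sonnet, March 2025
t1 = 14.5      # hours: Claude Opus 4.6, end of 2025
months = 9     # assumed gap between the two measurements

doublings = math.log2(t1 / t0)        # how many times the horizon doubled
doubling_time = months / doublings    # implied months per doubling

print(round(doublings, 1), round(doubling_time, 1))  # prints: 4.1 2.2
```

These two points alone imply a doubling time even shorter than the 4-month figure, consistent with the claim that the cycle keeps compressing.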
(Answering a question: when AI boosts your efficiency fivefold, do you cut costs by 80%, or do five times the work?)
In April this year, this writer posed a question in an article: when AI amplifies a team’s productivity fivefold, you can cut 80% of headcount to maintain the original output, or keep headcount and do five times the work. Aaron Levie suggested on an a16z podcast that a company’s agents may one day number 100 to 1,000 times its employees; Jensen Huang said plainly that if the world runs out of new ideas, AI’s productivity gains will simply become unemployment. The problem isn’t AI; it’s whether decision-makers have imagination.
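The fivefold trade-off is simple arithmetic; a minimal sketch, with illustrative numbers:

```python
def headcount_needed(current, productivity_gain, output_target=1.0):
    # People needed to reach output_target x today's output
    # once each person becomes productivity_gain x more productive.
    return current * output_target / productivity_gain

print(headcount_needed(100, 5))     # same output: 20.0 people (an 80% cut)
print(headcount_needed(100, 5, 5))  # 5x output: 100.0 people (keep everyone)
```

Everything between those two endpoints, keeping some people and expanding output somewhat, is also on the table; the question is which end decision-makers steer toward.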
Meng Xing’s Silicon Valley observations are, in essence, a draft answer given on the ground: when a 100x efficiency gain doesn’t translate into 100x revenue, when token budgets close in on labor costs, and when the steam engine can’t yet outrun the horse-drawn carriage but no one dares to stop, Silicon Valley is choosing to push speed first. Whether the path ultimately leads to expanded capability or compressed costs has not been decided.
At the end of his article, Meng Xing leaves a more balanced note: seeing so many “can’t keep up” cases in half a month is genuinely anxiety-inducing, but if AI really can turn cancer into a chronic disease within a few years and pull materials science forward by twenty, then this “can’t keep up” may be the greatest acceleration in the history of human development.
For corporate decision-makers, the real question was never whether AI will replace people. It is what you choose once productivity is amplified 5x, 10x, even 100x: use it to cut more people, or to do more things. That choice is being made right now, in Silicon Valley and in boardrooms across the globe.
This article “Using AI to increase output or reduce costs? 100x efficiency didn’t bring 100x revenue, but no one dares to call a stop in Silicon Valley” was first published on 鏈新聞 ABMedia.