torygreen

Something quietly inverted in AI compute this year, and it changes what the buildout is actually for.
In 2023, 2/3 of AI compute went to training, the work of building a model. The smaller remaining slice went to inference, the work of running the model once it's built. But that ratio has since flipped.
Inference is now 2/3 and still climbing, per Deloitte, and the chips built to run it crossed $50B this year.
The main reason this flip matters isn't the percentages: training and inference are different animals. Training happens in bursts, on one giant cluster, then it's d
Two years ago an open model on this chart would've been near the bottom. The closed labs were generations ahead, and that gap was the whole reason people rented models instead of owning one.
Now GLM-5.2 sits at 51 on the @ArtificialAnlys index.
Open weights, Chinese lab, fifth overall. Knock Fable out of the list, since it's not available, and the open-weights model is far closer to the top than its ranking lets on.
The pitch for closed was always the lead. Pay the API, accept the terms, build on something you don't control, because the model's far enough ahead to be worth it. That lead is
Here's the split in AI compute that not many are reading correctly.
Frontier training is concentrating harder every quarter, thousands of GPUs that have to sit in one place wired together. But training is only 30% of demand in 2026. The other 70% is inference, and running it on a hyperscaler means paying for infrastructure built for the hardest workload to do the easiest one.
On distributed networks, that same inference could run 45-75% cheaper. For anyone sizing an AI infrastructure budget, that gap is the whole story.
Training centralizes by necessity. Inference fragments because paying AW
Been thinking about the recent GLM-5.2 news and the open-weights angle everyone's running with, but they're missing a completely different angle here.
Everyone's focused on the fact that a Chinese lab hit frontier-level performance and open-sourced it, but the part worth sitting with is how. ZAI and the rest of the Chinese labs were cut off from Nvidia in early 2025, so presumably no H100s or H200s have gone directly to them since then.
They crossed $128B on a model trained on probably Chinese silicon that lands within a few points of the frontier.
The export controls were meant to slow China down. Wha
95% of enterprise GPU capacity is sitting idle right now.
That number comes from Cast AI measuring 23,000 real production clusters, not a generic survey.
Average utilization was 5%, and it's happening at the exact moment Nvidia raised H200 prices 15%, the first increase in 20 years. The hardware everyone says is scarce is mostly doing nothing.
If you're trying to figure out why compute feels impossible to get, this is why. Nobody returns an allocation they waited months for. So the fleet sits at 5%, billed by the hour, and the scarcity feeds on itself. That seems like a coordination failure, n
Some big EU AI policy moves are reportedly coming, but here's the infrastructure reality they're working with.
> EU sovereign AI infrastructure spend in 2026: $12.6 billion.
> US hyperscaler capex in the same year: $725 billion.
Europe spent six years building 19 AI Factories and 14 supercomputers, and Amazon alone will outspend that entire effort in two weeks this year.
Most European AI teams don't use European infrastructure. They rent from Virginia and Iowa and pay a GDPR compliance premium on top of the hyperscaler margin for the privilege. New Nvidia hardware reaches EU data centers 3 to 6
I didn't expect this number to show up this year.
GitHub is on pace for 14 billion commits in 2026. That's up from 1 billion in 2025. A 14x increase in a single year, and most of it isn't humans writing code.
The load got so severe that Microsoft, which owns and runs the second largest cloud on earth, had to route traffic through AWS to keep the platform online. Nine service incidents in May alone. Availability dropped to 88.4%.
For every engineering team, infrastructure vendor, and cloud provider still sizing capacity for human-speed development, the baseline just moved by an order of ma
Nvidia’s revenue is the proof that “agentic compute” is not a theory. It is already on the income statement.
$26B four years ago. $215.9B last year. That 8x happened while most AI was still sitting in a chat box waiting for you to ask it a question.
The important part isn’t just the growth. It’s that Nvidia turned its architecture into the non‑negotiable input for almost everyone else’s roadmap. Labs, clouds, enterprises. Different logos on the API, same silicon underneath. Almost every dollar spent on AI infrastructure in this cycle leaked into their stack somewhere.
Now take Jensen’s claim t
I didn't expect Goldman's five year number to be this large.
Five hyperscalers are projected to spend $5.3 trillion on AI infrastructure between 2025 and 2030. In 2022 they spent $162B combined.
This year they're on track for $725B. By 2027 analysts project $1 trillion in a single year.
For anyone building AI products or infrastructure outside these five balance sheets, this trajectory is the most important number in your planning assumptions.
The gap between what they can deploy and what everyone else can access compounds every year this continues.
Everyone predicted AI would take over repetitive admin work first. The data says something different.
Decision-making is now 28% of workplace AI activity. The number one use case isn't automation. It's judgment.
People are using AI to analyze options, weigh tradeoffs, and support conclusions they're responsible for. That shift matters beyond the labor market question.
Judgment-based workloads run continuously, require more context per session, and don't batch efficiently.
The infrastructure requirements for an AI that helps you make decisions all day look nothing like the infrastructure for
Global cloud infrastructure Q1 2026. $129 billion in a single quarter. Growing 35% year over year.
The market is expanding fast but the concentration isn't changing. AWS, Azure, and Google Cloud held roughly the same share two years ago that they hold today, but the absolute gap between them and everyone else is wider in dollar terms than it has ever been.
That's the part the percentage chart doesn't show. The Others slice isn't growing into a real alternative. It's staying proportionally the same while the three hyperscalers add tens of billions in absolute revenue every quarter.
The window f
PJM runs the electric grid across 13 US states and 65 million people. It's the largest competitive wholesale electricity market in the world.
Its capacity market clearing price, the rate that signals whether future power supply can meet demand, has gone from $28.92 per MW-day in 2024 to $329.17 in 2026. That's two auction cycles.
Data center demand has been identified as the primary driver. The 2027/2028 auction just cleared at $333.44, with PJM directly attributing 5,100 MW of the load increase to data centers.
That's not a supply shock or a geopolitical event. It's AI build-out hitting a grid that wasn't design
Two numbers from this chart.
AI API price: down 96% since 2022.
Hyperscaler capex: up 12x in the same window.
Most people see the first number and call it democratization, but nobody is building a strategy around the second one.
That's not a coincidence. That's a structural capture play.
Every AI startup celebrating cheap models is running on compute they don't own, on infrastructure they can't replicate, controlled by three companies.
Sovereign AI starts with sovereign infrastructure. Everything else is just a better priced dependency.
The thing Friday revealed isn't that governments can shut down AI models.
It's that the entire global user base of the world's most capable models sits behind a single operational decision by a single company responding to a single directive. No redundancy or warning.
Three of the largest AI companies currently control 88% of frontier AI access, with one compliance surface for all of it.
What Friday made visible is that when compute and model access sit inside a handful of companies, the entire stack inherits their single point of failure. That's not an argument against centralized AI. Both mode
In 2024, the AI compute map had two superpowers. US at 53.7 GW, China at 31.9 GW.
In 2026, China is at 2.5 GW.
That's a controlled demolition of a nation's AI infrastructure capacity through export policy. No bombs, no sanctions, just chip rules.
What this proves is that compute is now a geopolitical weapon. Any country that doesn't own its infrastructure doesn't want to find out what being on the receiving end of that weapon feels like.
The question isn't whether decentralized compute wins. It's whether it arrives before the next policy decision restructures the map again.
The largest tech IPO of the 2000s was Visa at $28B. The largest of the 2010s was Alibaba at $168B. Roughly 6x per decade.
Now extend the line. OpenAI and Anthropic each sit at $1T even before listing.
If you add up the biggest tech debuts of the last 25 years (Alibaba, Facebook, Uber, Rivian, Snowflake, Palantir, Cerebras, CoreWeave, all of them), you get roughly $800B.
OpenAI + Anthropic alone are worth nearly $2T. Still private. 2.5x bigger than a quarter century of Wall Street's biggest listings, combined.
But the biggest structural difference is that the likes of Visa and Alibaba and all th
For most of history, capital scaled through machines.
Now it scales through cognition.
A startup can wake up with the equivalent of a million analysts, researchers, coders, and strategists running in parallel at near-zero marginal cost.
The AI revolution is unlike any previous technical revolution.
you can't raise on an open charter and treat the open part as optional once the money shows up
the trial started on a question: can you charter a nonprofit, call openai your mission, attract 10 years of mission-driven engineers and donation capital on that promise, then convert to a profit-capped structure and call it an evolution?
elon left openai in 2018. the $130b in damages he's asking for goes to the nonprofit. whatever you think of him as a litigant, the question the case forces into court record is the right one: does a charitable trust have enforceable claims when the founding mission
nvidia is now bigger than japan's entire economy and your AI bill is the reason
every dollar you spend on AI right now runs through one company's chips, on three clouds that resell them at a markup
> ai startups burn ~80% of their raised capital just to rent compute
> i've seen seed-stage teams paying $700k/month to a single chip vendor
> data centers are running at 12-18% capacity while your bill goes up every quarter
the entire industry just agreed to stand in one line and hand money to the same toll booth
there's idle compute in gaming rigs, old mining hardware, and half-empty data centers
the ai-is-overbuilt crowd has never tried to buy an h100 this quarter
spending a week trying to buy h100s right now means: 12-month commit at aws (24/7 utilization locked in before you see a single gpu), gcp waitlist with no eta, lambda and coreweave both sold out, every smaller provider giving you the same answer in different words
hyperscaler construction is measured in years, cpu shortages are stalling the gpus that do exist, and demand continues to grow while the hyperscalers file permits
seed-stage ai teams are spending 70-80% of their runway on compute before a single user touches the pr