Google's and Meta's CEOs have both come out in the last 24 hours agreeing with my AI Arms Race narrative.
There are 4 major players in this game theory problem, and two are now on the record about their strategies.
Since starting at @sequoia, I've been reflecting on the massive CapEx buildout in AI.
For today's GPUs to pay back, $200B of revenue will need to be generated per year of CapEx.
Long-term, this is good for startups. Short-term, it could get messy.
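A back-of-envelope sketch of where a payback number like that can come from. This is illustrative Python, not the thread's own math: the $50B GPU run-rate input, the 50% GPU-share-of-cost assumption, and the 50% end-user-margin assumption are all mine.

```python
# Rough sketch of the payback arithmetic behind a "$200B question"-style
# figure. Assumptions (illustrative, not stated in this thread): GPUs are
# roughly half of total data center cost, and AI end users need ~50%
# gross margins to make the economics work.

def required_ai_revenue(gpu_capex_run_rate_b: float,
                        gpu_share_of_tco: float = 0.5,
                        end_user_margin: float = 0.5) -> float:
    """Annual AI revenue needed to pay back a year of GPU CapEx ($B)."""
    # Gross up GPU spend to total data center cost (energy, buildings, etc.)
    total_dc_cost = gpu_capex_run_rate_b / gpu_share_of_tco
    # Gross up again so end users earn their margin on top of that cost
    return total_dc_cost / (1 - end_user_margin)

print(required_ai_revenue(50))  # $50B of GPUs -> 200.0, i.e. $200B of revenue
```

The point of the sketch is only that each layer of the stack multiplies the revenue requirement; the exact multipliers are assumptions.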
With Character dropping out, the qualifying round of the AI race is now officially over.
Congrats to the finalists: Microsoft/OpenAI, Amazon/Anthropic, Google, Meta and xAI.
The next phase begins: One defined more by the construction worker than the research scientist.
NVDA was asked our $600B question on Q2 call: Where is the customer's customer's revenue?
Their answer:
1. Traditional workloads will move from CPU to GPU
2. ChatGPT and coding AI
3. Meta saves $ using GPUs for algos
4. Countries buying GPUs
Is this enough to justify the hype?
Nvidia just became the most valuable company in the world.
In September, I published AI’s $200B Question.
The question I had then was: “Where is all the AI revenue?”
Below is an update on what’s changed & what hasn’t:
Morning after last night's "AI Super Bowl."
Here's where we stand on AI's $600B question:
(1 & 2) Nvidia: CPU to GPU
(3 & 4) Meta: AI game theory
(5 & 6) Google: Underinvesting vs. overinvesting
(7 & 8) Microsoft: M365 & Github copilots
(9 & 10) Amazon: Data center logistics
Here’s the question now being asked all across the AI ecosystem: Is there a way for someone else to take on the demand risk from AI, while I capture the profits? (1/3)
The next big catalyst in AI is the launch of 100k clusters. Everyone has committed to this path.
If scaling laws hold at 100k, we'll get another leg of CapEx. If not, things will slow down.
Given physical requirements, this is ~1+ years away. We may be in limbo for a while.
Big day in the market. Zuck & Sundar acknowledged how much of AI CapEx is FOMO vs. actual revenue. Public markets are grappling with updating their AI narrative.
Full breakdown here:
New post: What does it take to win in the next phase of AI? Our answer: Steel, servers & power.
Today, 5 companies have qualified for AI's next race: The race for data center scale-up.
The market structure has now crystallized and the gritty execution phase begins.
Imagine you knew for certain that AI was going to be as transformational as the internet, and that you control the only AI company in the world. How fast would you build CapEx? (1/11)
Thanks @HarryStebbings for a great discussion & deep dive on AI's $600B question:
👉 The revenue hole that needs to get filled
👉 The AI arms race and cloud oligopoly
👉 Why overbuilding is good for startups
👉 Steel, servers & power
👉 The need for a cash machine to fund AI
I’ve been fortunate enough to do AI shows with @Sama, @ylecun & @arthurmensch.
But this show today with @DavidCahn6 at @sequoia is the best of all of them:
🧠 AI’s $600BN question
🏃 Why no one will ever train a frontier model on the same data center twice
🏗️ Servers, steel & power
Silicon Valley is obsessed with AGI.
But AGI is not the only path to a better future.
Bringing 1B developers into the software economy could rival AGI in terms of its economic impact – and it’s already happening.
Next post in the $200B/$600B series turns from the revenue side of the equation to the cost side. We look at the data center construction boom and make 5 predictions for how this will affect the energy sector and the economy more broadly. (1/7)
For the last year, people have been thinking "Google is spending $XB on CapEx, they must know something we don't know about the future of AI."
What Sundar confirmed yesterday is they don't know anything the public doesn't already know -- they just have different incentives.
AMD announcement today of the ZT acquisition reflects new reality in the AI supply chain.
Now that the competitive landscape has hardened, everyone is looking at profit margins & demand risk.
ZT improves both margin capture and customer acquisition.
ChatGPT was AI's "Big Bang" in late 2022.
2023 was a frenzy. There was a view that all the hard problems had been solved and that AGI was inevitable.
2024 will be about facing down ambiguity and seeking out hard problems. This will unleash AI's impact.
In part one of this piece, we’ll walk through the tug of war between supply chain players over risk and profit. In part two, we’ll unpack the instability of today’s equilibrium. (3/3)
It has been alarming and heartbreaking to watch the recent events unfold in Israel. The terrorist attacks by Hamas on innocent civilians are reprehensible. We have been in touch with our founders, former partners, and companies with operations in the region to see how we can best
What makes a great founder?
This framework explores the founder journey -- starting with hardcore science & ambition, and breaking down how that can evolve into invention & visionary thinking with the power of intuition.
Today, Big Tech companies have stepped up to alleviate some of this tension. They are acting as risk-absorbers within the system, taking on as much demand risk as they possibly can, and driving the supply chain toward greater and greater CapEx escalation. (2/3)
h/t @amasad @kiwicopple @AntWilson who have greatly influenced my thinking on this topic.
@amasad was one of the first to articulate the 1B goal, and @kiwicopple & @AntWilson had the vision and courage to envision a world where developers can build in a weekend & scale to millions.
Our message to founders today? Focus on the fundamentals.
@roelofbotha spoke with @emilychangtv about thinking long-term, building in a challenging market, our enthusiasm about the European startup ecosystems, AI investments, and more.
With CapEx plans now firmly in place and the competitive landscape set, the new AI era begins. In this new phase of AI, steel, servers and power will replace models, compute and data as the “must-wins” for anyone hoping to pull ahead.
Five companies have arrived at the starting line in this new race toward data center scale-up: Microsoft/OpenAI, Amazon/Anthropic, Google, Meta and xAI. Each has a model that has held up against serious benchmarks, and the necessary capital to proceed.
Up until now, you could fit your training cluster into an existing data center via colocation or retrofit.
This is changing: The next generation of models is aiming for 300k-GPU clusters. To house one of these clusters, you need to build an entirely new data center.
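Rough power arithmetic for why a cluster of that size won't fit in an existing facility. All the figures below are my illustrative assumptions, not numbers from this thread: ~700W per accelerator, ~30% server overhead, and a PUE of ~1.3, against existing colo sites that typically offer tens of megawatts.

```python
# Back-of-envelope power math for a ~300k-GPU cluster (illustrative
# assumptions: ~700W per H100-class GPU, ~30% overhead for CPUs,
# networking and storage, and a facility PUE of ~1.3).

gpus = 300_000
watts_per_gpu = 700      # rough figure for a current-generation accelerator
server_overhead = 1.3    # CPUs, networking, storage on top of the GPUs
pue = 1.3                # cooling and facility overhead

it_load_mw = gpus * watts_per_gpu * server_overhead / 1e6
facility_mw = it_load_mw * pue
print(f"~{facility_mw:.0f} MW")  # hundreds of megawatts of facility power
```

Under these assumptions the cluster needs several hundred megawatts, an order of magnitude beyond a typical existing data center, which is why retrofit and colocation stop being options.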
Many market participants today would have you believe that there is a choice between being an “AI bull” and an “AI bear.” This is a false dichotomy. The CapEx debate is a debate about speed, not about magnitude (3/11)
No matter what you believe, all of this building will be fantastic for startups. Infrastructure providers are bearing most of the risk in AI. This is a subsidy for startups building on top of them. With all this investment, training & inference costs will keep coming down (8/11)
Today’s CapEx will likely yield fruit somewhere between late 2025 and early 2026, at which point we’ll find out if these larger models are intelligent enough to unlock new revenue streams and generate a return on investment.
The race to model parity has been the defining project of the last 12 months in AI.
The next phase in the AI race is going to look different: It will be defined more by physical construction than by scientific discovery.
Morning headlines from both Axios and Tom Wigg today on the “AI CapEx Watch” as everyone is on their toes this week ahead of earnings.
We should have more info soon on Q2 AI CapEx levels, CEO commentary on ROI, and broader game theory dynamics playing out in the market.
I believe the answer is: You would take your time. You’d see how liquid cooling systems perform and alter your data center designs. You’d build new power assets in the right locations, and then build your data centers near fiber optic cables (2/11)
This changes AI in two fundamental ways:
1/ It changes the lead time between models. If before you could train your model in 6 to 12 months, now you need to add 18 to 24 months of construction time.
2/ It changes the source of maximum competitive advantage. In the new era,
Meta Answer: AI Game Theory
Zuck: "There’s a meaningful chance that a lot of the companies are overbuilding now...[this is] a rational decision, because the downside of being behind is that you are out of position for the most important technology.”
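Zuck's logic can be sketched as a toy two-player game. The payoff numbers below are entirely hypothetical; the only point they encode is his stated premise that being "out of position for the most important technology" is far worse than overspending.

```python
# Toy payoff matrix for the CapEx "arms race". Payoff values are
# hypothetical and exist only to illustrate the structure: when falling
# behind is catastrophic, "overbuild" dominates "hold_back" no matter
# what the rival does.

# payoffs[(my_move, rival_move)] = my payoff
payoffs = {
    ("overbuild", "overbuild"): -10,   # everyone overspends somewhat
    ("overbuild", "hold_back"):  50,   # I win the platform outright
    ("hold_back", "overbuild"): -100,  # out of position for the key technology
    ("hold_back", "hold_back"):   0,   # nobody races
}

def best_response(rival_move: str) -> str:
    """Return the move that maximizes my payoff against a given rival move."""
    return max(["overbuild", "hold_back"],
               key=lambda mine: payoffs[(mine, rival_move)])

# Overbuilding is the best response to either rival strategy,
# which is exactly what makes the overbuild "a rational decision":
print(best_response("overbuild"), best_response("hold_back"))
```

With payoffs shaped this way, overbuilding is a dominant strategy for every player, which is why all four majors end up on the same path.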
Amazon Answer: Data Center Experience
Jassy: "Most companies deliver more capacity than they need...if you actually deliver too much capacity, the economics are pretty woeful...you can tell from...our operating income in AWS that we've learned over time to manage this."
In fact, the more you believe in AI, the more you might be concerned that AI model progress will outpace physical infrastructure. For example, once everyone has 100k clusters, big tech companies will need to figure out what to do with their 50k and 25k clusters (4/11)
When the history of this AI wave is written, the awesome scale of Microsoft, Amazon, and Google will be a big part of it. Without these companies and their fortress balance sheets – we would be significantly farther behind than we are today (9/11)
Nvidia Answer: CPU workloads moving to GPU
Jensen: "The answer is accelerated computing. We know that accelerated computing, of course, speeds up applications. It also enables you to do computing at a much larger scale."
I've heard a few industry experts make comments along the lines of: No one will ever train a frontier model on the same data center twice—by the time the model has been trained, the GPUs will have become outdated, and frontier cluster sizes will have grown (5/11)
The key to understanding the pace of today’s infrastructure buildout is to recognize that while AI optimism is certainly a driver of AI CapEx, it is not the only one. The cloud players exist in a ruthless oligopoly with intense competition (6/11)
The cloud business today is a $250B market – the size of the entire SaaS sector, combined. The cloud giants are collectively worth more than $7T – this is one of the most powerful oligopolies in the history of business. Big tech is defending its oligopoly (7/11)
1. AI will catalyze an energy transformation. New solar construction, battery innovation, a resurgence in nuclear energy—these will be long-term effects of the AI wave (2/7)
5. When new data center capacity comes online, the cost of training and inference delivered by AWS, Azure and GCP will go down, to the benefit of startups (6/7)
4. The industrial capacity needed to build new AI data centers will serve as an economic stimulus and create jobs in the real economy: Steel, energy, trucking and construction (5/7)
Microsoft Answer: M365 & Github Copilot, Azure AI
Satya: "M365 Copilot as perhaps the best Office 365 or M365 suite we have had...GitHub Copilot now being bigger than even GitHub when we bought it...and obviously, the Azure AI growth, that's the first place we look at."
Google Answer: Risk of Underinvesting > Risk of Overinvesting
Sundar: "The risk of under-investing is dramatically greater than the risk of over-investing for us here, even in scenarios where if it turns out that we are over-investing."
Microsoft Question:
Morgan Stanley: "Can you give us a little bit more help in understanding the timing between the CapEx investments and the yield on those investments?"
3. Starting in the next 6 months, there will be a lot of headlines about delays in data center builds due to issues with liquid cooling, cluster size and power access (4/7)
2. Some hyperscalers will find that they are not nimble enough to address rapidly changing data center requirements—new industrial AI players will emerge to fill this gap (3/7)
Amazon Question:
Goldman: "Potential to overinvest as opposed to under invest in AI...[do] you have a perspective on that in terms of thinking about elements of capitalizing on the theme longer term against the potential for pace and cadence of investment on AWS?”