Nvidia has owned the AI chip conversation for three years straight. That era might be ending — not with a bang, but with a procurement spreadsheet.

Over the past 48 hours, Google quietly revealed the most aggressive custom silicon strategy in the industry: a four-company chip design alliance spanning Broadcom, MediaTek, Marvell, and Intel. Each partner handles a different piece of the puzzle. Together, they represent something Nvidia should lose sleep over.

The Four-Partner Playbook

Google has been building its own AI chips since 2015, longer than anyone else in Big Tech. But until now, it leaned almost entirely on Broadcom for TPU design. That dependency just shattered.

Here’s how the new lineup works:

  • Broadcom keeps the crown jewels — high-performance training chips. Its next-gen TPU v8 training chip, codenamed “Sunfish,” targets TSMC’s 2-nanometer process for late 2027. A supply agreement locks this partnership through 2031.
  • MediaTek builds the cost-optimized inference variant, codenamed “Zebrafish,” also at 2nm. Reports peg MediaTek’s designs at 20-30% cheaper than alternatives.
  • Marvell is developing a memory processing unit plus an additional inference TPU. Its stock jumped nearly 6% on the news. Broadcom dipped 2%.
  • Intel rounds things out as a fabrication and design resource.

The strategic logic is ruthless: split training and inference across competing partners so neither has leverage. Negotiating 101, executed at planetary scale.

Inference Is Where the Money Burns

Here’s what most coverage misses about the AI chip race: training a model is a large but essentially one-time cost. Serving it to hundreds of millions of users, around the clock, is where compute costs actually pile up.

Every chatbot query, every AI-refined search result, every generated image — that’s an inference event. At Google’s scale, inference events hit millions per minute. The company doesn’t just need fast chips. It needs cheap, efficient chips available in absurd quantities.
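
To see why inference rather than training dominates the bill, a toy back-of-envelope comparison helps. The sketch below uses entirely hypothetical numbers (the training cost, per-query cost, and query rate are illustrative assumptions, not Google figures): one training run versus a year of serving traffic at “millions per minute.”

```python
# Toy comparison: one-time training compute vs. a year of inference compute.
# Every number below is a hypothetical assumption, chosen for illustration only.

TRAINING_FLOPS = 1e25            # assumed one-time cost to train a frontier-scale model
FLOPS_PER_QUERY = 5e14           # assumed cost to serve a single query
QUERIES_PER_MINUTE = 2_000_000   # "millions per minute", per the framing above

MINUTES_PER_YEAR = 60 * 24 * 365
yearly_inference_flops = FLOPS_PER_QUERY * QUERIES_PER_MINUTE * MINUTES_PER_YEAR

print(f"One-time training:     {TRAINING_FLOPS:.1e} FLOPs")
print(f"One year of inference: {yearly_inference_flops:.1e} FLOPs")
print(f"Inference vs training: ~{yearly_inference_flops / TRAINING_FLOPS:.0f}x")
```

Under these assumptions, a single year of serving burns tens of times more compute than the training run itself, which is exactly why cost-optimized inference silicon is the prize.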

Google’s latest Ironwood TPU (7th generation) was built for exactly this reality. The specs are aggressive: 10x the peak performance of TPU v5p, 192 GB of HBM3E memory per chip, 7.2 TB/s of memory bandwidth, and the ability to scale to 9,216 liquid-cooled chips in a single superpod delivering 42.5 exaflops of FP8 compute.
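
For a sense of what those pod-level numbers imply per chip, here is a quick sanity check using only the figures quoted above (the per-chip value is derived from them, not an official spec):

```python
# Derive a rough per-chip figure from the Ironwood superpod numbers above.

POD_CHIPS = 9_216            # liquid-cooled chips per superpod
POD_FP8_EXAFLOPS = 42.5      # aggregate FP8 compute per superpod
HBM_PER_CHIP_GB = 192        # HBM3E capacity per chip

per_chip_pflops = POD_FP8_EXAFLOPS * 1_000 / POD_CHIPS  # exaflops -> petaflops
pod_hbm_tb = POD_CHIPS * HBM_PER_CHIP_GB / 1_000        # GB -> TB

print(f"FP8 compute per chip: ~{per_chip_pflops:.1f} PFLOPS")
print(f"HBM3E per superpod:   ~{pod_hbm_tb:,.0f} TB")
```

That works out to roughly 4.6 petaflops of FP8 compute per chip, and well over a petabyte of high-bandwidth memory across a single pod.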

Google plans to produce millions of Ironwood units this year. Not thousands. Millions.

Everyone’s Building Custom — Because They Have To

Google moved first, but the pattern is industry-wide now. Every major tech company has reached the same conclusion: depending on a single supplier for your most critical resource is a strategic vulnerability you can’t afford.

Meta just committed to deploying 1 gigawatt of custom MTIA chips using Broadcom technology. Amazon builds Trainium and Inferentia for AWS. Microsoft has Maia for Azure AI. Tesla completed tape-out of its AI5 chip this week.

The custom ASIC market is projected to grow 45% in 2026 and hit $118 billion by 2033. Nvidia still dominates training with roughly 80% market share. But inference? That’s where the walls are cracking.

The $25 Billion Web of Partnerships

The same day the Marvell news broke, Amazon announced up to $25 billion in additional investment in Anthropic, the company behind Claude. In return, Anthropic committed to spending over $100 billion on Amazon’s cloud infrastructure over the next decade.

But Anthropic has also committed to using up to one million Google TPUs. Meta rents TPU capacity from Google too.

These aren’t contradictions. They’re reflections of a market where demand so massively outstrips supply that everyone needs compute from everywhere. The AI infrastructure layer has become a web of cross-cutting partnerships where everyone is simultaneously partner and competitor.

Google builds chips that Anthropic uses. Amazon funds Anthropic. Amazon builds its own competing chips. Welcome to the AI economy.

What This Actually Means

Costs come down. More competition in silicon means lower compute costs, which means cheaper AI services. MediaTek’s 20-30% cheaper inference chips are just the opening bid.

AI gets faster. Purpose-built inference chips are optimized specifically for the workloads powering your daily AI interactions. Expect snappier responses and more complex real-time features.

Supply chains get more resilient. The days of a single chip shortage crippling all of AI are (slowly) ending. Google’s four-partner template will spread.

Cloud wars are chip wars now. Choosing a cloud provider for AI workloads increasingly means choosing their underlying silicon. Google’s Ironwood, Amazon’s Trainium, Azure’s Maia — each has different strengths. Whoever builds the best custom chips wins the cloud AI race.

The Bottleneck Nobody’s Solved

Google Cloud Next is happening this week, and all eyes are on whether Google officially announces the Marvell deal and reveals TPU v8 details. But the real constraint isn’t announcements — it’s physics.

RBC Capital flagged near-term pressure from limited 3nm wafer availability at TSMC, which could bottleneck everyone’s ambitions. The chip designs are bold. Manufacturing capacity still sets the pace.

The broader trend, though, is undeniable: 2026 is the year AI infrastructure goes from “Nvidia and some pretenders” to a genuine multi-player chip ecosystem. Google’s four-partner strategy isn’t a procurement decision. It’s a declaration that Nvidia’s unchallenged dominance has an expiration date.

The only question is whether custom silicon scales fast enough to match the insatiable demand for AI compute. Based on what we’re seeing this week, Big Tech is betting everything that it will.