Everyone’s fighting over GPUs. Alibaba just changed the question.

On Tuesday, Alibaba’s DAMO Academy unveiled the XuanTie C950 — a 5-nanometer server processor built on the open-source RISC-V architecture. It’s the highest-performing RISC-V CPU ever made. But the interesting part isn’t the benchmarks. It’s the thesis behind the chip: that AI agents need fundamentally different silicon than AI chatbots.

While Nvidia, AMD, and Intel wage war over who can build the biggest parallel processor for training models, Alibaba is making a deliberate bet on what comes after training. And the logic is harder to dismiss than you’d think.

The Specs That Matter

The C950 runs at 3.2 GHz across multiple 64-bit cores on a 5nm process. It delivers over 3x the performance of its predecessor, the C920, and scores roughly on par with Apple’s M1 chip on SPECint 2006 benchmarks.

That sounds modest until you realize: this is a RISC-V chip hitting M1-tier performance. That’s never happened before.

Under the hood, a custom Tensor Processing Engine delivers 8 TOPS of inference compute, supporting precisions from FP16 down through FP8 to INT4, plus emerging micro-scaling formats. The memory subsystem is built for speed — 4-cycle L1 data cache latency, private per-core L2, and multi-processor clusters via Alibaba’s XL-300 interconnect.
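Why do those low-precision formats matter? Shrinking weights shrinks the memory traffic that dominates inference. Here's a minimal, illustrative sketch of symmetric INT4 quantization — not Alibaba's actual scheme (real kernels use per-channel scales and fused matmuls), just the basic idea the hardware accelerates:

```python
# Illustrative symmetric INT4 quantization. NOT the C950's actual scheme;
# just a sketch of the precision trade-off low-bit inference hardware exploits.

def quantize_int4(weights):
    """Map float weights to 4-bit signed integers in [-8, 7] with one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the 4-bit values."""
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.9]
q, scale = quantize_int4(weights)
approx = dequantize(q, scale)
# Each value now fits in 4 bits instead of FP16's 16 -- a 4x cut in memory
# footprint and bandwidth, which is what lets a CPU-class memory system
# feed a model with hundreds of billions of parameters.
```

The lost precision is bounded by half the scale per weight, which is why aggressive quantization works far better for inference than for training.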

DAMO Academy says the chip natively supports inference for models with hundreds of billions of parameters, including Qwen3 and DeepSeek V3. This isn’t a general-purpose CPU with AI bolted on. It’s a CPU built to run massive language models.

Why RISC-V Changes the Game

Here’s where strategy meets silicon. RISC-V is open-source — no licensing fees, no dependency on Arm or Intel’s architectures. Anyone can use it. Anyone can customize it.

That “customize” part is the key. Unlike proprietary instruction sets where you take what the licensor gives you, RISC-V lets designers add custom extensions optimized for specific workloads. For AI inference — especially the sequential, branching reasoning that agents perform — this flexibility is a structural advantage.

It’s also a geopolitical move. With U.S. export restrictions choking Chinese access to Nvidia’s best GPUs, building world-class silicon on an open-source architecture isn’t just smart engineering. It’s technological self-determination.

The Agent Inference Thesis

This is the core argument, and it’s worth taking seriously.

GPUs dominate AI training because training is a massively parallel operation — billions of matrix multiplications that benefit from thousands of cores running simultaneously. Nvidia built an empire on this reality.

But agent inference is a different beast. When an AI agent books a flight, debugs code, or researches a topic, it doesn’t just generate text in one shot. It plans. It executes steps sequentially. It calls tools, evaluates results, and adjusts. This workflow involves branching logic, sequential decisions, and frequent memory access — patterns that look more like CPU workloads than GPU workloads.
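The loop described above can be sketched in a few lines. The tool names and routing logic here are hypothetical stand-ins — the point is the shape of the control flow, not any real agent framework:

```python
# Hedged sketch of an agent's plan-act-evaluate loop. Tool names and planning
# logic are hypothetical. Note the shape: sequential steps, data-dependent
# branching, state carried between iterations -- a CPU-flavored workload,
# unlike the batched matrix math that dominates training.

def run_agent(goal, tools, max_steps=5):
    state = {"goal": goal, "done": False, "log": []}
    for _ in range(max_steps):
        step = plan_next_step(state)                # branch on everything seen so far
        result = tools[step["tool"]](step["args"])  # tool call: I/O, not matmul
        state["log"].append((step["tool"], result))
        if evaluate(state, result):                 # decide: stop, or adjust and loop
            state["done"] = True
            break
    return state

def plan_next_step(state):
    # Trivial stand-in for an LLM planning call: search first, then summarize.
    tool = "search" if not state["log"] else "summarize"
    return {"tool": tool, "args": state["goal"]}

def evaluate(state, result):
    return result.startswith("summary:")

tools = {
    "search": lambda q: f"results for {q}",
    "summarize": lambda q: f"summary: {q}",
}
final = run_agent("cheap flights to Lisbon", tools)
```

Each iteration depends on the result of the last one, so the work can't be batched across thousands of GPU cores the way a training step can — which is the crux of the CPU-for-agents argument.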

Alibaba is betting that as we shift from “AI as chatbot” to “AI as autonomous worker,” the silicon requirements shift too. CPUs optimized for inference — especially ones customizable for specific model architectures — could carve out serious market share alongside training-focused GPUs.

Even Jensen Huang acknowledged at GTC earlier this month that CPUs and GPUs serve complementary roles. The question is who builds the best CPU for the agent era.

Full-Stack, Not Just a Chip

The C950 is one piece of a larger play. Alibaba is building an integrated agentic AI stack:

  • Wukong — an enterprise platform optimized for agent workflows in China
  • Accio Work — its international equivalent, promising autonomous business operations for SMEs
  • Alibaba Token Hub — a new division reorganizing AI teams around enterprise work platforms
  • Zhenwu 810E — handles AI training, while XuanTie covers cloud and agentic workloads

Custom silicon running proprietary models on a proprietary cloud, delivered through purpose-built agent platforms. It’s the Apple playbook applied to enterprise AI — and Alibaba CEO Yongming Wu explicitly framed it that way, calling for “more profound co-design” across hardware, infrastructure, and models.

Morningstar analyst Chelsey Tam offered a measured take: the chip’s importance “lies primarily in improving supply chain resilience amid scarce computing power and lowering overall costs.” But she cautioned that capacity constraints limit near-term scaling.

The Hardware Race Is On

Alibaba isn’t alone in this thesis. Hours after the C950 announcement, Arm revealed its own AI-focused CPU with 136 Neoverse V3 cores, claiming double the performance of comparable x86 parts on agentic workloads. The agent hardware race is officially underway.

And in a perfect bit of irony: a research team called Verkor recently demonstrated an AI agent that can autonomously design RISC-V CPUs from a one-paragraph spec. AI designing the chips that run AI. We’ve hit the ouroboros stage of the hardware cycle.

The competitive landscape is clear. Nvidia owns training. Everyone else is fighting for inference — especially the sequential, reasoning-heavy inference that agents demand. RISC-V’s customizability gives it a structural edge, and Alibaba’s head start with the C950 could prove significant if the agentic AI thesis plays out.

What This Actually Means

The era of “just throw more GPUs at it” is ending. Different workloads are starting to demand different silicon, and the companies that recognize this shift early will have an advantage.

For enterprises, Alibaba’s integrated approach — custom chips, custom models, custom platforms — offers a compelling alternative to the Western mix-and-match model. For the industry, the C950 proves RISC-V has graduated from academic project to production-grade AI hardware.

And for anyone watching the U.S.-China tech competition: Alibaba just demonstrated that Chinese firms can build world-class AI processors without Western technology, using open-source architecture and vertical integration.

The AI agent era needs its own silicon. The starting gun just fired.