Abstract visualization of disaggregated AI inference architecture

AWS and Cerebras Are Ripping AI Inference Apart — On Purpose

The biggest bottleneck in AI isn’t training anymore. It’s inference — the moment a model actually does something useful. And AWS just partnered with Cerebras Systems to attack that bottleneck with an approach nobody has tried at this scale. The deal: Cerebras’ massive wafer-scale CS-3 chips will sit inside AWS data centers, accessible through Amazon Bedrock. The promise: 5x faster inference. The method: tearing the inference pipeline in half.

Splitting the Brain

Traditional AI inference runs both stages on the same GPU. You send a prompt, the chip processes it (prefill), then generates a response token by token (decode). One chip, both jobs. ...
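The prefill/decode split described above can be sketched in a few lines. This is a toy illustration, not the AWS or Cerebras implementation: the function names, the fake KV cache, and the stand-in "sampling" are all invented for clarity. The real point it demonstrates is that the two stages communicate only through the KV cache, which is why they can live on different chips.

```python
# Toy sketch of disaggregated inference: prefill and decode as
# separate stages that could run on different devices.
# All names here are illustrative, not a real inference API.

def prefill(prompt_tokens):
    """Process the whole prompt in one pass and build a KV cache.

    In a disaggregated setup this compute-heavy stage runs on one
    device (e.g. a wafer-scale chip) and ships the cache onward.
    """
    # Stand-in for computing attention keys/values for each token.
    return [tok * 2 for tok in prompt_tokens]

def decode(kv_cache, max_new_tokens):
    """Generate tokens one at a time, extending the cache each step.

    This memory-bound stage can run on a different device that only
    ever receives the cache — it never sees the original prompt.
    """
    generated = []
    for _ in range(max_new_tokens):
        next_tok = sum(kv_cache) % 100   # stand-in for sampling from logits
        generated.append(next_tok)
        kv_cache = kv_cache + [next_tok * 2]  # cache grows with each token
    return generated

cache = prefill([1, 2, 3])
tokens = decode(cache, max_new_tokens=3)
print(tokens)  # → [12, 36, 8]
```

Because the only hand-off is the serialized cache, splitting the pipeline is a data-transfer problem rather than a model change — which is what makes the "tear it in half" approach feasible at all.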

March 22, 2026 · 4 min · DBBS Tech
Abstract illustration of AI chip smuggling and export controls

Hair Dryers and Dummy Servers: Inside the $2.5 Billion Nvidia Chip Smuggling Bust

Federal agents arrested Super Micro Computer co-founder Wally Liaw on Thursday for allegedly running a $2.5 billion scheme to smuggle Nvidia-powered AI servers to China. The playbook included dummy servers staged in warehouses, hair dryers used to peel off serial numbers, and an auditor allegedly bribed with paid entertainment to skip inspections. This is the biggest AI export control enforcement action in U.S. history. And it reads like a heist movie. ...

March 20, 2026 · 4 min · DBBS Tech
Abstract visualization of Nvidia's Vera Rubin AI chip architecture

Nvidia GTC 2026: Vera Rubin, a $1 Trillion Bet, and the Dawn of AI's Inference Era

Jensen Huang stood in front of 18,000 people at San Jose’s SAP Center on Monday, wearing his signature black leather jacket, and casually dropped a number that would make most Fortune 500 CEOs choke on their coffee: $1 trillion. That’s the revenue opportunity Nvidia now sees for its AI chips through 2027 — doubled from the $500 billion estimate it gave investors just last month. And after a nearly three-hour keynote that covered everything from space-based data centers to Disney robots to the future of gaming graphics, one thing is crystal clear: Nvidia isn’t just riding the AI wave anymore. It’s building the ocean. ...

March 18, 2026 · 5 min · DBBS Tech
Abstract visualization of AI infrastructure network with data center nodes

Meta Just Bet $27 Billion on a Company You've Probably Never Heard Of

The AI race has a new front, and it’s not about who builds the smartest model. It’s about who controls the pipes. Meta just signed a $27 billion, five-year infrastructure deal with Nebius Group — a company that, two years ago, was busy shedding its identity as Yandex’s international arm. The deal gives Meta priority access to purpose-built GPU clusters running Nvidia’s next-generation Vera Rubin chips. And it tells us something crucial about where the AI industry is actually heading. ...

March 17, 2026 · 4 min · DBBS Tech
Abstract visualization of NVIDIA's trillion-dollar AI inference infrastructure

NVIDIA's $1 Trillion Bet: GTC 2026 Reveals the Age of AI Agents, Space Data Centers, and Inference Dominance

Jensen Huang walked into a packed San Jose hockey arena, leather jacket and all, and casually doubled NVIDIA’s AI revenue forecast to $1 trillion through 2027. Then he announced chips purpose-built for AI agents, a partnership with a former competitor, Disney robots, self-driving car deals, and — because apparently Earth isn’t big enough — data centers in space. GTC 2026 wasn’t a product launch. It was a declaration of what comes next. ...

March 17, 2026 · 5 min · DBBS Tech
Samsung and SK Hynix battle over HBM4 memory chips at GTC 2026

The AI Memory War: Samsung and SK Hynix Battle for Nvidia's Trillion-Dollar Future

Everyone’s talking about Nvidia’s new chips. But there’s a quieter, arguably more important war happening underneath all the GTC 2026 keynote spectacle — and it’s being fought by two Korean companies most people can’t tell apart. Samsung Electronics and SK hynix are locked in an increasingly fierce battle to supply the memory chips that make AI possible. Without high-bandwidth memory (HBM), Nvidia’s fancy GPUs are just expensive paperweights. And at GTC 2026 this week, both companies showed up swinging. ...

March 17, 2026 · 6 min · DBBS Tech
Abstract visualization of NVIDIA's GTC 2026 inference architecture

NVIDIA GTC 2026: The AI Chip Giant Just Rewrote the Rules of Inference

Thirty thousand people just descended on San Jose for what might be the most important tech keynote of 2026. NVIDIA’s GPU Technology Conference kicks off today, and Jensen Huang has promised to “surprise the world.” He’s not bluffing.

The Training Era Is Over. Welcome to Inference.

For three years, the AI hardware playbook was brain-dead simple: buy GPUs, train bigger models, repeat. NVIDIA rode that formula to a $4.4 trillion market cap — the most valuable public company on Earth. ...

March 16, 2026 · 5 min · DBBS Tech
Abstract visualization of cracking AI infrastructure

The AI Bubble Is Cracking — And Data Centers Are Ground Zero

The $500 billion Stargate project was supposed to be the physical backbone of the AI revolution. Instead, it’s becoming a cautionary tale about what happens when chips evolve faster than concrete can cure.

Two Mega-Deals, Two Months, Two Collapses

OpenAI and Oracle just scrapped plans to expand their flagship data center in Abilene, Texas. Oracle had already spent billions on hardware, secured land, hired staff, and started construction on a 600-megawatt expansion. Then OpenAI walked. ...

March 15, 2026 · 5 min · DBBS Tech
Meta custom AI chips challenging Nvidia dominance

Meta Just Dropped Four Custom AI Chips — And Nvidia Should Be Nervous

Meta just did something that should make Jensen Huang lose sleep. The company announced four new custom AI chips — the MTIA 300, 400, 450, and 500 — all shipping by late 2027. One is already in production. The rest arrive every six months. Six months. Most chip development cycles take one to two years. Meta is moving at double speed, and the message couldn’t be louder: total Nvidia dependence is over. ...

March 13, 2026 · 5 min · DBBS Tech
Nvidia's $26 billion investment in open-weight AI models

Nvidia's $26 Billion Gambit: Why the Chip Giant Is Building Open AI Models

Nvidia doesn’t just want to sell you the shovels anymore. It wants to dig the gold too. Buried inside a financial filing and confirmed by executives in interviews with WIRED, Nvidia plans to spend $26 billion over five years building open-weight AI models. To prove this isn’t vaporware, it simultaneously dropped Nemotron 3 Super — a 128-billion-parameter beast with a hybrid Mamba-Transformer architecture that’s already topping agentic AI benchmarks. This is the most strategically significant move in AI since Meta released the original Llama. Here’s why it matters. ...

March 13, 2026 · 5 min · DBBS Tech