Groq 3 LPU: NVIDIA Just Unveiled a $20 Billion Chip That Isn’t a GPU

By Jake Hoffman
3 months ago


At GTC 2026, Jensen Huang didn’t just announce another GPU. He unveiled the Groq 3 LPU, a fundamentally different kind of AI chip built from the $20 billion acquisition of startup Groq. The Language Processing Unit isn’t faster at training models. It’s faster at running them. And that distinction is about to redefine who controls the AI economy.

The Core Story: What Is the Groq 3 LPU?

Nvidia debuted the Groq 3 Language Processing Unit at its annual GTC conference on March 16, 2026. The chip is the first product built on intellectual property acquired when Nvidia purchased AI chip startup Groq for $20 billion on Christmas Eve 2025, the largest acquisition in Nvidia’s history, as reported by Tom’s Hardware and IEEE Spectrum.

The Groq 3 LPU is not a GPU. Where GPUs excel at the parallel mathematical operations required to train AI models, the LPU is purpose-built for inference, the process of actually running trained models to generate responses, images, code, and decisions. Its architecture replaces traditional high-bandwidth memory with SRAM integrated directly onto the processor, achieving 150 terabytes per second of memory bandwidth, roughly seven times the 22 TB/s of Nvidia’s own Rubin GPU.
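The bandwidth comparison can be checked with simple arithmetic, and it hints at why memory speed matters for inference. The sketch below computes the ratio from the two figures quoted above, then applies a common rule of thumb: when a model generates one token at a time, its weights are read from memory roughly once per token, so bandwidth divided by model size gives a rough ceiling on tokens per second. The 70 GB model size is a hypothetical chosen for illustration, not a figure from Nvidia, and the estimate ignores whether a model of that size would fit in on-chip SRAM at all.

```python
# Memory bandwidth figures quoted in the article, in TB/s.
LPU_BANDWIDTH_TBPS = 150.0
RUBIN_BANDWIDTH_TBPS = 22.0

# Hypothetical model footprint for illustration only (not from the article):
# for example, 70 billion parameters stored at 1 byte each.
HYPOTHETICAL_MODEL_GB = 70.0


def bandwidth_ratio(fast_tbps: float, slow_tbps: float) -> float:
    """Return how many times more bandwidth the faster memory offers."""
    return fast_tbps / slow_tbps


def max_tokens_per_second(bandwidth_tbps: float, model_gb: float) -> float:
    """Rough bandwidth-bound ceiling for single-stream token generation.

    Assumes every weight is read once per generated token and that
    nothing else (compute, interconnect, capacity) is the bottleneck.
    """
    bytes_per_second = bandwidth_tbps * 1e12
    bytes_per_token = model_gb * 1e9
    return bytes_per_second / bytes_per_token


if __name__ == "__main__":
    ratio = bandwidth_ratio(LPU_BANDWIDTH_TBPS, RUBIN_BANDWIDTH_TBPS)
    print(f"Bandwidth ratio: {ratio:.1f}x")
    for name, tbps in [("Rubin GPU", RUBIN_BANDWIDTH_TBPS),
                       ("Groq 3 LPU", LPU_BANDWIDTH_TBPS)]:
        ceiling = max_tokens_per_second(tbps, HYPOTHETICAL_MODEL_GB)
        print(f"{name}: at most ~{ceiling:.0f} tokens/s (bandwidth-bound)")
```

The ratio works out to about 6.8, which is where the "seven times" figure comes from. The tokens-per-second ceilings are only as meaningful as the assumed model size.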

NVIDIA also launched the Groq 3 LPX platform, a server rack containing 128 LPUs. When paired with Nvidia’s Vera Rubin NVL72 GPU rack, the company claims 35x higher throughput per megawatt of power and a target of 1,500 tokens per second for agentic AI communications, according to SiliconANGLE.
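To put the 1,500 tokens-per-second target in perspective, the sketch below converts it into time per token and into the time a chain of AI agents would spend generating messages for one another. The 300-token message length and the four-step chain are hypothetical values chosen for illustration. The 1,500 figure is Nvidia's stated target, not a measured result, and the calculation ignores network, prompt-processing, and tool-call overhead.

```python
TARGET_TOKENS_PER_SECOND = 1_500.0  # Nvidia's stated target, per the article


def ms_per_token(tokens_per_second: float) -> float:
    """Average time to generate one token, in milliseconds."""
    return 1_000.0 / tokens_per_second


def chain_latency_seconds(tokens_per_message: int, hops: int,
                          tokens_per_second: float) -> float:
    """Generation time for `hops` agents that each write one message in turn.

    Counts generation time only; all other overhead is ignored.
    """
    return hops * tokens_per_message / tokens_per_second


if __name__ == "__main__":
    print(f"{ms_per_token(TARGET_TOKENS_PER_SECOND):.2f} ms per token")
    # Hypothetical workload: 4 agents, 300 tokens each.
    total = chain_latency_seconds(300, 4, TARGET_TOKENS_PER_SECOND)
    print(f"4-step chain of 300-token messages: {total:.2f} s")
```

At the target rate, each token takes about two-thirds of a millisecond, so even a multi-step exchange between agents would finish in under a second of generation time under these assumptions.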

Context & Global Impact: Why Inference Is the Real AI Market

Training was the first act. Inference is the main event. Training a large language model is a one-time cost. Inference, running that model billions of times per day, is continuous and escalating. Morgan Stanley estimates that by 2028, AI inference compute demand will exceed training demand by 10 to 1.

Nvidia is locking in a monopoly before competitors arrive. AMD, Intel, Cerebras, and SambaNova are all building inference chips. By acquiring Groq, Nvidia combines GPU training dominance with inference dominance. No competitor offers a complete training-to-inference platform.

Agentic AI demands this chip. The 1,500 tokens-per-second target enables multi-agent AI systems to communicate in real time. Current GPU inference is too slow for these workflows. The LPU’s SRAM architecture eliminates the memory bottleneck that creates latency.

The $20 billion price tag now looks cheap. Groq was valued at $2.8 billion before the acquisition, so Nvidia paid roughly seven times that valuation. Post-GTC, with Nvidia claiming a 35x gain in throughput per megawatt, the acquisition looks like Nvidia buying the inference market before anyone realized it was for sale.

The Energy Equation

The “35x throughput per megawatt” metric may matter more than raw speed. AI’s energy crisis is constraining data center expansion globally. A chip that delivers 35x more AI output per unit of electricity makes previously impossible deployments possible. An AI workload requiring 35 megawatts on GPUs alone could theoretically run on 1 megawatt with an LPU-GPU combination.
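The arithmetic behind that example can be sketched as follows. The code takes the claimed 35x figure at face value and assumes it applies uniformly to the entire workload, which is the vendor's best case rather than a measured outcome. The 100-megawatt data center is a hypothetical added for illustration.

```python
CLAIMED_GAIN = 35.0  # Nvidia's claimed throughput-per-megawatt improvement


def power_needed_mw(baseline_mw: float, gain: float = CLAIMED_GAIN) -> float:
    """Power required to match a GPU-only workload's output at the claimed gain."""
    return baseline_mw / gain


def power_saved_mw(baseline_mw: float, gain: float = CLAIMED_GAIN) -> float:
    """Power freed up by moving the same workload to the more efficient setup."""
    return baseline_mw - power_needed_mw(baseline_mw, gain)


if __name__ == "__main__":
    # The article's example: a 35 MW GPU-only workload.
    print(f"35 MW workload -> {power_needed_mw(35.0):.1f} MW")
    # Hypothetical 100 MW data center, for illustration.
    print(f"100 MW workload -> {power_needed_mw(100.0):.2f} MW, "
          f"saving {power_saved_mw(100.0):.2f} MW")
```

Read the other way, the same claim means a site with a fixed power budget could serve up to 35 times the inference volume, if the gain holds for its particular mix of workloads.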

What This Means for AI Costs

If the Groq 3 delivers on its claims, inference costs drop dramatically. That creates a flywheel: cheaper inference means more AI-powered products, which means more inference demand, which means more LPU sales. NVIDIA is building the same lock-in that CUDA created for GPU computing, but for a market projected to be an order of magnitude larger.

What’s Next: The Inference Arms Race Begins

The Groq 3 ships in late 2026. AMD is expected to respond at Computex in June. Intel’s Gaudi 4 is in development. But Nvidia has the software ecosystem advantage: the Groq 3 integrates with Nvidia’s NIM inference software stack, designed to make the LPU the default choice. The AI chip war just opened a second front.

Frequently Asked Questions

What is the Nvidia Groq 3 LPU? A new AI chip purpose-built for inference, using on-chip SRAM to achieve 150 TB/s of memory bandwidth, roughly 7x that of Nvidia’s Rubin GPU. Unveiled at GTC 2026.

How is an LPU different from a GPU? GPUs excel at training AI models. LPUs are optimized for running trained models at scale, generating text, images, and powering AI agents in real time with ultra-low latency.

Why did Nvidia pay $20 billion for Groq? To dominate AI inference, where compute demand is projected to exceed training demand by 10 to 1 by 2028. Nvidia claims the Groq 3 LPX rack, paired with its Vera Rubin NVL72 GPU rack, delivers 35x more throughput per megawatt, giving the company a complete training-to-inference platform.

