Google Ironwood TPU: A Massive Leap in AI Inference Performance

by Divyansh Jain

Google unveils the Ironwood TPU, its most powerful AI accelerator yet, delivering massive improvements in inference performance and efficiency.

Last week, Google pulled back the curtain on its latest custom AI accelerator, the Ironwood TPU, promising a significant performance jump for the increasingly demanding world of AI. Announced at Google Cloud Next '25, Ironwood is the seventh generation of Google's TPUs, engineered specifically to handle modern AI workloads, particularly in the realm of inference.

Ironwood TPU

Understanding TPUs

Before diving into Ironwood, it's helpful to understand what TPUs are. Tensor Processing Units are specialized chips developed by Google to accelerate machine learning workloads. Unlike general-purpose CPUs, or GPUs whose parallel architecture was originally built for graphics, TPUs are purpose-built for the matrix and tensor operations at the heart of neural networks. Historically, Google has offered two variants of each TPU generation: an ‘e’ series focused on efficiency and inference (running pre-trained models) and a ‘p’ series focused on raw performance for training large models.
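
To make that concrete, here is a minimal, hypothetical JAX sketch (not Google's code) of the kind of tensor workload a TPU accelerates: a bfloat16 matrix multiply compiled through XLA, which maps directly onto the chip's matrix units.

```python
import jax
import jax.numpy as jnp

# A toy example of the tensor math TPUs are built for: a bfloat16
# matrix multiply, compiled by XLA via jax.jit. On a TPU backend this
# lowers onto the chip's matrix-multiply units.
@jax.jit
def dense_layer(x, w):
    return jnp.dot(x, w)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (1024, 4096), dtype=jnp.bfloat16)
w = jax.random.normal(key, (4096, 4096), dtype=jnp.bfloat16)

y = dense_layer(x, w)    # runs on a TPU if one is available to JAX
print(y.shape, y.dtype)  # (1024, 4096) bfloat16
```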

Introducing Ironwood 

The new Ironwood TPU is Google’s most ambitious AI accelerator to date. It’s the company’s first TPU specifically designed for the demands of inference-heavy ‘reasoning models’. Ironwood brings substantial improvements across all key performance metrics compared to its predecessors, including:

                                              TPU v5e             TPU v5p             TPU v6e             TPU v7e
BF16 Compute                                  197 TFLOPs          459 TFLOPs          918 TFLOPs          2.3 PFLOPs*
INT8/FP8 Compute                              394 TOPs/TFLOPs*    918 TOPs/TFLOPs*    1,836 TOPs/TFLOPs   4.6 POPs/PFLOPs
HBM Bandwidth                                 0.8 TB/s            2.8 TB/s            1.6 TB/s            7.4 TB/s
HBM Capacity                                  16 GB               95 GB               32 GB               192 GB
Inter-Chip Interconnect Bandwidth (per link)  400 Gbps            800 Gbps            800 Gbps            1,200 Gbps
Interconnect Topology                         2D Torus            3D Torus            2D Torus            3D Torus
TPU Pod Size (chips)                          256                 8,960               256                 9,216
Spare Cores                                   No                  No                  Yes                 Yes

Note: Numbers marked with an asterisk (*) are unofficial, calculated estimates.
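
As a quick sanity check on those asterisked v7e figures, assuming (as with prior TPU generations) that BF16 peak is half the FP8 rate, the numbers hang together, and the pod-level total lands near the roughly 42.5 exaFLOPS figure Google has quoted:

```python
# Back-of-the-envelope check on the asterisked v7e numbers (our
# calculations, not official Google specs).
fp8_pflops_per_chip = 4.6                        # stated FP8 peak per chip
bf16_pflops_per_chip = fp8_pflops_per_chip / 2   # assumes BF16 = half the FP8 rate
pod_chips = 9216

pod_fp8_exaflops = fp8_pflops_per_chip * pod_chips / 1000
print(bf16_pflops_per_chip)  # 2.3 PFLOPs, matching the table
print(pod_fp8_exaflops)      # ~42.4 ExaFLOPS across a full 9,216-chip pod
```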

Most notably, Ironwood features:

  • Massive computational power: Each chip delivers 4.6 petaFLOPS of FP8 performance, putting it in the same performance class as NVIDIA’s Blackwell B200
  • Increased memory capacity: 192GB of High Bandwidth Memory (HBM) per chip
  • Dramatically improved memory bandwidth: 7.37 TB/s per chip, 4.5x more than Trillium, enabling faster data access for memory-constrained AI inference (illustrated in the sketch after this list)
  • Enhanced interconnect capabilities: 1.2 Tbps bidirectional bandwidth per link, a 1.5x improvement over Trillium, facilitating more efficient communication between chips
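
Why bandwidth matters so much here: at low batch sizes, LLM decoding is memory-bound, since every generated token has to stream the model's weights out of HBM. A rough, illustrative estimate (hypothetical worst case where weights fill the chip's HBM, not a benchmark):

```python
# Illustrative memory-bound decode estimate (not a benchmark). At low
# batch sizes, each generated token must read the model weights from
# HBM once, so HBM bandwidth caps tokens/second.
hbm_capacity_gb = 192      # Ironwood HBM per chip (from the table)
hbm_bandwidth_gbs = 7370   # 7.37 TB/s expressed in GB/s

# Worst case: weights occupy all of HBM.
seconds_per_token = hbm_capacity_gb / hbm_bandwidth_gbs
print(f"{seconds_per_token * 1e3:.1f} ms/token, "
      f"~{1 / seconds_per_token:.0f} tokens/s per chip")
# ~26.1 ms/token, ~38 tokens/s in the fully bandwidth-bound limit
```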

Speculation: Is Ironwood the Missing v6p?

Interestingly, Google appears to have skipped the expected TPU v6p generation entirely, moving straight to the v7e Ironwood. This suggests the chip may originally have been intended as the v6p training part. However, with model sizes expanding rapidly and pressure to compete with offerings like NVIDIA's GB200 NVL72, Google likely repositioned it as the v7e Ironwood. The massive 9,216-chip pod size and the use of a 3D Torus interconnect in a chip designated ‘e’ series (typically the more economical variant) lend weight to this theory.

The Road Ahead

Google has announced that Ironwood TPUs will be available later this year through Google Cloud. The technology is already powering some of Google’s most advanced AI systems, including Gemini 2.5 and AlphaFold.

As these powerful new accelerators become available to developers and researchers, they are likely to enable breakthroughs in AI capabilities, particularly for large-scale inference workloads that require both massive computational power and sophisticated reasoning capabilities.
