Japan Integrating Thousands of NVIDIA H200s into AI Bridging Cloud Infrastructure 3.0

by Jordan Ranous

NVIDIA will integrate thousands of H200 Tensor Core GPUs into Japan’s AI Bridging Cloud Infrastructure 3.0 (ABCI 3.0). HPE Cray XD systems with NVIDIA Quantum-2 InfiniBand networking will provide the performance and scalability ABCI demands.

Japan is poised to significantly advance its AI research and development by integrating thousands of NVIDIA H200 Tensor Core GPUs into its AI Bridging Cloud Infrastructure 3.0 (ABCI 3.0). Spearheaded by the National Institute of Advanced Industrial Science and Technology (AIST), the integration will see the HPE Cray XD system equipped with NVIDIA Quantum-2 InfiniBand networking, which promises superior performance and scalability.

ABCI 3.0: Advancing AI Research and Development

ABCI 3.0 represents the latest iteration of Japan’s large-scale open AI computing infrastructure, designed to propel AI R&D forward. This development highlights Japan’s drive to enhance AI capabilities and technological independence. Since the launch of the original ABCI in August 2018, AIST has accumulated significant experience in managing large-scale AI infrastructure. Building on this foundation, the ABCI 3.0 upgrade, in collaboration with NVIDIA and HPE, aims to elevate Japan’s generative AI research and development capabilities.

“It codifies your culture, your society’s intelligence, your common sense, your history – you own your own data.” – Jensen Huang, President and CEO of NVIDIA

The ABCI 3.0 supercomputer will be located in Kashiwa.
Image Courtesy of the National Institute of Advanced Industrial Science and Technology.

The ABCI 3.0 project is a collaborative effort involving AIST, its business subsidiary AIST Solutions, and Hewlett Packard Enterprise (HPE) as the system integrator. This initiative is supported by Japan’s Ministry of Economy, Trade, and Industry (METI) through the Economic Security Fund. It forms part of METI’s broader $1 billion initiative to bolster computing resources and invest in cloud AI computing. NVIDIA’s involvement is significant, with the company pledging to support research in generative AI, robotics, and quantum computing and to invest in AI startups while providing extensive product support, training, and education.

NVIDIA Commits to Japan

NVIDIA’s collaboration with METI on AI research and education follows a visit by CEO Jensen Huang, who emphasized the critical role of “AI factories” — next-generation data centers designed for intensive AI tasks — in transforming vast amounts of data into actionable intelligence. Huang’s commitment to supporting Japan’s AI ambitions aligns with his vision of AI factories becoming the bedrock of modern economies globally.

“The AI factory will become the bedrock of modern economies across the world.” – Jensen Huang, President and CEO of NVIDIA

ABCI 3.0, with its ultra-high-density data center and energy-efficient design, will provide a robust infrastructure for developing AI and big data applications. The system, expected to be operational by the end of the year and housed in Kashiwa near Tokyo, will offer state-of-the-art AI research and development resources.

Unmatched Performance and Efficiency

The ABCI 3.0 facility will deliver 6 AI exaflops of computing capacity (a measure of AI-specific performance without sparsity) and 410 double-precision petaflops for general computing. Each node will connect via the NVIDIA Quantum-2 InfiniBand platform, offering 200 GB/s of bisection bandwidth. NVIDIA technology forms the core of this initiative, with hundreds of nodes each equipped with eight NVLink-connected H200 GPUs, ensuring exceptional computational performance and efficiency.
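As a rough sanity check on those figures, a back-of-envelope estimate shows how “thousands of GPUs” across “hundreds of nodes” add up to 6 AI exaflops. The per-GPU throughput below is an assumption (the commonly cited ~1,979 TFLOPS of dense FP8 per H200), not a figure from this article:

```python
# Back-of-envelope estimate: how many H200 GPUs yield ~6 AI exaflops?
# Assumption (not from the article): ~1,979 TFLOPS dense FP8 per H200.
TARGET_AI_FLOPS = 6e18      # 6 AI exaflops, stated without sparsity
FLOPS_PER_H200 = 1.979e15   # assumed dense FP8 throughput per GPU
GPUS_PER_NODE = 8           # eight NVLink-connected H200s per node

gpus = TARGET_AI_FLOPS / FLOPS_PER_H200
nodes = gpus / GPUS_PER_NODE
print(f"~{gpus:,.0f} GPUs across ~{nodes:,.0f} nodes")
```

Under that assumption, the target works out to roughly 3,000 GPUs in roughly 380 nodes, consistent with the article’s “thousands of H200s” and “hundreds of nodes.”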

The NVIDIA H200 GPU is a standout component, offering 141 gigabytes (GB) of HBM3e memory at 4.8 terabytes per second (TB/s). NVIDIA claims a 15x improvement in energy efficiency over ABCI’s previous-generation platform for AI workloads. This larger, faster memory significantly accelerates generative AI and large language models (LLMs) and advances scientific computing for high-performance computing (HPC) workloads, with improved energy efficiency and lower total cost of ownership.
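To see why memory capacity and bandwidth matter so much for LLMs, consider a hedged, illustrative calculation: a hypothetical 70-billion-parameter model in FP16 (2 bytes per parameter) just fits in one H200’s HBM3e, and since autoregressive decoding is typically memory-bandwidth-bound, reading every weight once per generated token gives a crude upper bound on single-stream throughput:

```python
# Rough memory-bound decoding estimate for a hypothetical 70B-parameter LLM.
# Assumptions (illustrative, not from the article): FP16 weights (2 bytes each),
# one full weight read per generated token, no batching or compute overlap.
HBM_CAPACITY_GB = 141      # H200 HBM3e capacity
HBM_BANDWIDTH_TBS = 4.8    # H200 HBM3e bandwidth
PARAMS = 70e9
BYTES_PER_PARAM = 2        # FP16

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9          # fits under 141 GB
ms_per_token = weights_gb / (HBM_BANDWIDTH_TBS * 1000) * 1000
print(f"weights: {weights_gb:.0f} GB, ~{ms_per_token:.1f} ms/token "
      f"(~{1000 / ms_per_token:.0f} tokens/s upper bound)")
```

The point of the sketch: doubling memory bandwidth roughly halves the per-token floor for such memory-bound workloads, which is why the H200’s 4.8 TB/s matters as much as its capacity.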

Quantum-2 InfiniBand Advanced Networking 

Integrating NVIDIA Quantum-2 InfiniBand with In-Network Computing capabilities allows networking devices to perform computations on data as it moves through the fabric, offloading work from the CPU. This ensures efficient, high-speed, low-latency communication, which is essential for managing intensive AI workloads and large datasets.
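A concrete case is the allreduce collective used to synchronize gradients in distributed training, one of the operations In-Network Computing can offload to the switch fabric. The sketch below compares per-node traffic under the standard ring-allreduce accounting with a simplified model of switch-side aggregation (the latter is an assumption for illustration, not a published ABCI figure):

```python
# Per-node traffic to allreduce S bytes across N nodes.
# Ring allreduce (host-based): each node sends ~2*(N-1)/N * S bytes.
# Switch-aggregated (simplified model of in-network reduction): each node
# sends its S bytes into the fabric once and receives the reduced result once.
def ring_allreduce_bytes(size_bytes: float, nodes: int) -> float:
    return 2 * (nodes - 1) / nodes * size_bytes

def switch_aggregated_bytes(size_bytes: float) -> float:
    return size_bytes  # one send up; the aggregated result comes back

S = 1e9   # 1 GB of gradients (illustrative)
N = 400   # "hundreds of nodes," per the article
print(f"ring: {ring_allreduce_bytes(S, N) / 1e9:.2f} GB sent per node; "
      f"in-network: {switch_aggregated_bytes(S) / 1e9:.2f} GB")
```

Under this simplified model, in-network aggregation roughly halves the bytes each host must inject, in addition to removing reduction arithmetic from the CPUs and GPUs.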

ABCI’s world-class computing and data processing capabilities will accelerate joint AI R&D efforts among industries, academia, and governments. METI’s substantial investment in this project highlights Japan’s strategic vision to enhance AI development capabilities and accelerate the use of generative AI. By subsidizing AI supercomputer development, Japan aims to reduce the time and costs associated with developing next-generation AI technologies, thereby positioning itself as a leader in the global AI landscape.

Engage with StorageReview
