Posts

Showing posts with the label interconnects

AI Unsurprisingly Dominates Hot Chips 2024

This year's edition of the annual Hot Chips conference represented the peak of the generative-AI hype cycle. Consistent with the theme, OpenAI's Trevor Cai made the bull case for AI compute in his keynote. At a conference known for technical disclosures, however, the presentations from merchant chip vendors were disappointing; despite a great lineup of talks, few new details emerged. Nvidia's Blackwell presentation mostly rehashed previously disclosed information. In a picture-is-worth-a-thousand-words moment, however, one slide included the photo of the GB200 NVL36 rack shown below.

GB200 NVL36 rack (Source: Nvidia)

Many customers prefer the NVL36 over the power-hungry NVL72 configuration, which requires a massive 120kW per rack. The key difference for our readers is that the NVLink switch trays shown in the middle of the rack have front-panel cages, whereas the "non-scalable" NVLink switch tray used in the NVL72 has only back-panel connectors for the NVLink spin...
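
As a rough sketch of why the NVL72 is considered power-hungry, the cited 120kW rack budget can be spread over the 72 GPU positions implied by the product name. Note this is a rack-level figure that also covers CPUs, switch trays, and overhead, so the per-GPU share is an upper bound:

```python
# Back-of-the-envelope arithmetic from the figures above.
# Only the 120 kW budget comes from the post; the 72-GPU count
# is implied by the NVL72 name.
RACK_POWER_KW = 120
GPUS_PER_RACK = 72

kw_per_gpu_slot = RACK_POWER_KW / GPUS_PER_RACK
print(f"{kw_per_gpu_slot:.2f} kW per GPU position")  # 1.67 kW per GPU position
```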

NVIDIA Reveals Roadmap at Computex

The annual Computex trade show in Taipei has traditionally been PC-centric, with ODMs showing their latest motherboards and systems. The 2024 event, however, included keynotes from Nvidia and others that revealed details of forthcoming datacenter GPUs, demonstrating the importance of the ODM ecosystem to the explosion of AI. The fact that Jensen Huang was born on the island made his keynote all the more impactful for the local audience. In the week following the CEO's keynote, Nvidia's market capitalization surpassed $3 trillion. From a networking perspective, the keynote focused on Ethernet rather than InfiniBand, as the former is a better fit with the ecosystem messaging.

Source: NVIDIA

The datacenter section of Jensen's talk largely reminded the audience of what Nvidia announced at GTC in March. The Blackwell GPU, now in production, introduces NVLink5, which operates at 200Gbps per lane. Each GPU includes 18 NVLink ports with two lanes each, or 36x200Gbps serdes. The new NVLink...
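
The per-GPU NVLink5 figures above multiply out as follows; the conversion from lane rate to bytes is ordinary arithmetic, not a new disclosure, and the per-direction reading of the 36 lanes is consistent with Nvidia's quoted 1.8TB/s bidirectional figure for Blackwell:

```python
# NVLink5 aggregate bandwidth from the numbers in the post:
# 18 ports x 2 lanes x 200 Gbps per lane.
ports, lanes_per_port, gbps_per_lane = 18, 2, 200

total_gbps = ports * lanes_per_port * gbps_per_lane  # 7,200 Gbps of serdes
gb_per_sec_per_dir = total_gbps / 8                  # 900 GB/s per direction
print(total_gbps, gb_per_sec_per_dir)                # 7200 900.0 -> 1.8 TB/s bidirectional
```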

AMD Looks to Infinity for AI Interconnects

With the formal launch of the MI300 GPU, AMD revealed new plans for scaling the multi-GPU interconnects vital to AI-training performance. The company's approach relies on a partner ecosystem, which stands in stark contrast to NVIDIA's end-to-end solutions. The plans revolve around AMD's proprietary Infinity Fabric and its underlying XGMI interconnect.

Infinity Fabric Adopts Switching

As with its prior generation, AMD uses XGMI to connect multiple MI300 GPUs in what it calls a hive. The hive shares a homogeneous memory space formed by the HBM attached to each GPU. In current designs, the GPUs connect directly using XGMI in a mesh or ring topology. Each MI300X GPU has up to seven Infinity Fabric links, each with 16 lanes. The 4th-gen Infinity Fabric supports up to 32Gbps per lane, yielding 128GB/s of bidirectional bandwidth per link. At the MI300 launch, Broadcom announced that its next-generation PCI Express (PCIe) switch chip will add support for XGMI. At last October...
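
The per-link bandwidth figure above follows directly from the lane count and lane rate; as a quick check (the seven-link aggregate is an extrapolation from the per-link number, not an AMD quote):

```python
# Infinity Fabric link math from the figures in the post:
# 16 lanes x 32 Gbps per lane, counted in each direction.
lanes, gbps_per_lane = 16, 32

per_dir_gbps = lanes * gbps_per_lane  # 512 Gbps per direction
per_dir_gbs = per_dir_gbps / 8        # 64 GB/s per direction
bidir_gbs = 2 * per_dir_gbs           # 128 GB/s bidirectional, matching the post
print(bidir_gbs, 7 * bidir_gbs)       # 128.0 896.0 -> up to 896 GB/s across 7 links
```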

Optics Grab Attention at Hot Chips 2023

August marked the in-person return of the Hot Chips conference at Stanford University in California, and the sold-out 35th edition included plenty of deep technical content. AI/ML garnered lots of attention, and optical interconnects were featured in both chip- and system-level AI and HPC talks. NVIDIA's chief scientist, Bill Dally, keynoted Day 2 with a talk reviewing how accelerators achieved a 1,000x performance increase over the last 10 years. His big-picture view provided excellent context for AI-system design, but networking received only an honorable mention this year. Instead, Dally discussed future directions for accelerated compute. Following the keynote, an ML-Training session presented talks from Google and Cerebras. The technical lead for TPUs at Google, Norm Jouppi, made it clear he could only discuss the n-1 generation, meaning TPUv4. Meanwhile, Google revealed the TPUv5e at its own Google Cloud Next event the same day but...
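
For context on the 1,000x-in-10-years figure Dally cited, the implied compound annual growth factor is simple arithmetic:

```python
# Compound annual factor implied by 1,000x growth over 10 years.
growth = 1000 ** (1 / 10)
print(round(growth, 2))  # 2.0 -> roughly a doubling of performance every year
```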

Preview: Hot Chips Returns to Stanford for HC35

I will be at Stanford at the end of this month for the in-person return of Hot Chips. As always, the 35th edition (HC35) will have plenty of deep technical content, with AI/ML unsurprisingly getting lots of attention. I'm particularly interested in a set of talks exploring interconnects and networking for AI, HPC, and beyond. Day 2 (Tuesday, August 29) features an ML-Training session with talks from Google and Cerebras. The technical lead for TPUs at Google, Norm Jouppi, will expand on the paper presented at ISCA 2023 describing the TPUv4 supercomputer. That paper revealed Google's use of optical circuit switches in its TPUv4 cluster, following prior disclosures around OCS deployments in its data-center spine layer. Sean Li, cofounder and chief hardware architect at Cerebras, will deliver a talk on the company's cluster architecture built around the CS-2 system and WSE-2 wafer-scale engine. This talk will explore how the MemoryX external-memory system and SwarmX fabric i...