The Byrne-Wheeler Report Episode 4 discusses the RISC-V Summit NA, BlueField-4, SambaNova, and AWS Rainier. You can skip to 12:12 if you're uninterested in RISC-V.
At GTC DC last month, Jensen Huang showed off components of the Vera Rubin NVL144 platform. First, here's the latest roadmap, which now includes BlueField-4 and BlueField-5. For more on that, see BWR Episode 4.

Source: Nvidia

Below is the Vera Rubin compute tray, which includes four Rubin GPUs. By GPU, we mean package, not die. Note that the Blackwell NVL72 and Rubin NVL144 both have 72 GPU packages; the NVL144 moniker reflects Nvidia's new math, which counts dies rather than packages. The company didn't rename the Blackwell configuration, even though that GPU also has two dies.

Each compute tray has two Vera CPUs, which are 88-core Arm processors. Two GPUs connect with one CPU using NVLink-C2C, a coherent variant of NVLink. Although the roadmap above shows CX9 as 1600G, each ConnectX-9 is actually 800Gbps, so eight chips are required to deliver the aggregate 800GB/s quoted for the tray. That means each GPU has a pair of 800G Ethernet/InfiniBand NICs for scale-out networking. Finally, a single BlueField...
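The NVL144 naming and per-tray NIC bandwidth figures above can be verified with a quick back-of-the-envelope check. This is just a sketch of the arithmetic using the numbers quoted in the article (72 packages, 2 dies per package, 8 NICs per tray at 800Gbps each); the variable names are ours, not Nvidia's.

```python
# Sanity-check the Vera Rubin NVL144 figures quoted above.
GPU_PACKAGES = 72       # same package count as Blackwell NVL72
DIES_PER_PACKAGE = 2    # NVL144 counts dies, not packages
NICS_PER_TRAY = 8       # ConnectX-9 chips per compute tray
NIC_GBPS = 800          # each ConnectX-9 runs at 800 Gbps

# NVL144 = 72 packages x 2 dies per package
assert GPU_PACKAGES * DIES_PER_PACKAGE == 144

# Aggregate scale-out bandwidth: 8 x 800 Gbps = 6400 Gbps = 800 GB/s
tray_gbps = NICS_PER_TRAY * NIC_GBPS
tray_gbytes_per_s = tray_gbps / 8  # bits -> bytes
print(f"{tray_gbps} Gbps = {tray_gbytes_per_s:.0f} GB/s per tray")
```

With four GPUs per tray, the eight NICs also work out to the two 800G ports per GPU mentioned above.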
With the formal launch of the MI300 GPU, AMD revealed new plans for scaling the multi-GPU interconnects vital to AI-training performance. The company's approach relies on a partner ecosystem, which stands in stark contrast with NVIDIA's end-to-end solutions. The plans revolve around AMD's proprietary Infinity Fabric and its underlying XGMI interconnect.

Infinity Fabric Adopts Switching

As with its prior generation, AMD uses XGMI to connect multiple MI300 GPUs in what it calls a hive. The hive shares a homogeneous memory space formed by the HBM attached to each GPU. In current designs, the GPUs connect directly using XGMI in a mesh or ring topology. Each MI300X GPU has up to seven Infinity Fabric links, each with 16 lanes. The 4th-gen Infinity Fabric supports up to 32Gbps per lane, yielding 128GB/s of bidirectional bandwidth per link.

At the MI300 launch, Broadcom announced that its next-generation PCI Express (PCIe) switch chip will add support for XGMI. At last October...
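The per-link bandwidth figure above follows directly from the lane count and signaling rate. A minimal sketch of that arithmetic, assuming the numbers as quoted (16 lanes per link, 32Gbps per lane, counting both directions):

```python
# Per-link Infinity Fabric bandwidth on MI300X, from the figures above.
LANES_PER_LINK = 16
GBPS_PER_LANE = 32   # 4th-gen Infinity Fabric signaling rate

unidir_gbps = LANES_PER_LINK * GBPS_PER_LANE   # 512 Gbps each direction
unidir_gbytes = unidir_gbps / 8                # 64 GB/s each direction
bidir_gbytes = 2 * unidir_gbytes               # 128 GB/s bidirectional
print(f"{bidir_gbytes:.0f} GB/s bidirectional per link")
```

Note that, as is common in vendor marketing, the 128GB/s figure sums both directions; each direction carries 64GB/s.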
Source: Broadcom

Tomahawk Ultra is a misnomer. Although the name leverages Tomahawk's brand equity, Tomahawk Ultra represents a new architecture. In fact, when development began, Broadcom's competitive target was InfiniBand. During development, however, AI scale-up interconnects emerged as a critical component of performance scaling, particularly for large language models (LLMs). Through luck or foresight, Tomahawk Ultra suddenly had a new and fast-growing target market, and the leading competitor was now NVIDIA's NVLink. In parallel, Broadcom built a multi-billion-dollar business in custom AI accelerators for hyperscalers, most notably Google.

At the end of April, Broadcom announced its Scale-Up Ethernet (SUE) framework, which it published and contributed to the Open Compute Project (OCP). Appendix A of the framework includes a latency budget, which allocates less than 250ns to the switch. At the time, we saw this as an impossibly low target for existing Eth...