The Byrne-Wheeler Report Episode 4 discusses the RISC-V Summit NA, BlueField-4, SambaNova, and AWS Rainier. You can skip to 12:12 if you're uninterested in RISC-V.
Yes, $11B in Blackwell revenue is impressive. Yes, Nvidia's data-center revenue grew 93% year over year. Under the surface, however, there's trouble in networking. In the January quarter (Q4 FY25), networking revenue declined 9% year over year and 3% sequentially. In its earnings call, CFO Colette Kress said that Nvidia's networking attach rate was "robust" at more than 75%. Her very next sentence, however, hinted at what's happening underneath that supposed robustness. "We are transitioning from small NVLink8 with InfiniBand to large NVLink72 with Spectrum-X," said Kress. About one year ago, Nvidia positioned InfiniBand for "AI factories" and Spectrum-X for multi-tenant clouds. That positioning collapsed when the company revealed xAI selected Spectrum-X for what is clearly an AI factory. InfiniBand appears to be retreating to its legacy HPC market while Ethernet comes to the fore.

[Chart: Nvidia Data-Center Revenue]

So how do we square 93% DC grow...
With the formal launch of the MI300 GPU, AMD revealed new plans for scaling the multi-GPU interconnects vital to AI-training performance. The company's approach relies on a partner ecosystem, which stands in stark contrast with NVIDIA's end-to-end solutions. The plans revolve around AMD's proprietary Infinity Fabric and its underlying XGMI interconnect.

Infinity Fabric Adopts Switching

As with its prior generation, AMD uses XGMI to connect multiple MI300 GPUs in what it calls a hive. The hive shares a homogeneous memory space formed by the HBM attached to each GPU. In current designs, the GPUs connect directly using XGMI in a mesh or ring topology. Each MI300X GPU has up to seven Infinity Fabric links, each with 16 lanes. The 4th-gen Infinity Fabric supports up to 32Gbps per lane, yielding 128GB/s of bidirectional bandwidth per link.

At the MI300 launch, Broadcom announced that its next-generation PCI Express (PCIe) switch chip will add support for XGMI. At last October...
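The per-link bandwidth figure above follows directly from the lane count and per-lane rate. A short sketch of the arithmetic (lane count and signaling rate come from the text; the per-GPU total is simply an extrapolation to the quoted maximum of seven links):

```python
# Per-link bandwidth of a 4th-gen Infinity Fabric (XGMI) link,
# using the figures quoted above: 16 lanes at 32 Gbps each.
LANES_PER_LINK = 16
GBPS_PER_LANE = 32

# One direction: 16 lanes x 32 Gbps = 512 Gbps = 64 GB/s.
unidirectional_GBps = LANES_PER_LINK * GBPS_PER_LANE / 8

# Counting both directions gives the 128 GB/s per-link figure.
bidirectional_GBps = 2 * unidirectional_GBps

# With up to seven links per MI300X, peak aggregate XGMI bandwidth
# per GPU (an extrapolation, not a number from the text):
MAX_LINKS = 7
per_gpu_GBps = MAX_LINKS * bidirectional_GBps

print(unidirectional_GBps, bidirectional_GBps, per_gpu_GBps)
# 64.0 128.0 896.0
```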
Source: Broadcom

Tomahawk Ultra is a misnomer. Although the name leverages Tomahawk's brand equity, Tomahawk Ultra represents a new architecture. In fact, when it began development, Broadcom's competitive target was InfiniBand. During development, however, AI scale-up interconnects emerged as a critical component of performance scaling, particularly for large language models (LLMs). Through luck or foresight, Tomahawk Ultra suddenly had a new and fast-growing target market; now the leading competitor was NVIDIA's NVLink. In parallel, Broadcom built a multi-billion-dollar business in custom AI accelerators for hyperscalers, most notably Google.

At the end of April, Broadcom announced its Scale-Up Ethernet (SUE) framework, which it published and contributed to the Open Compute Project (OCP). Appendix A of the framework includes a latency budget, which allocates less than 250ns to the switch. At the time, we saw this as an impossibly low target for existing Eth...