With the formal launch of the MI300 GPU, AMD revealed new plans for scaling the multi-GPU interconnects vital to AI-training performance. The company's approach relies on a partner ecosystem, which stands in stark contrast with NVIDIA's end-to-end solutions. The plans revolve around AMD's proprietary Infinity Fabric and its underlying XGMI interconnect.

Infinity Fabric Adopts Switching

As with its prior generation, AMD uses XGMI to connect multiple MI300 GPUs in what it calls a hive. The hive shares a homogeneous memory space formed by the HBM attached to each GPU. In current designs, the GPUs connect directly using XGMI in a mesh or ring topology. Each MI300X GPU has up to seven Infinity Fabric links, each with 16 lanes. The 4th-gen Infinity Fabric supports up to 32Gbps per lane, yielding 128GB/s of bidirectional bandwidth per link.

At the MI300 launch, Broadcom announced that its next-generation PCI Express (PCIe) switch chip will add support for XGMI. At last October...
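As a quick sanity check on those figures, here's a back-of-envelope sketch (mine, not AMD's) that derives the per-link and per-GPU XGMI bandwidth from the lane count and per-lane rate quoted above:

```python
# Back-of-envelope sketch (not from the article): per-link and aggregate
# XGMI bandwidth using the figures quoted above -- 16 lanes per link,
# 32Gbps per lane, up to seven links per MI300X.

LANES_PER_LINK = 16
GBPS_PER_LANE = 32   # 4th-gen Infinity Fabric, per the article
LINKS_PER_GPU = 7    # maximum for MI300X

# One direction: 16 lanes x 32Gbps = 512Gbps = 64GB/s.
per_link_unidir_gbytes = LANES_PER_LINK * GBPS_PER_LANE / 8

# Bidirectional per link: 2 x 64GB/s = 128GB/s, matching the article.
per_link_bidir_gbytes = 2 * per_link_unidir_gbytes

# Aggregate if all seven links carry GPU-to-GPU traffic.
aggregate_bidir_gbytes = LINKS_PER_GPU * per_link_bidir_gbytes

print(f"Per link: {per_link_bidir_gbytes:.0f} GB/s bidirectional")
print(f"Per GPU (7 links): {aggregate_bidir_gbytes:.0f} GB/s bidirectional")
```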
Yes, $11B in Blackwell revenue is impressive. Yes, Nvidia's data-center revenue grew 93% year over year. Under the surface, however, there's trouble in networking. In the January quarter (Q4 FY25), networking revenue declined 9% year over year and 3% sequentially.

In its earnings call, CFO Collette Kress said that Nvidia's networking attach rate was "robust" at more than 75%. Her very next sentence, however, hinted at what's happening underneath that supposed robustness. "We are transitioning from small NVLink8 with InfiniBand to large NVLink72 with Spectrum-X," said Kress.

About one year ago, Nvidia positioned InfiniBand for "AI factories" and Spectrum-X for multi-tenant clouds. That positioning collapsed when the company revealed xAI selected Spectrum-X for what is clearly an AI factory. InfiniBand appears to be retreating to its legacy HPC market while Ethernet comes to the fore.

Nvidia Data-Center Revenue

So how do we square 93% DC grow...
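One way to frame that question is simple decomposition: data-center revenue is compute plus networking, so if networking shrank while the total grew 93%, compute must have grown faster than 93%. The sketch below shows that arithmetic; the networking-share input is a placeholder of my own, not Nvidia's reported segment split.

```python
# Illustrative sketch only -- the networking share below is a placeholder,
# not Nvidia's reported figure. The point is the decomposition: if
# networking declined while total data-center revenue grew 93% YoY,
# compute growth must exceed 93%.

def implied_compute_growth(total_growth, networking_share_prior_year, networking_growth):
    """Back out compute growth from total growth and the networking segment.

    total_growth, networking_growth: YoY growth rates (e.g. 0.93 = +93%).
    networking_share_prior_year: networking's share of prior-year DC revenue.
    """
    prior_total = 1.0
    prior_networking = networking_share_prior_year * prior_total
    prior_compute = prior_total - prior_networking

    current_total = prior_total * (1 + total_growth)
    current_networking = prior_networking * (1 + networking_growth)
    current_compute = current_total - current_networking

    return current_compute / prior_compute - 1

# Hypothetical example: 93% total DC growth, networking down 9% YoY,
# networking assumed (placeholder) at 15% of prior-year DC revenue.
print(f"Implied compute growth: {implied_compute_growth(0.93, 0.15, -0.09):.0%}")
```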
Powering SmartNICs, the data-processing unit (DPU) has become nearly ubiquitous in the leading public clouds. Existing designs maximize power efficiency for a constrained feature set, and they require proprietary software tools. Xsight Labs aims to break this paradigm with its new E1 DPU, which promises the openness of an Arm server CPU. Xsight Labs sponsored the creation of this white paper, but the opinions and analysis are those of the author. Download the white paper for free, no registration required.

Xsight E1 DPU