Deep Dive into AMD's Instinct MI350: GCN-Based AI Accelerator
In an interview, Alan Smith, AMD's Chief Instinct Architect, walked through the details of the new MI350 series AI accelerators. MI350 stays on the GFX9 architecture, but gains significant performance from a larger Local Data Share (LDS), now 160KB, with higher bandwidth, and from new microscaling formats covering the FP8, FP6, and FP4 data types. Notably, FP6 runs at the same throughput as FP4 on MI350, reflecting AMD's confidence in FP6's potential for both training and inference. MI350 also drops hardware acceleration for TF32 in favor of optimized BF16, and supports TF32 through software emulation. The compute dies are built on the N3P process and the I/O dies on N6; the design also reduces the number of compute units, aiming for high performance at lower power consumption.
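
To make the microscaling idea concrete, here is a minimal Python sketch of block quantization in the style of the OCP MX formats: a block of elements shares one power-of-two scale, and each element is stored as a 4-bit FP4 (E2M1) value. This is an illustration under stated assumptions, not AMD's hardware implementation. The function names (quantize_block, dequantize_block, quantize_mxfp4) are hypothetical, and tie handling is simplified compared with a real round-to-nearest-even conversion.

```python
import math

# Magnitudes representable in FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit).
FP4_E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_EMAX = 2       # exponent of the largest E2M1 value: 6 = 1.5 * 2**2
BLOCK_SIZE = 32    # elements sharing one scale in the OCP MX formats


def quantize_block(values):
    """Return (shared_exponent, quantized_elements) for one block of floats."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0, [0.0] * len(values)
    # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) = e - 1.
    _, e = math.frexp(amax)
    shared_exp = (e - 1) - FP4_EMAX
    # The shared scale is an 8-bit exponent (E8M0); keep it in a matching range.
    shared_exp = max(-127, min(127, shared_exp))
    scale = 2.0 ** shared_exp
    quantized = []
    for v in values:
        # Saturate to the largest FP4 magnitude, then pick the nearest grid value.
        scaled = min(abs(v) / scale, FP4_E2M1_VALUES[-1])
        # Simplification: exact ties resolve to the smaller magnitude here.
        nearest = min(FP4_E2M1_VALUES, key=lambda q: abs(q - scaled))
        quantized.append(math.copysign(nearest, v))
    return shared_exp, quantized


def dequantize_block(shared_exp, quantized):
    """Reconstruct approximate floats from a shared exponent and FP4 elements."""
    scale = 2.0 ** shared_exp
    return [q * scale for q in quantized]


def quantize_mxfp4(values):
    """Split a flat list into blocks of BLOCK_SIZE and quantize each one."""
    return [
        quantize_block(values[i:i + BLOCK_SIZE])
        for i in range(0, len(values), BLOCK_SIZE)
    ]
```

With 32 elements per block, the storage cost works out to 32 x 4 bits plus one 8-bit scale, or 4.25 bits per element. The same scheme with 6-bit elements gives 6.25 bits per element, which is why FP6 matching FP4 throughput matters: the extra precision costs memory but not math rate.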