Emulating FMAdd: A Deep Dive into 32-bit Floats

2025-01-02

This post walks through emulating the FMAdd (fused multiply-add) instruction on hardware that lacks native support, focusing on a SIMD implementation for 32-bit floats. It explains what FMAdd computes, namely a*b + c with a single rounding at the end, and why a straightforward emulation suffers from double rounding when an intermediate result is rounded before the final one. The author's technique combines the extra precision of double-precision floats with 'rounding to odd', so that the final conversion back to 32 bits gives the correctly rounded FMAdd result. The post also briefly covers computing an addition result together with its exact error term, and promises a follow-up on handling 64-bit floats.
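To make the idea concrete, here is a minimal scalar sketch in C. It is not the post's SIMD code. It assumes IEEE 754 binary32/binary64 types, round-to-nearest-even mode, and a compiler that does not reassociate floating-point expressions (so no `-ffast-math`). The names `fmadd_naive` and `fmadd_emulated` are hypothetical, chosen for illustration. The product of two 32-bit floats is exact in a double, the error term of the addition is recovered with the classic TwoSum sequence, and the double sum is nudged to the neighbour with an odd mantissa whenever it was inexact.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Naive emulation: the double sum is rounded once, and the cast to
   float rounds a second time (double rounding). */
static inline float fmadd_naive(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}

/* Sketch of an emulation using round-to-odd on the double sum.
   Assumes IEEE 754 doubles and round-to-nearest-even mode. */
static inline float fmadd_emulated(float a, float b, float c)
{
    assert(sizeof(double) == sizeof(uint64_t));

    /* 24-bit x 24-bit significands fit in 53 bits: the product is exact. */
    double p  = (double)a * (double)b;
    double cd = (double)c;

    /* TwoSum: s is the rounded sum, err is the exact rounding error,
       so that p + cd == s + err in real arithmetic. */
    double s   = p + cd;
    double bv  = s - p;
    double av  = s - bv;
    double err = (p - av) + (cd - bv);

    /* For infinite or NaN sums err is NaN, both comparisons below are
       false, and s is returned unchanged. */
    if (err > 0.0 || err < 0.0) {
        uint64_t bits;
        memcpy(&bits, &s, sizeof bits);
        if ((bits & 1u) == 0) {
            /* s is inexact with an even mantissa: step to the adjacent
               double on the side of the exact value. Adjacent doubles
               alternate parity, so that neighbour is odd. Adding 1 to
               the bit pattern grows the magnitude. */
            if ((err > 0.0) == (s > 0.0))
                bits += 1;
            else
                bits -= 1;
            memcpy(&s, &bits, sizeof s);
        }
    }

    /* The only rounding that affects the result happens here. */
    return (float)s;
}
```

An input where the two versions disagree is a = 1 + 2^-23, b = (1 - 2^-23) * 2^-24, c = 1 + 2^-23. The exact value of a*b + c lies just below the midpoint between two adjacent floats, but the naive double sum lands exactly on that midpoint, so the tie-breaking cast rounds the wrong way. With the odd-mantissa adjustment, the double stays strictly below the midpoint and the cast rounds down as it should.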