Clang Optimization Regression: Inlining Backfires in C++ Benchmark

2025-02-19

A C++ benchmark exposed a performance regression in how Clang optimizes inlined functions. With the `increment` function inlined, the benchmark ran roughly 5x slower than the non-inlined version, and `perf stat` confirmed branch mispredictions as the cause. Compiling the same code with the Zig toolchain was significantly faster, which suggests a possible regression in Clang 19. The issue has been reported on the Clang/LLVM repository, and initial investigation points to a trade-off between the SROA and SimplifyCFG optimization passes.

Development