Compiler Optimization: Improving Lemire's Nearly Divisionless Random Number Generation

2025-03-09

The author improved a nearly divisionless algorithm for generating bounded random numbers (Lemire's algorithm). A previous version reduced code bloat by inlining the fast path, but compiler optimization was limited. The author discovered that when the limit is a compile-time constant, the rejection threshold can be precomputed, and division avoidance is unnecessary. The new implementation has only one call to the random number generator, and the compiler automatically eliminates the loop when the limit is a power of two. This is more efficient than last year's version, and the author explores similar compile-time optimization techniques in Rust.