Floating-Point Hell: Why Your R Multivariate Normal Sampling Isn't Reproducible

This post recounts how the author helped colleagues debug a reproducibility problem in R code that samples from a multivariate normal distribution. The cause was not a bug in R or the MASS package but the ordinary behavior of floating-point arithmetic. Even with `set.seed()` fixing the state of the random number generator (RNG), the same code produced different results on different machines, because floating-point rounding affected the computation inside `MASS::mvrnorm()`. A deep dive showed that `MASS::mvrnorm()`, which uses an eigendecomposition of the covariance matrix, is highly sensitive to tiny perturbations of its input: since an eigenvector is only defined up to sign, a difference in the last few bits can flip eigenvector signs and yield entirely different samples from identical random draws. `mvtnorm::rmvnorm()` can instead use a Cholesky decomposition, which proved more robust. The author recommends `mvtnorm::rmvnorm()` with `method = "chol"` for improved reproducibility.
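The sign-flip mechanism described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the actual source of `MASS::mvrnorm()` or `mvtnorm::rmvnorm()`: the covariance matrix `Sigma`, the seed, and the hand-made sign flip are all invented for the example, and the flip is applied manually to mimic what a different platform or linear algebra library might return.

```r
# Illustrative sketch only: Sigma, the seed, and the manual sign flip are
# assumptions made for this example, not taken from the original post.
Sigma <- matrix(c(2, 1,
                  1, 2), nrow = 2)

set.seed(42)
Z <- matrix(rnorm(10), ncol = 2)   # 5 standard-normal draws in 2 dimensions

# Eigen-based factor: Sigma = A %*% t(A) with A = V %*% diag(sqrt(lambda)).
e <- eigen(Sigma, symmetric = TRUE)
A <- e$vectors %*% diag(sqrt(e$values))

# An eigenvector is only defined up to sign. Flip the first one by hand.
V_flipped <- e$vectors
V_flipped[, 1] <- -V_flipped[, 1]
A_flipped <- V_flipped %*% diag(sqrt(e$values))

# Both factors reproduce Sigma, so both are valid for sampling ...
ok_original <- isTRUE(all.equal(A %*% t(A), Sigma))
ok_flipped  <- isTRUE(all.equal(A_flipped %*% t(A_flipped), Sigma))

# ... but the same random draws Z map to different samples.
X_eigen   <- Z %*% t(A)
X_flipped <- Z %*% t(A_flipped)
diff_eigen <- max(abs(X_eigen - X_flipped))

# The Cholesky factor of a positive-definite matrix is unique (positive
# diagonal), so a tiny perturbation of Sigma only moves the samples slightly.
R           <- chol(Sigma)                      # Sigma = t(R) %*% R
R_perturbed <- chol(Sigma + 1e-12 * diag(2))
diff_chol   <- max(abs(Z %*% R - Z %*% R_perturbed))

cat("eigen factors valid:", ok_original, ok_flipped, "\n")
cat("max sample difference after sign flip:", diff_eigen, "\n")
cat("max sample difference after Cholesky perturbation:", diff_chol, "\n")

# The recommended call, run only if the mvtnorm package is installed.
if (requireNamespace("mvtnorm", quietly = TRUE)) {
  set.seed(42)
  X <- mvtnorm::rmvnorm(5, mean = c(0, 0), sigma = Sigma, method = "chol")
  print(dim(X))
}
```

Both eigen-based factors are mathematically valid, which is why neither result is "wrong"; the samples simply stop matching across machines. The Cholesky route avoids this particular ambiguity, though it does require a positive-definite covariance matrix.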