BD3-LMs: Block Discrete Denoising Diffusion Language Models – Faster, More Efficient Text Generation
2025-05-08

BD3-LMs combine the autoregressive and diffusion modeling paradigms. They model blocks of tokens autoregressively and apply discrete diffusion within each block, achieving both high likelihoods and flexible-length generation while keeping the speed and parallelization advantages of diffusion models. Efficient training and sampling algorithms, requiring only two forward passes, further enhance performance, making this a promising approach for large-scale text generation.
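
The block-by-block idea can be illustrated with a toy sampler. This is only a sketch under stated assumptions, not the authors' implementation: `toy_denoiser` is a hypothetical stand-in that picks random tokens where a trained network would predict them, and the unmasking schedule, `MASK` symbol, and function names are invented for illustration. It shows blocks being generated left to right, with every block starting fully masked and being unmasked over a few denoising steps.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol for this sketch


def toy_denoiser(context, block, vocab, rng):
    """Hypothetical stand-in for a trained model.

    A real model would condition on `context` (the finished blocks);
    this toy version just proposes a random token for each masked slot.
    """
    return [rng.choice(vocab) if tok == MASK else tok for tok in block]


def sample_block_diffusion(num_blocks, block_size, num_steps, vocab, seed=0):
    rng = random.Random(seed)
    sequence = []  # finished blocks, produced left to right
    for _ in range(num_blocks):
        block = [MASK] * block_size  # each block starts fully masked
        for step in range(num_steps, 0, -1):
            proposal = toy_denoiser(sequence, block, vocab, rng)
            masked = [i for i, tok in enumerate(block) if tok == MASK]
            # Unmask a share of the remaining slots; all of them on the last step.
            k = len(masked) if step == 1 else max(1, len(masked) // step)
            for i in rng.sample(masked, min(k, len(masked))):
                block[i] = proposal[i]
        sequence.extend(block)  # later blocks can condition on this one
    return sequence


if __name__ == "__main__":
    print(sample_block_diffusion(num_blocks=3, block_size=4, num_steps=2,
                                 vocab=["a", "b", "c"]))
```

Because the outer loop can run for any number of blocks, the output length is not fixed in advance, while the tokens inside a block are filled in several at a time.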