Apple Challenges Diffusion Models: A Breakthrough in Image Generation with Normalizing Flows

Apple has released two papers that revive a largely overlooked image generation technique: normalizing flows. The new models, TarFlow and STARFlow, build on Transformers and report significant gains in image quality and efficiency. Unlike OpenAI's GPT-4o, which generates images token by token, Apple's models either generate pixel values directly or generate a compressed representation that is then decompressed into an image. This avoids the information loss that tokenization introduces and offers better control over image detail. STARFlow goes further by generating in latent space and integrating a lightweight language model, which makes it better suited to mobile devices. Together, the two papers point to a new direction in image generation that challenges the dominance of diffusion models.