TarFlow: Transformer-based Normalizing Flows Achieve SOTA Image Likelihood Estimation

2025-06-28

Researchers introduce TarFlow, a novel normalizing flow model leveraging Transformers and masked autoregressive flows. TarFlow efficiently estimates density and generates images by processing image patches with autoregressive Transformer blocks, alternating the autoregression direction between layers. Three key techniques boost sample quality: Gaussian noise augmentation during training, post-training denoising, and an effective guidance method for both class-conditional and unconditional generation. TarFlow achieves state-of-the-art results in image likelihood estimation, significantly outperforming previous methods and generating samples comparable in quality and diversity to diffusion models—a first for a standalone normalizing flow model.

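To make the patch-level autoregression concrete, below is a minimal sketch of one TarFlow-style flow block: a causal Transformer predicts a per-patch shift and log-scale from the preceding patches, which yields an exact log-determinant for likelihood training, and the patch order is reversed between blocks to alternate the autoregression direction. This is an illustrative reconstruction, not the authors' code; the module name, dimensions, and hyperparameters are assumptions.

```python
# Hedged sketch of a TarFlow-style masked autoregressive flow block over image
# patches. Sizes and names are illustrative, not the paper's configuration.
import torch
import torch.nn as nn

class ARFlowBlock(nn.Module):
    def __init__(self, patch_dim=64, d_model=256, nhead=4, depth=2, reverse=False):
        super().__init__()
        self.reverse = reverse  # alternate autoregression direction between blocks
        self.embed = nn.Linear(patch_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, depth)
        self.to_shift = nn.Linear(d_model, patch_dim)
        self.to_logscale = nn.Linear(d_model, patch_dim)

    def forward(self, x):                        # x: (B, T, patch_dim) patch sequence
        if self.reverse:
            x = x.flip(1)
        # Shift inputs right so patch t is predicted only from patches < t.
        ctx = torch.cat([torch.zeros_like(x[:, :1]), x[:, :-1]], dim=1)
        T = x.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf"), device=x.device), diagonal=1)
        h = self.backbone(self.embed(ctx), mask=causal)
        shift, logscale = self.to_shift(h), self.to_logscale(h)
        z = (x - shift) * torch.exp(-logscale)   # affine autoregressive transform
        logdet = -logscale.sum(dim=(1, 2))       # exact log-det term for the likelihood
        if self.reverse:
            z = z.flip(1)
        return z, logdet
```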

Large Reasoning Models: Collapse and Counterintuitive Scaling

2025-06-08

Recent Large Language Models (LLMs) have given rise to Large Reasoning Models (LRMs), which generate detailed reasoning traces before providing answers. While these models show improved performance on reasoning benchmarks, their fundamental capabilities remain poorly understood. This work investigates LRMs using controllable puzzle environments and reveals a complete accuracy collapse beyond a certain complexity threshold. Surprisingly, reasoning effort increases with problem complexity up to a point and then declines, even though the token budget remains sufficient. Comparing LRMs with standard LLMs, three regimes emerge: (1) low-complexity tasks where standard LLMs outperform LRMs, (2) medium-complexity tasks where LRMs show an advantage, and (3) high-complexity tasks where both fail. LRMs also show limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles. The study highlights the strengths, limitations, and open questions surrounding the true reasoning capabilities of LRMs.

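For a sense of what a "controllable puzzle environment" looks like, the sketch below uses Tower of Hanoi, one of the puzzle families used in this line of work: the number of disks is a single complexity knob (the optimal solution has 2**n - 1 moves), and a verifier replays a proposed move list exactly. This is an illustrative toy, not the paper's benchmark harness.

```python
# Hedged sketch of a puzzle environment with a tunable complexity knob (n disks).
def solve_hanoi(n, src="A", aux="B", dst="C"):
    """Ground-truth move list; its length grows exponentially with n."""
    if n == 0:
        return []
    return (solve_hanoi(n - 1, src, dst, aux)
            + [(src, dst)]
            + solve_hanoi(n - 1, aux, src, dst))

def verify(n, moves):
    """Replay a proposed move list and check it legally transfers all disks to C."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for frm, to in moves:
        if not pegs[frm] or (pegs[to] and pegs[to][-1] < pegs[frm][-1]):
            return False                      # illegal: empty source or larger disk on smaller
        pegs[to].append(pegs[frm].pop())
    return pegs["C"] == list(range(n, 0, -1))

for n in range(3, 9):                         # sweep the complexity knob
    assert verify(n, solve_hanoi(n))
    print(n, "optimal moves:", 2 ** n - 1)
```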

Apple's Privacy-Preserving Approach to AI Improvement

2025-04-14

Apple is committed to user privacy, even while improving AI features such as Genmoji, image generation tools, and writing tools. It employs differential privacy to anonymize user data, collecting only aggregated trend information, such as which Genmoji prompts are popular. For AI features that handle longer text, such as email summaries, Apple instead generates synthetic data that mimics the patterns of real user data, so models can be trained and tested without accessing actual email content. This lets Apple improve product experiences while keeping user privacy paramount.

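For intuition about how aggregate trends can be collected without exposing any individual's choice, here is a hedged sketch of local differential privacy via k-ary randomized response: each device perturbs its report so a single report is deniable, and the server de-biases the aggregate counts. This is not Apple's production mechanism; the vocabulary size, epsilon, and function names are illustrative.

```python
# Illustrative local differential privacy via k-ary randomized response.
import numpy as np

def randomized_response(true_idx, k, epsilon, rng):
    """Report the true item with prob p, otherwise one of the k-1 others uniformly."""
    p = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    if rng.random() < p:
        return true_idx
    other = rng.integers(k - 1)
    return other if other < true_idx else other + 1   # skip the true index

def debias_counts(reports, k, epsilon):
    """Invert the randomization to get unbiased estimates of the true item counts."""
    n = len(reports)
    p = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    q = (1.0 - p) / (k - 1)            # prob of reporting any specific wrong item
    observed = np.bincount(reports, minlength=k)
    return (observed - n * q) / (p - q)

rng = np.random.default_rng(0)
true_choices = rng.integers(0, 5, size=100_000)            # e.g. 5 candidate prompts
noisy = [randomized_response(c, 5, epsilon=2.0, rng=rng) for c in true_choices]
print(debias_counts(np.array(noisy), 5, epsilon=2.0))      # ≈ true histogram
```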

Apple's New AI Breakthrough: Fine-Grained Control of Generative Models with Activation Transport (AcT)

2025-04-10

Apple machine learning researchers have developed Activation Transport (AcT), a technique that offers fine-grained control over large generative models, including LLMs and text-to-image diffusion models, without the resource-intensive training required by RLHF or fine-tuning. AcT steers model activations using optimal transport theory, achieving modality-agnostic control with minimal computational overhead. Experiments show significant improvements in toxicity mitigation and truthfulness induction for LLMs, and in stylistic control for image generation. AcT paves the way for safer and more reliable generative models.

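For a flavor of transport-based activation steering, the sketch below fits a closed-form one-dimensional optimal transport map between Gaussian approximations of "source" and "target" activation distributions and applies it per hidden unit with a strength parameter. It is a simplified stand-in for the paper's estimator; all names, shapes, and data here are assumptions.

```python
# Hedged sketch: per-unit affine map T(a) = mu_t + (sigma_t / sigma_s) * (a - mu_s),
# the optimal transport map between two 1-D Gaussians, applied with strength lam.
import numpy as np

def fit_affine_ot_map(src_acts, tgt_acts, eps=1e-6):
    """Fit the per-dimension Gaussian-to-Gaussian transport map from samples."""
    mu_s, mu_t = src_acts.mean(0), tgt_acts.mean(0)
    sd_s, sd_t = src_acts.std(0) + eps, tgt_acts.std(0) + eps
    scale = sd_t / sd_s
    shift = mu_t - scale * mu_s
    return scale, shift

def steer(acts, scale, shift, lam=1.0):
    """Interpolate between the original activations and their transported version."""
    transported = acts * scale + shift
    return (1 - lam) * acts + lam * transported

# Toy usage: activations collected from prompts with and without the desired property.
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(512, 64))    # e.g. "undesired" condition activations
tgt = rng.normal(0.5, 0.8, size=(512, 64))    # e.g. "desired" condition activations
scale, shift = fit_affine_ot_map(src, tgt)
steered = steer(src, scale, shift, lam=0.7)   # partial steering applied at inference
```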

SeedLM: A Novel LLM Weight Compression Method Using Pseudo-Random Number Generators

2025-04-06

Large Language Models (LLMs) are hindered by high runtime costs, which limits widespread deployment. Apple researchers introduce SeedLM, a post-training compression method that uses seeds of a pseudo-random number generator to encode and compress model weights. During inference, SeedLM uses a Linear Feedback Shift Register (LFSR) to efficiently generate a random matrix, which is linearly combined with a small set of compressed coefficients to reconstruct each weight block. This trades compute for fewer memory accesses, reducing memory traffic and exploiting idle compute cycles to speed up memory-bound workloads. Unlike state-of-the-art methods that require calibration data, SeedLM is data-free and generalizes well across diverse tasks. Experiments on the challenging Llama 3 70B model show zero-shot accuracy at 4- and 3-bit compression that matches or exceeds state-of-the-art methods while remaining comparable to the FP16 baseline. FPGA tests show that 4-bit SeedLM approaches a 4x speed-up over an FP16 Llama 2/3 baseline as model size increases.

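To make the reconstruction step concrete, here is a hedged sketch of the SeedLM idea: each weight block is approximated as a pseudo-random matrix, expanded on the fly from a stored seed by an LFSR, multiplied by a few stored coefficients, so only the seed and coefficients need to be kept in memory. The block size, LFSR taps, seed search range, and coefficient handling below are illustrative choices, not the paper's exact configuration (which also quantizes the coefficients to low bit widths).

```python
# Hedged sketch of SeedLM-style weight-block compression with an LFSR-expanded basis.
import numpy as np

def lfsr_bits(seed, n_bits, taps=(16, 14, 13, 11)):
    """16-bit Fibonacci LFSR: a cheap, deterministic pseudo-random bit stream."""
    state, out = seed & 0xFFFF, []
    for _ in range(n_bits):
        bit = 0
        for t in taps:
            bit ^= (state >> (t - 1)) & 1
        state = ((state << 1) | bit) & 0xFFFF
        out.append(bit)
    return np.array(out)

def random_matrix(seed, rows, cols):
    """Expand a seed into a {-1, +1} matrix; only the seed itself is stored."""
    return (2.0 * lfsr_bits(seed, rows * cols) - 1.0).reshape(rows, cols)

def compress_block(w, rows=8, rank=4, n_seeds=256):
    """Pick the seed whose random basis best reconstructs the block (least squares)."""
    best = None
    for seed in range(1, n_seeds + 1):
        U = random_matrix(seed, rows, rank)
        t, *_ = np.linalg.lstsq(U, w, rcond=None)
        err = np.linalg.norm(U @ t - w)
        if best is None or err < best[0]:
            best = (err, seed, t)
    return best[1], best[2]                   # store only the seed and coefficients

w = np.random.default_rng(0).normal(size=8)   # one 8-value weight block
seed, t = compress_block(w)
w_hat = random_matrix(seed, 8, 4) @ t         # reconstruction done at inference time
```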