Improving LLM Fine-tuning Through Iterative Data Curation

2025-08-08

Researchers significantly improved the performance of large language models (LLMs) by iteratively curating their training data. Experiments involved two LLMs of different sizes (Gemini Nano-1 and Nano-2) on tasks of different complexity, starting from ~100K crowdsourced annotations that suffered from severe class imbalance (roughly 95% benign). Through repeated rounds of expert curation and model fine-tuning, performance improved substantially: the curated datasets converged to approximately 40% positive examples, and model agreement with experts reached a Cohen's Kappa of ~0.81 on the lower-complexity task and ~0.78 on the higher-complexity task, approaching expert-level performance and highlighting the crucial role of high-quality data in LLM training.
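
The summary only names the loop at a high level; as a purely generic illustration of what such a curate-and-retrain cycle can look like (not the paper's actual procedure), here is a minimal Java sketch in which every type and helper is hypothetical:

```java
import java.util.List;

// Generic sketch of an iterative data-curation loop (not the paper's exact
// procedure): fine-tune on the current labels, surface examples where the model
// and the crowdsourced labels disagree, have experts relabel them, and repeat.
// Every type and helper here is hypothetical.
final class CurationLoop {
    interface Example {}
    interface Model {}

    interface Pipeline {
        Model fineTune(List<Example> curated);                          // train on the current data
        List<Example> findDisagreements(Model m, List<Example> pool);   // model vs. crowd label
        List<Example> expertRelabel(List<Example> uncertain);           // costly, high-quality labels
        double expertAgreement(Model m);                                // e.g. Cohen's Kappa on an expert-labeled held-out set
    }

    static Model curate(Pipeline p, List<Example> curated, List<Example> pool, int maxRounds) {
        Model model = p.fineTune(curated);
        for (int round = 0; round < maxRounds && p.expertAgreement(model) < 0.8; round++) {
            List<Example> relabeled = p.expertRelabel(p.findDisagreements(model, pool));
            curated.addAll(relabeled);        // the curated set grows and becomes more balanced
            model = p.fineTune(curated);      // retrain on the improved data
        }
        return model;
    }
}
```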


Earthquake Early Warning: The Speed-Accuracy Tradeoff in Magnitude Estimation

2025-07-23

A major challenge in Earthquake Early Warning (EEW) systems is estimating earthquake magnitude in real time. Magnitude determines the extent of shaking and therefore who needs to be warned: underestimation risks missed warnings, while overestimation leads to false alarms and erodes public trust. The key difficulty is balancing speed against accuracy, since the data available in the first seconds of an earthquake is limited, but waiting for more data reduces warning time. Over the past three years, we've significantly improved magnitude estimation, reducing the median absolute error from 0.50 to 0.25 magnitude units. Our accuracy now rivals, and in some cases surpasses, established seismic networks.


MUVERA: Efficient Multi-Vector Retrieval

2025-06-26

Modern information retrieval relies on neural embedding models; multi-vector models offer higher accuracy than single-vector ones, but their retrieval cost makes them inefficient at scale. Researchers introduce MUVERA, a novel algorithm that reduces multi-vector retrieval to standard single-vector maximum inner product search (MIPS) by constructing Fixed Dimensional Encodings (FDEs), significantly improving efficiency without sacrificing accuracy. An open-source implementation is available on GitHub.
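
Roughly, an FDE routes each token embedding to a bucket determined by the signs of a few random hyperplanes and aggregates per bucket, so a single inner product between the resulting fixed-size vectors approximates the multi-vector similarity. The sketch below illustrates that idea only; the class and method names are invented, and details such as repetitions, final projections, and MUVERA's exact aggregation rules are omitted:

```java
import java.util.Random;

// Illustrative sketch of a Fixed Dimensional Encoding (FDE): each token vector
// is routed to one of 2^k buckets given by the signs of k random hyperplanes,
// vectors are aggregated per bucket, and the buckets are concatenated into one
// fixed-size vector, so a single inner product between a query FDE and a
// document FDE approximates the multi-vector (MaxSim-style) similarity.
final class FdeSketch {
    private final float[][] hyperplanes;   // k random directions -> 2^k buckets
    private final int dim;

    FdeSketch(int dim, int numHyperplanes, long seed) {
        this.dim = dim;
        Random rng = new Random(seed);
        hyperplanes = new float[numHyperplanes][dim];
        for (float[] h : hyperplanes) {
            for (int i = 0; i < dim; i++) {
                h[i] = (float) rng.nextGaussian();
            }
        }
    }

    private int bucket(float[] v) {
        int b = 0;
        for (float[] h : hyperplanes) {
            float dot = 0f;
            for (int i = 0; i < dim; i++) dot += h[i] * v[i];
            b = (b << 1) | (dot >= 0 ? 1 : 0);
        }
        return b;
    }

    // sumAggregate = true for queries (sum per bucket), false for documents (mean).
    float[] encode(float[][] tokenVectors, boolean sumAggregate) {
        int numBuckets = 1 << hyperplanes.length;
        float[] fde = new float[numBuckets * dim];
        int[] counts = new int[numBuckets];
        for (float[] v : tokenVectors) {
            int b = bucket(v);
            counts[b]++;
            for (int i = 0; i < dim; i++) fde[b * dim + i] += v[i];
        }
        if (!sumAggregate) {
            for (int b = 0; b < numBuckets; b++) {
                if (counts[b] > 0) {
                    for (int i = 0; i < dim; i++) fde[b * dim + i] /= counts[b];
                }
            }
        }
        return fde;   // one fixed-size vector, usable with any standard MIPS index
    }
}
```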


Veo Gen 3: Generalizing Video Generation

2025-05-16

Google's latest breakthrough in video generation, Veo, now boasts a third generation capable of generalizing across diverse tasks. Trained on millions of high-quality 3D synthetic assets, Veo excels at novel view synthesis, transforming product images into consistent 360° videos. Importantly, this approach generalizes effectively across furniture, apparel, electronics, and more, accurately capturing complex lighting and material interactions—a significant improvement over previous generations.

AI

Google Boosts Developer Productivity with Hybrid Semantic ML Code Completion

2025-05-15

Google researchers have developed a novel Transformer-based hybrid code completion system that combines machine learning (ML) with rule-based semantic engines (SEs) to significantly improve developer productivity. The system integrates ML and SEs in three ways: 1) re-ranking the SE's single-token suggestions with the ML model; 2) generating single- and multi-line completions with ML and checking their correctness with the SE; and 3) using ML to continue single-token semantic suggestions into single- and multi-line completions. A three-month study with 10,000+ Google-internal developers showed a 6% reduction in coding iteration time with single-line ML completion, and over 3% of new code is now generated by accepting ML completion suggestions. The system supports eight programming languages, and its semantic checks help ensure code correctness, significantly boosting developer trust and efficiency.
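
As a rough illustration of the first integration mode (re-ranking the semantic engine's suggestions with ML scores), here is a hedged Java sketch; the record, interface, and method names are invented for this example and do not reflect Google's internal system:

```java
import java.util.Comparator;
import java.util.List;

// Illustrative re-ranking of a semantic engine's single-token suggestions with
// an ML model's scores: the SE guarantees candidates are semantically valid
// (they resolve and type-check), and the ML model only decides their order.
// The Suggestion record and MlScorer interface are invented for this sketch.
final class HybridReranker {
    record Suggestion(String token, double mlLogProb) {}

    // Hypothetical scorer: log-probability of `candidate` continuing `prefix`.
    interface MlScorer { double logProb(String prefix, String candidate); }

    static List<Suggestion> rerank(String prefix, List<String> seCandidates, MlScorer scorer) {
        return seCandidates.stream()
                .map(c -> new Suggestion(c, scorer.logProb(prefix, c)))
                .sorted(Comparator.comparingDouble((Suggestion s) -> s.mlLogProb).reversed())
                .toList();
    }
}
```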

Development

Whisper's Embeddings Surprisingly Align with Human Brain Activity During Speech

2025-03-26

A study reveals a surprising alignment between OpenAI's Whisper speech recognition model and neural activity in the human brain during natural conversations. Comparing Whisper's embeddings to brain activity in regions such as the inferior frontal gyrus (IFG) and superior temporal gyrus (STG), researchers found that language embeddings peaked before speech embeddings during speech production, with the order reversed during comprehension. This suggests that Whisper, despite not being designed with brain mechanisms in mind, captures key aspects of language processing. The findings also highlight a 'soft hierarchy' in the brain's language processing: higher-order areas like the IFG prioritize semantic and syntactic information but also process lower-level auditory features, while lower-order areas like the STG prioritize acoustic and phonemic processing but also capture word-level information.

AI

Groundbreaking Research: The Power Team Behind the Success

2025-03-03

This paper is the result of a close collaboration with Asaf Aharoni, Avinatan Hassidim, and Danny Vainstein. The team also extends gratitude to dozens of individuals from Google Research, Google DeepMind, and Google Search, including YaGuang Li and Blake Hechtman, for their reviews, insightful discussions, valuable feedback, and support. Their contributions were crucial to the completion of this research.

AI

Google AI Breakthrough: A Giant Team Effort Revealed in Acknowledgements

2025-02-19

This paper's acknowledgements reveal a massive collaborative effort involving numerous researchers from Google Research, Google DeepMind, and Google Cloud AI, along with collaborators from the Fleming Initiative, Imperial College London, Houston Methodist Hospital, Sequome, and Stanford University. The extensive list highlights the collaborative nature of the research and thanks many scientists who provided technical and expert feedback, as well as numerous Google internal teams providing support across product, engineering, and management. The sheer length of the acknowledgements underscores the massive team effort behind large-scale AI projects.

AI

OMG! Nearly All Binary Searches and Mergesorts Are Broken

2025-01-11

Google software engineer Joshua Bloch revealed a nearly two-decade-old bug lurking in binary search implementations, including those in the JDK and Jon Bentley's 'Programming Pearls'. The bug stems from the line `int mid = (low + high) / 2;`: when the sum of `low` and `high` exceeds the maximum positive `int` value, the addition overflows to a negative number and indexing with `mid` throws an array index out-of-bounds exception. The failure only manifests on very large arrays (roughly a billion elements or more), making it particularly dangerous in today's big-data world. The article explores various fixes and emphasizes that bugs can persist even with rigorous testing and proofs, urging programmers to remain cautious and humble.
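
The fixes discussed in the original post compute the midpoint without ever summing the two bounds into an `int`; a self-contained Java version of the corrected search, mirroring the JDK-style fix:

```java
// Binary search over a sorted int array using the overflow-safe midpoint.
// The broken variant computes mid as (low + high) / 2, which overflows to a
// negative value once low + high exceeds Integer.MAX_VALUE (possible for
// arrays of roughly a billion elements or more), so a[mid] then throws
// ArrayIndexOutOfBoundsException.
final class SafeBinarySearch {
    static int binarySearch(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) / 2);   // safe: the difference cannot overflow
            // equivalently: int mid = (low + high) >>> 1;  (unsigned shift reads the
            // overflowed sum back as the intended positive value)
            int midVal = a[mid];
            if (midVal < key) {
                low = mid + 1;
            } else if (midVal > key) {
                high = mid - 1;
            } else {
                return mid;                        // key found
            }
        }
        return -(low + 1);                         // not found: encodes the insertion point
    }
}
```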


Google Expands Global Solar Potential Assessment Using Satellite Imagery and Machine Learning

2024-12-19

Google researchers have expanded the Google Maps Platform Solar API's coverage in the Global South by applying machine learning models to satellite imagery to generate high-resolution digital surface models and roof segmentation maps. This overcomes limitations of traditional data acquisition and processing methods, providing solar potential assessments for 1.25 billion buildings globally and helping accelerate the adoption of renewable energy worldwide. Using satellite data also increases how often the data can be refreshed and reduces costs, which is particularly valuable in data-scarce regions.
