Database Query Engines: Push vs. Pull

2025-04-16

This article compares push-based and pull-based query engines in databases. Pull-based engines follow the iterator model: execution is consumer-driven, and each operator requests rows from its inputs on demand. Push-based engines are producer-driven: each operator pushes rows to its downstream operators as it produces them. Push-based engines handle DAG-shaped query plans well (e.g., a SQL WITH clause referenced more than once), because one operator can push each row to several downstream operators, avoiding redundant computation and unnecessary buffering. Pull-based engines, however, have the advantage for operators such as merge joins and LIMIT, where the consumer needs to control which input advances or when to stop. The article also examines cache efficiency, code simplicity, and the suitability of each model in different scenarios, and concludes that neither is universally superior; the choice depends on specific requirements.
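To make the contrast concrete, here is a minimal sketch in Python; it is illustrative only and not taken from any particular engine. All names (pull_filter, pull_limit, push_scan, push_filter, and the two demo functions) are hypothetical. The pull half uses generators as iterators, so LIMIT stops the scan early simply by not asking for more rows. The push half passes each scanned row to every registered downstream callback, so a single scan feeds two consumers.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Row = int
Sink = Callable[[Row], None]

# Pull model: each operator is an iterator, and the consumer drives execution.
def pull_filter(pred: Callable[[Row], bool], child: Iterator[Row]) -> Iterator[Row]:
    for row in child:
        if pred(row):
            yield row

def pull_limit(n: int, child: Iterator[Row]) -> Iterator[Row]:
    if n <= 0:
        return
    for count, row in enumerate(child, start=1):
        yield row
        if count >= n:
            return  # stop pulling: upstream operators do no further work

# Push model: the producer drives execution and calls its downstream sinks.
def push_scan(rows: Iterable[Row], sinks: List[Sink]) -> None:
    for row in rows:
        for sink in sinks:  # one scan can feed several consumers (a DAG)
            sink(row)

def push_filter(pred: Callable[[Row], bool], sink: Sink) -> Sink:
    def on_row(row: Row) -> None:
        if pred(row):
            sink(row)
    return on_row

def run_pull_demo() -> Tuple[List[Row], int]:
    scanned: List[Row] = []

    def source() -> Iterator[Row]:
        for r in range(1, 1000):
            scanned.append(r)  # record how many rows the scan really produced
            yield r

    out = list(pull_limit(2, pull_filter(lambda r: r % 2 == 0, source())))
    return out, len(scanned)

def run_push_demo() -> Tuple[List[Row], List[Row]]:
    evens: List[Row] = []
    big: List[Row] = []
    push_scan(
        range(1, 7),
        [
            push_filter(lambda r: r % 2 == 0, evens.append),
            push_filter(lambda r: r > 4, big.append),
        ],
    )
    return evens, big
```

In this sketch, run_pull_demo shows the scan stopping as soon as LIMIT is satisfied, while run_push_demo shows one pass over the input serving both filters. Giving the push version an early stop would need an extra signal from the sink back to the producer, which this sketch deliberately leaves out.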

Development query engine push-pull