Database Query Optimizer: The Gap Between Ideal and Reality

2025-07-04

Database query optimizers aim to select the cheapest query plan, but they rely on cost estimates, which in turn depend on selectivity estimates and on the assumed cost of basic resources (I/O, CPU, etc.). Errors in either input can lead to a poor plan choice. Experiments reveal that for simple SELECT queries, the accuracy of the optimizer's plan selection varies greatly depending on data distribution. With uniform datasets, bitmap scans generally outperform index scans; with other distributions, the optimizer is more prone to selecting suboptimal index scans. This demonstrates that even for simple queries, the optimizer's cost model struggles to adapt to diverse data distributions and hardware environments. Cost-based planning is still the best available approach, but making it more robust and adaptable remains a significant challenge.
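To make the dependence on selectivity concrete, here is a minimal toy sketch of a cost-based plan choice. It is not any real optimizer's cost model: the formulas are deliberately simplified, the function and class names are hypothetical, and the parameter names and default values are only borrowed from PostgreSQL-style settings for familiarity. The point is just that the same query flips between plans depending on the selectivity the planner believes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostParams:
    """Assumed unit costs (illustrative values, not measured on any hardware)."""
    seq_page_cost: float = 1.0      # cost of reading one page sequentially
    random_page_cost: float = 4.0   # cost of reading one page at random
    cpu_tuple_cost: float = 0.01    # cost of processing one row


def seq_scan_cost(pages: int, rows: int, p: CostParams) -> float:
    # Read every page once, then process every row.
    return pages * p.seq_page_cost + rows * p.cpu_tuple_cost


def index_scan_cost(rows: int, selectivity: float, p: CostParams) -> float:
    # Pessimistic simplification: one random page read per matching row.
    matched = rows * selectivity
    return matched * (p.random_page_cost + p.cpu_tuple_cost)


def choose_plan(pages: int, rows: int, selectivity: float,
                p: CostParams = CostParams()) -> str:
    """Return the name of the plan with the lowest estimated cost."""
    costs = {
        "seq_scan": seq_scan_cost(pages, rows, p),
        "index_scan": index_scan_cost(rows, selectivity, p),
    }
    return min(costs, key=costs.get)


if __name__ == "__main__":
    pages, rows = 10_000, 1_000_000
    # The planner's guess versus a much higher actual selectivity.
    for label, sel in [("estimated", 0.001), ("actual", 0.05)]:
        print(label, sel, choose_plan(pages, rows, sel))
```

In this toy model, a planner that believes the selectivity is 0.001 picks the index scan, while at a selectivity of 0.05 the sequential scan is far cheaper. If the estimate is wrong in that direction, the planner commits to the index scan anyway, which is the kind of misjudgment the summary describes for non-uniform data.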

Development