Static Search Trees: 40x Faster Than Binary Search

2025-01-01

This blog post walks through the implementation and optimization of a static search tree (S+ tree) for high-throughput searching of sorted data, achieving a 40x speedup over binary search. Starting from the code published on Algorithmica, the author optimizes the search step by step with SIMD instructions and query batching. Inspecting the generated assembly reveals further instructions that can be removed. Several tree layouts and memory strategies are also explored. The final version reduces query time from 1150 ns to 24 ns on a 1 GB dataset.
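To make the idea concrete, here is a minimal Python sketch of the lookup logic only. It is not the author's implementation and says nothing about performance. The class name `StaticSearchTree`, the node size `B`, the `INF` sentinel, and the choice to store each child's maximum key in the parent are all assumptions made for illustration. Each node holds `B` keys, and a lookup counts how many keys in the node are smaller than the query, which is the step that a SIMD compare followed by a popcount would perform in a tuned version. The batched variant moves a whole group of queries down the tree one layer at a time.

```python
from bisect import bisect_left

INF = float("inf")  # sentinel larger than any real key (assumption of this sketch)


def binary_search_lower_bound(keys, q):
    """Baseline: smallest key >= q via binary search, or INF if none."""
    i = bisect_left(keys, q)
    return keys[i] if i < len(keys) else INF


class StaticSearchTree:
    """Hypothetical layered static tree with B keys per node."""

    def __init__(self, sorted_keys, B=16):
        assert B >= 2
        self.B = B
        # Leaf layer: the sorted keys plus at least one sentinel,
        # padded so the layer splits evenly into nodes of B keys.
        leaves = list(sorted_keys) + [INF]
        leaves += [INF] * (-len(leaves) % B)
        layers = [leaves]
        # Each upper layer stores the maximum key of every node below it.
        while len(layers[-1]) > B:
            below = layers[-1]
            maxes = [below[i + B - 1] for i in range(0, len(below), B)]
            maxes += [INF] * (-len(maxes) % B)
            layers.append(maxes)
        layers.reverse()  # root layer first
        self.layers = layers

    def _step(self, layer, node, q):
        # Count keys in the node that are < q. The count is the index of
        # the first key >= q, with no data-dependent branch.
        B = self.B
        c = sum(k < q for k in layer[node * B:(node + 1) * B])
        return node * B + c

    def lower_bound(self, q):
        """Smallest key >= q, or INF if there is none."""
        node = 0
        for layer in self.layers:
            node = self._step(layer, node, q)
        # After the leaf layer, `node` is a position in the leaf array.
        return self.layers[-1][node]

    def lower_bound_batch(self, queries):
        """Advance all queries through the tree one layer at a time."""
        nodes = [0] * len(queries)
        for layer in self.layers:
            nodes = [self._step(layer, n, q) for n, q in zip(nodes, queries)]
        return [self.layers[-1][n] for n in nodes]
```

For example, `StaticSearchTree([1, 3, 5, 7], B=2).lower_bound(4)` returns `5`, the same answer as `binary_search_lower_bound([1, 3, 5, 7], 4)`. In a compiled implementation, processing queries layer by layer is what would let the memory accesses of different queries overlap; in Python it only shows the control flow.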