Can Databases Replace Caches Entirely?

2025-08-31

This post explores the possibility of databases completely replacing caches. While databases offer some caching capabilities like buffer pools and read replicas, caches excel at low-latency data access, especially for specific data subsets and pre-computed data. To replace caches, databases need to address several challenges: efficiently handling numerous read replicas, enabling partial read replicas, prioritizing specific data, and implementing efficient incremental view maintenance (IVM). The author suggests that combining IVM with partial read replicas might eventually allow databases to partially replace caches, but a gap remains.
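To make the IVM idea concrete, here is a minimal sketch, not taken from the post, of a per-customer totals "view" that is updated from each change rather than recomputed. The names `TotalsView`, `recompute`, and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class TotalsView:
    """Toy materialized view: sum of order amounts per customer."""

    def __init__(self):
        self.sums = defaultdict(int)
        self.counts = defaultdict(int)  # rows per customer, so empty groups can be dropped

    def apply(self, op, customer, amount):
        # Apply one base-table change as a delta instead of rescanning the table.
        sign = 1 if op == "insert" else -1
        self.sums[customer] += sign * amount
        self.counts[customer] += sign
        if self.counts[customer] == 0:
            del self.sums[customer]
            del self.counts[customer]

    def read(self):
        return dict(self.sums)


def recompute(rows):
    """Full recomputation, the expensive path that IVM tries to avoid."""
    totals = defaultdict(int)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


# Hypothetical change stream against a base table of (customer, amount) rows.
changes = [
    ("insert", "alice", 30),
    ("insert", "bob", 20),
    ("insert", "alice", 5),
    ("delete", "bob", 20),
]

view = TotalsView()
rows = []
for op, customer, amount in changes:
    view.apply(op, customer, amount)
    if op == "insert":
        rows.append((customer, amount))
    else:
        rows.remove((customer, amount))
```

Reading `view.read()` costs the same regardless of how many base rows exist, which is the property a cache normally provides.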

(avi.im)
Development read replicas

SQLite's WAL Mode: Durability vs. Performance Trade-off

2025-08-24

SQLite's WAL (Write-Ahead Log) mode, often used for higher write throughput, can trade away some durability compared to the default journal mode. The `synchronous` pragma controls how often fsync is called. The post treats NORMAL as the default; note that upstream SQLite documents FULL as the compile-time default, while some platform builds ship with NORMAL, so the effective setting depends on the build. In NORMAL mode, the WAL file is synced before each checkpoint and the database file after it, but most transactions commit without a sync. For applications where durability isn't critical, NORMAL is sufficient. For guaranteed durability across power loss, `synchronous=FULL` adds a WAL file sync after each transaction commit, increasing durability at the cost of write speed. This explanation, prompted by concerns about SurrealDB potentially sacrificing durability for benchmark performance, clarifies SQLite's approach.
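As a small sketch of the configuration discussed above, the following uses Python's standard `sqlite3` module to enable WAL and set `synchronous=FULL` explicitly, then reads both settings back instead of relying on whatever default the build provides. The helper name `open_durable` and the file name `demo.db` are made up for the example.

```python
import os
import sqlite3
import tempfile


def open_durable(path):
    """Open a database in WAL mode with synchronous=FULL and report what took effect."""
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns the mode actually in use after the change.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous=FULL")
    # PRAGMA synchronous reports an integer: 0=OFF, 1=NORMAL, 2=FULL, 3=EXTRA.
    level = conn.execute("PRAGMA synchronous").fetchone()[0]
    return conn, mode, level


tmpdir = tempfile.mkdtemp()
conn, mode, level = open_durable(os.path.join(tmpdir, "demo.db"))
conn.execute("CREATE TABLE t (x INTEGER)")
with conn:  # commits on success; with FULL, the commit syncs the WAL file
    conn.execute("INSERT INTO t VALUES (1)")
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
conn.close()
```

The `synchronous` setting is per connection, so it has to be applied each time a connection is opened.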

(avi.im)

SQLite's WAL Mode Checksum Issue: Silent Data Loss

2025-07-25

This post examines a flaw in SQLite's checksum mechanism in Write-Ahead Logging (WAL) mode. When a checksum mismatch occurs in a WAL frame, SQLite silently discards the faulty frame and all subsequent frames, even if those later frames are not corrupt. This design is intentional, but it can cause silent data loss. The author analyzes the underlying reasons and proposes that SQLite should raise an error when it detects corruption instead of silently discarding data, which would improve data integrity. The discussion also touches on SQLite's use in embedded systems and mobile devices, where corruption is more prevalent.
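The behavior described above can be illustrated with a toy model. This is not SQLite's actual WAL format or checksum algorithm: `zlib.crc32` stands in for the real checksum, and `make_wal` and `recover` are hypothetical helpers. The sketch only shows how a running checksum causes everything after the first bad frame to be dropped.

```python
import zlib


def make_wal(payloads):
    """Build toy frames; each stored checksum covers all frames written so far."""
    frames = []
    running = 0
    for payload in payloads:
        running = zlib.crc32(payload, running)
        frames.append((payload, running))
    return frames


def recover(frames):
    """Replay frames until the first checksum mismatch, then stop silently."""
    valid = []
    running = 0
    for payload, stored in frames:
        running = zlib.crc32(payload, running)
        if running != stored:
            break  # the bad frame and every later frame are discarded
        valid.append(payload)
    return valid


wal = make_wal([b"page-1", b"page-2", b"page-3", b"page-4"])
# Corrupt only the second frame's payload; frames 3 and 4 are untouched.
wal[1] = (b"page-X", wal[1][1])
recovered = recover(wal)
```

In this model, frames 3 and 4 are intact on disk but are never replayed, and `recover` reports nothing to the caller, which mirrors the complaint in the post.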

(avi.im)
Development

SQLite: The Unbelievable Database Legend

2024-12-30

SQLite, the world's most widely deployed database, is maintained by a three-person team that does not accept external contributions, yet it has spread across the world on the strength of its performance and stability. Created for use on a US warship to solve server downtime issues, it has become the cornerstone of trillions of databases. SQLite is not open source in the licensing sense; it is public domain software, with fewer restrictions than any open source license. Its rigorous testing process, which even simulates extreme situations such as operating system crashes, underpins its high reliability. Its business model is also unusual: revenue comes from paid support and memberships. The legend of SQLite lies not only in its technical prowess but also in the persistence and innovation behind it.

(avi.im)
Development legend

Bloom Filters: The Secret to Making SQLite 10x Faster

2024-12-22

Researchers used Bloom filters to make SQLite analytical queries 10x faster. They found that SQLite's nested loop joins were inefficient, with much of the time spent on B-tree probes. Checking a Bloom filter before the join quickly rules out rows that cannot match (a Bloom filter produces occasional false positives but never false negatives), so B-tree probes run only on potential matches and their number drops significantly. Bloom filters have minimal memory overhead and were easy to integrate into SQLite's existing query engine, resulting in a significant performance boost. This improvement shipped in SQLite v3.38.0.
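The probe-skipping idea can be sketched in a few lines. This is an illustrative model and not SQLite's implementation: a Python dict stands in for the B-tree index, and `BloomFilter` and `join_with_bloom` are hypothetical names with arbitrarily chosen sizes.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; one byte per slot for readability (real ones pack bits)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive several slot positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def join_with_bloom(outer_rows, inner_index):
    """Nested loop join that consults the Bloom filter before each index probe."""
    bloom = BloomFilter()
    for key in inner_index:
        bloom.add(key)

    matches, probes = [], 0
    for key, value in outer_rows:
        if not bloom.might_contain(key):
            continue  # skip the expensive probe entirely
        probes += 1
        inner = inner_index.get(key)  # stand-in for a B-tree probe
        if inner is not None:
            matches.append((key, value, inner))
    return matches, probes


inner_index = {k: f"dim-{k}" for k in range(0, 1000, 100)}  # 10 keys
outer_rows = [(k, f"fact-{k}") for k in range(1000)]
matches, probes = join_with_bloom(outer_rows, inner_index)
plain = [(k, v, inner_index[k]) for k, v in outer_rows if k in inner_index]
```

Because the filter never reports a present key as absent, the join result is identical to the plain join; only the number of probes changes.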

(avi.im)

Rust-based SQLite Rewrite Achieves 100x Tail Latency Reduction

2024-12-16

Researchers from the University of Helsinki and Cambridge have rewritten SQLite in Rust, creating Limbo, a project that combines asynchronous I/O and io_uring with storage disaggregation. Limbo achieves up to a 100x reduction in tail latency, which is particularly beneficial in multi-tenant serverless environments. The key improvement comes from replacing synchronous bytecode instructions with asynchronous counterparts, which eliminates blocking and enhances concurrency. The improvements are most pronounced at high percentiles, which makes Limbo well suited to applications demanding high reliability.
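The effect of non-blocking reads on tail latency can be shown with a generic sketch. It uses Python's `asyncio` rather than Rust or io_uring, and it does not reflect Limbo's code; `read_page`, `run_query`, and the delays are invented for illustration. The point is only that a slow read for one tenant does not hold up another tenant's query.

```python
import asyncio


async def read_page(delay):
    """Simulated page read; awaiting yields control instead of blocking the thread."""
    await asyncio.sleep(delay)


async def run_query(name, delay, log):
    await read_page(delay)
    log.append(name)  # record completion order


async def main():
    log = []
    # The slow query is started first, but the fast one is not stuck behind it.
    await asyncio.gather(
        run_query("slow-tenant", 0.05, log),
        run_query("fast-tenant", 0.0, log),
    )
    return log


completion_order = asyncio.run(main())
```

With a blocking read in place of the awaited one, the fast query would have to wait for the slow read to finish, which is the kind of delay that shows up at high percentiles.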

(avi.im)
Development Asynchronous I/O