Five Ways to Model Polymorphic Data in Relational Databases

2025-07-09

This article explores five approaches to modeling polymorphic data in relational databases: single table, nullable foreign keys, tagged union, child-to-parent foreign keys, and JSON. Each method has its pros and cons; for example, the single table approach is simple but can be slow, while JSON is easily extensible but lacks data validation. The author suggests choosing the method that's easiest to read, maintain, and debug, and avoiding premature optimization.
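As a rough illustration of the nullable foreign keys approach, here is a minimal Go sketch of one row in such a table. The `Attachment` type, its `PostID` and `CommentID` fields, and the rule that exactly one owner must be set are all assumptions made for this example, not details from the article. The `Validate` method mirrors the kind of check a database constraint could enforce.

```go
package main

import (
	"errors"
	"fmt"
)

// Attachment models one row of a hypothetical table whose rows belong to
// either a post or a comment. A nil pointer stands in for SQL NULL.
type Attachment struct {
	ID        int64
	PostID    *int64 // set when the owner is a post
	CommentID *int64 // set when the owner is a comment
}

// ErrOwner is returned when a row has zero or two owners.
var ErrOwner = errors.New("exactly one of PostID or CommentID must be set")

// Validate enforces the "exactly one foreign key is non-NULL" rule that
// the nullable-foreign-keys layout relies on.
func (a Attachment) Validate() error {
	if (a.PostID == nil) == (a.CommentID == nil) {
		return ErrOwner
	}
	return nil
}

func main() {
	postID := int64(7)
	ok := Attachment{ID: 1, PostID: &postID}
	bad := Attachment{ID: 2} // no owner at all
	fmt.Println(ok.Validate(), bad.Validate())
}
```

The trade-off this sketch hints at: every new kind of owner adds another nullable column and widens the validation rule.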


Multiple Discoveries: The Case of Prolly Trees

2025-07-01

Prolly trees, a novel data structure crucial to Dolt, weren't invented once, but at least four times independently. From Avery Pennarun's 2009 bup project (which predates even Noms), to Noms' 2015 coining of the term, to Inria's 2019 'Merkle Search Trees,' and DePaul University's 2020 'Content-Defined Merkle Trees,' the same fundamental data structure emerged repeatedly in different contexts. This highlights the common phenomenon of multiple discovery in science and underscores the role of demand in technological innovation. The authors, from DoltHub, discuss this phenomenon and its implications for future technology, using their own experience with prolly trees as a case study.

Development Multiple Discovery

Dolt's go-mysql-server at Five: A Query's Journey

2025-04-27

This post reflects on five years of Dolt using go-mysql-server and details the inner workings of its SQL engine. It follows a query through each stage: parsing, binding, plan simplification, join exploration, cost-based optimization, execution, and result spooling. Dolt employs a left-recursive parser and bottom-up dynamic programming for query plan optimization, using a cost model to select the cheapest execution strategy. The post also discusses memory management and future optimizations, such as unifying intermediate representations and reducing memory churn.
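To make "bottom-up dynamic programming" with a cost model concrete, here is a minimal Go sketch of join ordering over subsets of tables. It is not go-mysql-server's implementation: the function name `bestPlanCost`, the single fixed `selectivity`, and the toy cost formula (rows on the left times rows on the right) are all assumptions chosen to keep the example short.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// bestPlanCost returns the cheapest total cost of joining all tables,
// trying every way to split each subset into a left and right input.
// Toy model (an assumption): joining two inputs costs rows(left)*rows(right)
// and produces rows(left)*rows(right)*selectivity rows.
func bestPlanCost(tableRows []float64, selectivity float64) float64 {
	n := len(tableRows)
	full := 1<<n - 1
	rows := make([]float64, full+1) // estimated row count per subset
	cost := make([]float64, full+1) // best known cost per subset

	// Subsets are visited in increasing bitmask order, so every proper
	// submask has already been solved when its superset is reached.
	for set := 1; set <= full; set++ {
		if bits.OnesCount(uint(set)) == 1 {
			rows[set] = tableRows[bits.TrailingZeros(uint(set))]
			cost[set] = 0 // scanning a base table is free in this toy model
			continue
		}
		cost[set] = math.Inf(1)
		// Enumerate every non-empty proper submask as the left input.
		for left := (set - 1) & set; left > 0; left = (left - 1) & set {
			right := set &^ left
			c := cost[left] + cost[right] + rows[left]*rows[right]
			if c < cost[set] {
				cost[set] = c
			}
			// The row estimate does not depend on the split in this model.
			rows[set] = rows[left] * rows[right] * selectivity
		}
	}
	return cost[full]
}

func main() {
	// Three hypothetical tables with 10, 1000, and 100 rows.
	fmt.Println(bestPlanCost([]float64{10, 1000, 100}, 0.5))
}
```

Because each subset's best cost is stored once and reused, the search avoids re-costing the same sub-plan for every larger plan that contains it.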

Development

Go's Surprising Memory Allocation Trap: A 30% Regression Story

2025-04-21

A seemingly innocuous refactoring in a Go project led to a 30% performance regression. The culprit was the `GetBytes` method of the `ImmutableValue` struct, which used a value receiver, causing a heap allocation on every call. Heap allocations are significantly more expensive than stack allocations. The root cause was the Go compiler's escape analysis being imprecise; it failed to recognize that the value receiver wouldn't escape. Switching to a pointer receiver fixed the problem. This case highlights the importance of understanding the Go compiler's memory allocation decisions and using appropriate receiver types for high-performance Go code.
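The two receiver styles can be sketched as follows. The `ImmutableValue` fields and the two method names here are hypothetical stand-ins, not the project's actual code, and a program this small will most likely not reproduce the regression, since whether the receiver copy is heap-allocated depends on what escape analysis concludes about the surrounding code.

```go
package main

import "fmt"

// ImmutableValue is a hypothetical stand-in for the struct named in the
// post; the fields are assumptions made for illustration.
type ImmutableValue struct {
	Addr [20]byte
	Buf  []byte
}

// GetBytesByValue uses a value receiver: each call works on a copy of the
// struct. If escape analysis decides the copy may escape, that copy is
// placed on the heap.
func (v ImmutableValue) GetBytesByValue() []byte {
	return v.Buf
}

// GetBytesByPointer uses a pointer receiver: the call receives only the
// address of the existing value, so no copy of the struct is made.
func (v *ImmutableValue) GetBytesByPointer() []byte {
	return v.Buf
}

func main() {
	v := ImmutableValue{Buf: []byte("hello")}
	fmt.Println(string(v.GetBytesByValue()))
	fmt.Println(string(v.GetBytesByPointer()))
}
```

Building with `go build -gcflags=-m` prints the compiler's escape analysis decisions, which is one way to check whether a given receiver is being moved to the heap.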

Development

Cursor: AI Code Editor – Hype vs. Reality

2025-03-29

A Dolt Database developer tested the AI code editor Cursor to see if it lives up to the hype of 10x productivity. Initial attempts using Cursor on a large codebase were underwhelming, with debugging proving cumbersome. However, when creating a new project, Cursor excelled, generating a Factorio mod in a few hours. In a work project, Cursor efficiently generated basic functionality but required significant refactoring. The author concludes Cursor delivered around a 50% productivity boost, far short of the claimed 10x, citing limitations in handling complex code and understanding existing codebases.

Development