Revisited Forth: Two Implementations and Reflections on a Quirky Language

2025-08-28

The author revisits Forth, a language first encountered 20 years ago. Over two months, they implemented two Forth interpreters: goforth (in Go) and ctil (in C). goforth is a pure interpreter: simple, but lacking advanced features. ctil is closer to a traditional Forth implementation and lets the language be extended in Forth itself, which is where Forth's power shows. The author argues that Forth's unique strengths lay in its early hardware context; today its stack-based model makes code harder to read and less practical, so it is best suited as a learning project for understanding compiler principles and virtual machines.


Unification Algorithm: Implementation and Applications

2025-08-18

This post delves into the unification algorithm, a process for automatically solving equations between symbolic terms. It finds extensive use in logic programming and type inference. Starting with pattern matching, the post builds up to the concept of unification, providing a Python implementation based on Norvig's improved algorithm. The implementation includes data structure definitions, the core `unify` function, helper functions `unify_variable` and `occurs_check`, along with detailed code examples and execution results.
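
As a rough illustration of the pieces named above, here is a minimal sketch of unification with an occurs check. The function names `unify`, `unify_variable`, and `occurs_check` come from the summary; the term classes `Var`, `Const`, and `App` and the dict-based substitution are assumptions made for this example and may differ from the post's own data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class App:
    fname: str
    args: tuple


def unify(x, y, subst):
    """Return a substitution (dict: variable name -> term) unifying x and y, or None."""
    if subst is None:
        return None
    if x == y:
        return subst
    if isinstance(x, Var):
        return unify_variable(x, y, subst)
    if isinstance(y, Var):
        return unify_variable(y, x, subst)
    if isinstance(x, App) and isinstance(y, App):
        if x.fname != y.fname or len(x.args) != len(y.args):
            return None
        for a, b in zip(x.args, y.args):
            subst = unify(a, b, subst)  # a failed step propagates None
        return subst
    return None


def unify_variable(v, x, subst):
    """Bind variable v to term x, following bindings that already exist."""
    if v.name in subst:
        return unify(subst[v.name], x, subst)
    if isinstance(x, Var) and x.name in subst:
        return unify(v, subst[x.name], subst)
    if occurs_check(v, x, subst):
        return None  # binding v to a term containing v would be infinite
    return {**subst, v.name: x}


def occurs_check(v, term, subst):
    """True if variable v appears anywhere inside term under subst."""
    if term == v:
        return True
    if isinstance(term, Var) and term.name in subst:
        return occurs_check(v, subst[term.name], subst)
    if isinstance(term, App):
        return any(occurs_check(v, a, subst) for a in term.args)
    return False
```

For example, unifying `f(X, g(Y))` with `f(a, g(X))` binds both `X` and `Y` to the constant `a`, while unifying `X` with `f(X)` fails on the occurs check.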

Development unification

The Elegant Connection Between Polynomial Multiplication, Convolution, and Signal Processing

2025-05-21

This post explores the connection between polynomial multiplication, convolution, and signal processing. It begins by visually explaining polynomial multiplication using tables and diagrams, revealing its fundamental nature as a convolution operation. The post then introduces discrete signals and systems, focusing on linear time-invariant (LTI) systems. It explains that any signal can be decomposed into a sequence of scaled and shifted impulse signals, and the response of an LTI system can be calculated using convolution. Finally, it briefly touches upon the properties of convolution and its relationship to the Fourier transform, highlighting that the Fourier transform of a convolution equals the product of the Fourier transforms of its operands, enabling efficient convolution computation.
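
The claim that polynomial multiplication is a convolution can be checked with a few lines of Python. This is only a sketch: the helper name `poly_mul` and the coefficient-list representation (lowest degree first) are assumptions for the example, not taken from the post.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first.

    Each output coefficient k is the sum of p[i] * q[j] over all i + j == k,
    which is exactly the discrete convolution of the two coefficient sequences.
    """
    if not p or not q:
        return []
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


# (1 + 2x) * (1 + 3x + x^2) = 1 + 5x + 7x^2 + 2x^3
print(poly_mul([1, 2], [1, 3, 1]))  # [1, 5, 7, 2]
```

The double loop costs time proportional to the product of the two lengths; the Fourier-transform property mentioned above is what allows faster algorithms for long sequences.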

Development convolution polynomials

Bloom Filters: A Probabilistic Data Structure for Efficient Set Membership

2025-05-02

Bloom filters are probabilistic data structures that efficiently test whether an element is a member of a set, using minimal space. By hashing elements to multiple locations in a bit array, Bloom filters offer fast membership testing, though with a small chance of false positives. Ideal for scenarios where most queries return negative, Bloom filters significantly speed up lookups. This article details the underlying principles, implementation (with a Go example), and mathematical derivation. A practical example demonstrates optimal parameter calculation for a billion-item set with a 1% false positive rate, highlighting their effectiveness in large-scale data processing.
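
The parameter calculation for the billion-item case can be sketched with the standard sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The function name `bloom_parameters` and the rounding choices are assumptions for this example; the article's own derivation may round differently.

```python
import math


def bloom_parameters(n_items, false_positive_rate):
    """Return (bits, hash_count) for a Bloom filter using the standard formulas."""
    bits = -n_items * math.log(false_positive_rate) / (math.log(2) ** 2)
    hashes = (bits / n_items) * math.log(2)
    return math.ceil(bits), round(hashes)


bits, hashes = bloom_parameters(1_000_000_000, 0.01)
print(f"{bits / 8 / 1e9:.2f} GB of bits, {hashes} hash functions")
```

Under these formulas a billion items at a 1% false positive rate need roughly 9.6 bits per item, or about 1.2 GB in total, with 7 hash functions.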


Efficient Transformers: Sparsely-Gated Mixture of Experts (MoE)

2025-04-20

Feed-forward layers in Transformer models are often massive, creating an efficiency bottleneck. Sparsely-Gated Mixture of Experts (MoE) offers an elegant solution. MoE decomposes the large feed-forward layer into multiple smaller 'expert' networks and uses a router to select the optimal subset of experts for each token's computation, significantly reducing computational cost and improving efficiency. This post details the workings of MoE, provides a NumPy implementation, and discusses key issues like expert load balancing.
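
The routing idea can be sketched in plain Python. This is a simplified illustration, not the post's NumPy code: the names `route` and `moe_layer` are hypothetical, the router scores are passed in directly rather than computed from the token by a learned matrix, and applying softmax over only the selected scores is one of several gating variants.

```python
import math


def route(router_logits, k):
    """Pick the k highest-scoring experts and softmax over just those scores."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(token, router_logits, experts, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    output = [0.0] * len(token)
    for index, weight in route(router_logits, k):
        expert_out = experts[index](token)  # unselected experts are never run
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output


# Four toy "experts" that each scale the token by a different factor.
experts = [lambda t, s=s: [s * v for v in t] for s in (1.0, 2.0, 3.0, 4.0)]
print(moe_layer([1.0, 1.0], [0.0, 5.0, 5.0, 0.0], experts))
```

Only `k` of the experts run per token, which is where the compute savings come from. Nothing in this sketch prevents the router from always picking the same experts, which is the load-balancing issue the post discusses.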

Development Model Efficiency

Cross-Entropy: A Deep Dive into the Loss Function for Classification

2025-04-13

This post provides a clear explanation of cross-entropy's role as a loss function in machine learning classification tasks. Starting with information theory concepts like information content and entropy, it builds up to cross-entropy, comparing it to KL divergence. The article concludes by demonstrating the relationship between cross-entropy and maximum likelihood estimation with numerical examples, clarifying its application in machine learning.
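
The relationship between entropy, cross-entropy, and KL divergence can be verified numerically. The sketch below uses natural logarithms and made-up example distributions; the function names are assumptions for this illustration.

```python
import math


def entropy(p):
    """H(p) = -sum p_i log p_i."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """H(p, q) = -sum p_i log q_i."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """D_KL(p || q) = sum p_i log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.25, 0.25]   # "true" distribution (example values)
q = [0.25, 0.5, 0.25]   # model's predicted distribution (example values)

# Cross-entropy decomposes as entropy plus KL divergence.
print(cross_entropy(p, q), entropy(p) + kl_divergence(p, q))

# With a one-hot label, cross-entropy is just -log of the predicted
# probability of the correct class, i.e. the negative log-likelihood.
print(cross_entropy([0, 1, 0], [0.1, 0.7, 0.2]), -math.log(0.7))
```

The second print shows the link to maximum likelihood: minimizing cross-entropy against one-hot labels is the same as maximizing the log-probability the model assigns to the correct classes.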


Making Miracles with Four 2s: An Elegant Solution to a Math Puzzle

2025-02-23

A math puzzle that sounds simple: using exactly four 2s and any mathematical operations, produce any natural number. Anyone can take part, from elementary-school arithmetic up to university-level mathematics, and the difficulty grows as exponents, factorials, and other operations are introduced. Ultimately the physicist Dirac found a general solution using nested square roots and logarithms, elegantly settling this century-old problem.
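
The core identity behind a nested-roots-and-logarithms solution is that taking the square root of 2 a total of n times gives 2^(2^-n), so applying log base 2 twice and negating recovers n. The sketch below checks that identity numerically. It uses only three 2s; how the post spends the fourth 2 is not stated in this summary, and the function name `dirac` is an assumption for the example.

```python
import math


def dirac(n):
    """Evaluate -log2(log2(sqrt(sqrt(...sqrt(2))))) with n nested square roots."""
    value = 2.0
    for _ in range(n):
        value = math.sqrt(value)          # after n steps: 2 ** (2 ** -n)
    return -math.log2(math.log2(value))  # log2 -> 2 ** -n, log2 again -> -n


for n in range(1, 11):
    print(n, round(dirac(n)))
```

Floating-point precision limits this check to modest n, since the nested root eventually rounds to exactly 1.0; the identity itself holds for every natural number.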


Python's JIT Decorators: Three Implementation Strategies

2025-02-03

This article delves into the popular JIT decorator pattern in Python, particularly its use in the JAX and Triton libraries. The author implements three JIT decorators from scratch using a simplified example: AST-based, bytecode-based, and tracing-based. The AST-based approach directly manipulates the Abstract Syntax Tree; the bytecode-based approach works from the function's compiled Python bytecode; and the tracing-based approach builds an expression IR by tracing function execution at runtime. The article details the advantages and disadvantages of each approach and uses JAX and Numba as examples to illustrate their strategies in real-world applications.
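
The tracing-based strategy can be sketched as follows: call the function once with placeholder objects whose overloaded operators record an expression tree, then evaluate that tree for real arguments. Everything here (`Node`, `evaluate`, `trace_jit`) is a hypothetical illustration rather than the article's code, and it interprets the recorded IR instead of compiling it as a real JIT would.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str          # "arg", "const", "add", or "mul"
    operands: tuple  # child Nodes, or a single payload for "arg"/"const"

    def __add__(self, other):
        return Node("add", (self, _lift(other)))

    def __radd__(self, other):
        return Node("add", (_lift(other), self))

    def __mul__(self, other):
        return Node("mul", (self, _lift(other)))

    def __rmul__(self, other):
        return Node("mul", (_lift(other), self))


def _lift(value):
    """Wrap plain Python numbers as constant nodes."""
    return value if isinstance(value, Node) else Node("const", (value,))


def evaluate(node, args):
    """Interpret the recorded expression tree for concrete arguments."""
    if node.op == "arg":
        return args[node.operands[0]]
    if node.op == "const":
        return node.operands[0]
    left, right = (evaluate(child, args) for child in node.operands)
    return left + right if node.op == "add" else left * right


def trace_jit(fn):
    """Trace fn once per argument count, then reuse the recorded IR."""
    cache = {}

    def wrapper(*args):
        if len(args) not in cache:
            placeholders = [Node("arg", (i,)) for i in range(len(args))]
            cache[len(args)] = _lift(fn(*placeholders))
        return evaluate(cache[len(args)], args)

    return wrapper


@trace_jit
def f(a, b):
    return a * b + 2 * a


print(f(3, 4))  # 18
```

A known limitation of tracing shows up even in this toy: Python control flow that depends on argument values is resolved once at trace time, so only the path taken by the placeholders ends up in the IR.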

Development JIT compilation

Implementing Raft: A Deep Dive into Distributed Consensus

2024-12-21

This is the first post in a series detailing the Raft distributed consensus algorithm and its Go implementation. Raft solves the problem of replicating a deterministic state machine across multiple servers, ensuring service availability even with server failures. The post introduces core Raft components: the state machine, log, consensus module, leader/follower roles, and client interaction. It discusses Raft's fault tolerance, the CAP theorem, and the choice of Go as the implementation language. Subsequent posts will delve into the algorithm's implementation.

Development Distributed Consensus