JVM Optimization: A VarInt Tale of Unexpected Results

2025-07-25

While optimizing Java code in a massive distributed data processing platform, the author identified VarInt encoding as a potential optimization target. He wrote a highly optimized VarInt encoder using SIMD instructions and measured a 4x speedup in benchmarks. In production, however, the optimization yielded no improvement. The culprit? The benchmark used random numbers, while real-world numbers tend to be much smaller, so the worst-case performance the benchmark exercised is irrelevant in practice. The change was ultimately reverted, but the experience served as a valuable proof of concept for developing and productionizing custom JIT optimizations.
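To see why the input distribution matters so much here, consider a minimal sketch of a scalar VarInt encoder. The summary does not say which VarInt format the platform uses, so this assumes the common LEB128-style layout (7 payload bits per byte, high bit set while more bytes follow). The function names are hypothetical, and this is not the author's SIMD encoder.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, least significant group first."""
    if value < 0:
        raise ValueError("this sketch handles only non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # final byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result
        shift += 7
    raise ValueError("truncated varint")


# Small values take a single loop iteration; a 64-bit value takes ten.
small = encode_varint(5)
medium = encode_varint(300)
large = encode_varint(2**63)
```

Under this assumed format, values below 128 fit in one byte, while uniformly random 64-bit values almost always need nine or ten. A benchmark built on random inputs therefore spends nearly all its time in the long path, which small real-world values rarely reach.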

Development Java Optimization

Dynamic Programming: It's Not What You Think

2025-07-21

The term "dynamic programming" often causes confusion in the study of algorithms. It has nothing to do with changeability or with writing code: 'programming' here means planning, a usage that dates from the 1950s, when engineers planned construction projects as 'process scheduling'. In computer science, dynamic programming means planning the order of sub-steps required to solve a problem. For example, when computing the Fibonacci sequence, the 'program' is the sequence of steps that calculates fib(2) through fib(10) in dependency order. This can be planned top-down or bottom-up; the final plan is the same, and both are considered dynamic programming. Richard Bellman coined the term to sidestep a Secretary of Defense's aversion to 'mathematical research', cleverly choosing 'dynamic programming' because the adjective 'dynamic' cannot be used pejoratively.
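The Fibonacci example can be sketched as follows. Both functions record the order in which each fib(k) is actually computed, so the resulting 'plan' can be compared directly. The function names and the `order` list are illustrative assumptions, not taken from the original article.

```python
from functools import lru_cache


def fib_top_down(n: int):
    """Start from fib(n) and recurse, memoizing results."""
    order = []  # records k each time fib(k) is computed for the first time

    @lru_cache(maxsize=None)
    def fib(k: int) -> int:
        if k < 2:
            return k  # base cases fib(0) = 0 and fib(1) = 1
        value = fib(k - 1) + fib(k - 2)
        order.append(k)  # fib(k) is finished only after its dependencies
        return value

    return fib(n), order


def fib_bottom_up(n: int):
    """Start from the base cases and build upward in a loop."""
    order = []
    if n == 0:
        return 0, order
    prev, curr = 0, 1
    for k in range(2, n + 1):
        prev, curr = curr, prev + curr
        order.append(k)
    return curr, order


top_value, top_order = fib_top_down(10)
bottom_value, bottom_order = fib_bottom_up(10)
```

In this sketch both approaches complete fib(2), fib(3), and so on up to fib(10) in the same sequence. The top-down version discovers that order through recursion, while the bottom-up version states it explicitly in the loop.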

Development

Linux Kernel Word Frequency Analyzer

2025-06-16

A website uses a search engine to analyze how often words, names, and functions appear in the Linux kernel source code. Users can enter keywords (wildcards and regular expressions are supported) to view the results. The site also provides interactive charts (JavaScript must be enabled) that visualize the analysis results. This is helpful for researching the Linux kernel or understanding its code structure.
