Category: Development

From SQL Server to Key-Value Stores: A Postmortem of a Database Rewrite Gone Wrong

2025-06-15

A senior developer recounts their experience with a database rewrite at a previous company. The original system, using SQL Server, suffered from performance bottlenecks and frequent outages due to complex stored procedures. The rewrite opted for simple key-value stores, but due to data model mismatch and lack of transaction support, I/O operations skyrocketed, performance worsened, and a complex checkpointing system was introduced. The rewrite ultimately failed to solve the original problems and created new challenges. This humorous account reflects on the importance of technology selection and architectural design, and the negative impact of oversimplification.

Development database rewrite

A Hacker's Guide to Compiling C Programs on macOS

2025-06-14

This post details the author's journey compiling C/C++ programs on macOS, highlighting the challenges and workarounds encountered. From installing compilers and managing dependencies (using tools like `apt-get` and `brew`), to navigating `Makefile`s and `./configure` scripts, the author provides a practical guide. Key takeaways include handling compiler warnings, resolving linker errors using environment variables like `CPPFLAGS` and `LDLIBS`, and selectively compiling targets with `make`. The author also emphasizes the importance of understanding C compilation, even for non-C programmers, due to its prevalence in system tools and libraries. The post concludes with the author's successful contribution of a compiled package to Homebrew, showcasing the practical benefits of mastering the compilation process.

Development

Decoupling Low-Level Programming from Systems Design: Rethinking "Systems Programming"

2025-06-14

This article explores the evolution of the term "systems programming." The author argues that it conflates two distinct ideas: low-level programming (dealing with machine implementation details) and systems design (creating and managing complex interacting components). From the 1970s improvements on assembly to the rise of scripting languages in the 1990s and the performance advancements of today's languages, the boundaries of systems programming have blurred. The author proposes redefining "systems programming" as "low-level programming," leaving systems design as a separate field. He argues that functional programming principles are valuable in systems design and suggests separating low-level programming and systems design instruction in computer science education to foster cross-pollination of ideas.

Development systems design

Rocky Linux 10 Released: Divergence Widens Among RHEL Alternatives

2025-06-14

Rocky Linux 10, "Red Quartz," has reached general availability, adding RISC-V support but dropping older Raspberry Pi models. Compared to AlmaLinux 10 and RHEL 10, released earlier this year, subtle differences emerge in both hardware and software. Most notably, RHEL 10 and Rocky Linux 10 require x86-64-v3 CPUs, while AlmaLinux 10 uniquely supports x86-64-v2. Furthermore, RHEL 10's AI assistant, "Lightspeed," is absent from Rocky Linux 10. While functionally similar, Rocky Linux 10 is subtly diverging from RHEL and the other RHEL alternatives in hardware compatibility, AI features, and commercial support, carving its own niche in the market.

Development

libc-less Programming: Mastering Linux Syscalls with strace

2025-06-14

The author recently embarked on building software without libc to gain a deeper understanding of Linux syscalls and internals. This involved creating a minimal shell, a Snake game, a pure ARM64 assembly HTTP server, and a threads implementation. Debugging heavily relied on strace, and the article details numerous useful strace options and flags. These range from tracing child processes and printing verbose struct information to selectively tracing syscalls and even injecting syscall errors for debugging purposes. This provides valuable insights into advanced Linux system programming and debugging techniques.

Development Syscalls

Argparse's Mutually Exclusive Group Nesting Limitation: A Frustrating Conundrum

2025-06-14

Python's argparse module, while offering convenient features for handling command-line arguments, including mutually exclusive groups, has a frustrating limitation when it comes to nesting. Consider a program with multiple timeout settings where users can either adjust individual timeouts or disable them entirely. Argparse doesn't support nesting a 'no-timeout' option within a group of individual timeout options, making configuration cumbersome. While you can nest a mutually exclusive group inside a regular group, the reverse isn't supported, and the official documentation explicitly states this limitation. This forces developers to manually check if specific switches were used, adding complexity.
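The manual check mentioned above might look like the following minimal sketch. The program name and the three switches (`--connect-timeout`, `--read-timeout`, `--no-timeout`) are hypothetical, chosen only for illustration; the idea is to parse everything as ordinary options and then reject the conflicting combination by hand with `parser.error`.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="fetcher")
    # Hypothetical switches, used only to illustrate the pattern.
    parser.add_argument("--connect-timeout", type=float, default=None)
    parser.add_argument("--read-timeout", type=float, default=None)
    parser.add_argument("--no-timeout", action="store_true")
    return parser


def parse(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    individual = [args.connect_timeout, args.read_timeout]
    # Manual check standing in for the unsupported nested group:
    # --no-timeout conflicts with any individual timeout switch.
    if args.no_timeout and any(value is not None for value in individual):
        parser.error("--no-timeout cannot be combined with individual timeout options")
    return args
```

This sketch relies on `None` defaults to detect whether a switch was given, so any real default values would have to be applied after the check.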

Development

Lisp Truth Oracle: A Curious Tale of Type Theory, Curry-Howard Isomorphism, and call/cc

2025-06-14

This post attempts to write a "truth oracle" in Lisp—a program that determines the truth or falsehood of arbitrary mathematical statements. The author introduces the Curry-Howard isomorphism, explaining how logical proofs correspond to expressions in typed functional programming. Using Racket's call/cc function (isomorphic to Peirce's law), an attempt is made to implement a program isomorphic to the law of the excluded middle. Unexpectedly, the oracle always returns false until attempting to access an impossible type value, revealing the differences between classical and constructive logic, and the non-standard control flow of call/cc. Finally, the author uses a metaphor of a "devil's bargain" to explain this strange behavior, showcasing the time-travel-like mechanism behind call/cc.

Development type theory

Automating Daily Weather Text Messages

2025-06-14

Tired of opening the weather app every morning? The author explored two methods. First, a Zapier automation sent a daily weather text message around 7 AM. Because that setup offered little customizability and relied on a third party, he then built a more flexible system using TypeScript, Twilio, and GitHub Actions. The Open-Meteo API provides weather data, Twilio sends SMS messages, and GitHub Actions triggers the task at 6:45 AM daily (accounting for timezones). While the custom summary is less detailed than Zapier's, he gained control and cost-effectiveness, and he plans to improve the summary's detail.

Development weather

arXivLabs: Community Collaboration on arXiv Features

2025-06-14

arXivLabs is a framework enabling collaborators to develop and share new arXiv features directly on the website. Individuals and organizations participating share arXiv's values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners who adhere to them. Got an idea for a valuable project for the arXiv community? Learn more about arXivLabs.

Development

Netflix's Unified Data Architecture: Model Once, Represent Everywhere

2025-06-14

Netflix's exploding content offerings — films, series, games, live events, ads — have created a complex web of supporting systems. To tackle duplicated models, inconsistent terminology, and data quality issues, Netflix built the Unified Data Architecture (UDA). UDA is a knowledge graph enabling teams to define models once and reuse them consistently across systems. Leveraging an internal metamodel called Upper, UDA translates domain models into various technical data structures (GraphQL, Avro, SQL, etc.), automating data movement and transformation between containers. This boosts efficiency and data consistency. Two production systems, Primary Data Management (PDM) and Sphere, showcase UDA's power, handling authoritative reference data and self-service operational reporting respectively.

Development Data Architecture

From Quant to BCI: A 2025 Self-Learning Roadmap

2025-06-14

A seasoned engineer with a background in quantitative finance and software development is transitioning into the exciting field of brain-computer interfaces (BCIs). He's embarked on a 12-24 month self-learning journey, structured around three phases: foundational hardware (building a digital clock, amplifying bioelectric signals), intermediate systems (analog/digital radio, FPGA-based signal processing), and advanced topics (closed-loop neural stimulation, wireless data transfer). This ambitious plan combines self-study, hands-on projects, and community engagement, aiming to eventually secure a role in academia, a startup, or industry within the BCI space.

Development Neurotech Self-Learning

Recent Advances in Mixed-Integer Linear Programming (MILP)

2025-06-14

Mixed-integer linear programming (MILP) has become a cornerstone of operations research, thanks to the enhanced efficiency of modern solvers. These solvers can now find globally optimal solutions in seconds for problems that were intractable a decade ago. This versatility has led to successful applications across transportation, logistics, supply chain management, revenue management, finance, telecommunications, and manufacturing. Despite this success, many challenges remain, and MILP is a vibrant area of ongoing research. This article surveys the most significant advancements in MILP solution methods, focusing on computational aspects and recent practical performance improvements, emphasizing studies with computational experiments. The survey is structured around branch-and-cut methods, Dantzig-Wolfe decomposition, and Benders decomposition, concluding with a discussion of ongoing challenges and future directions.

Development Operations Research

Crafting the Worst Possible Python Code: A How-To Guide

2025-06-14

This humorous guide teaches you how to write the most incomprehensible and frustrating Python code imaginable. Through a series of negative examples, such as using cryptic variable names (like `data1`, `temp`) and complex nested loops, the author demonstrates how to create truly terrible code. The ultimate goal is to highlight the importance of writing clean, understandable code and avoiding the creation of unmaintainable technical debt.

Development

Linux Kernel 6.16 Patches Core Dump Vulnerabilities: Saying Goodbye to a 'Stupid' API

2025-06-14

The Linux kernel 6.16 release significantly improves core dump handling, addressing long-standing security vulnerabilities. Previous API designs had flaws, such as core dump handlers running with root privileges, making them attractive attack targets, and race conditions leading to vulnerabilities. The new improvements introduce pidfd to ensure handlers operate on the correct crashed process and allow handlers to bind to a socket for receiving core dumps, reducing privilege escalation risks and effectively preventing attacks.

Development core dump

Volumetric Lighting in React Three Fiber: Raymarching with Post-Processing

2025-06-14

This article delves into creating realistic volumetric lighting effects in React Three Fiber by combining post-processing and volumetric raymarching. The author meticulously explains coordinate system transformations, reconstructing 3D rays from screen space, and utilizing depth buffers for performance optimization. Advanced techniques like light shaping using SDFs, shadow mapping, and light scattering are covered, culminating in a dynamic volumetric lighting effect with shadows and fog. Multiple demos showcase the technique in archways and space scenes, while also exploring multi-light sources and omnidirectional shadowing.

Green Tea GC: A Memory-Aware Approach to Boosting Go's Performance

2025-06-14

The Go team is developing Green Tea, an experimental garbage collector designed to address performance bottlenecks of traditional garbage collection algorithms in multi-core systems and non-uniform memory architectures. Green Tea improves spatial and temporal locality by scanning contiguous memory blocks instead of individual objects, significantly reducing garbage collection CPU overhead. Initial evaluations show a 10-50% reduction in GC CPU costs on some GC-heavy workloads. Future work includes exploring SIMD acceleration and a concentrator network for further performance gains.

Development

Claude-Powered WordPress Blogging: A Custom MCP Server

2025-06-14

In three days, the author built a custom Model Context Protocol (MCP) server connecting Claude directly to their WordPress blog. This server handles the complexities of the WordPress REST API, enabling Claude to create well-formatted HTML blog posts, automatically manage categories and tags, and even retrieve blog information. The author considers this a significant leap forward in AI-assisted content creation while maintaining editorial control.

Development

arXivLabs: Experimenting with Community Collaboration

2025-06-14

arXivLabs is a framework for collaborators to develop and share new arXiv features directly on the website. Individuals and organizations involved share arXiv's values of openness, community, excellence, and user data privacy. arXiv only works with partners who uphold these values. Have an idea to enhance the arXiv community? Learn more about arXivLabs.

Development

FileDB: A Zig Implementation of a Bitcask-Inspired Key-Value Store

2025-06-14

FileDB is a Zig implementation of a key-value store inspired by Riak's Bitcask paper. It uses a log-structured hash table for metadata and appends records to disk files for high throughput. Periodic compaction and syncing ensure data durability. Benchmark tests of its Redis-compatible client show read speeds exceeding 100,000 requests per second and impressive write performance.
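The Bitcask-style idea described above (append every record to a data file and keep an in-memory table pointing at the latest value for each key) can be sketched roughly as follows. This is an illustration in Python, not FileDB's Zig code; the class name `TinyLogStore` and the record layout (two little-endian 32-bit lengths, then key, then value) are assumptions, and checksums, tombstones, compaction, and fsync are all left out.

```python
import os
import struct
import tempfile

# Hypothetical record header: key length, value length.
HEADER = struct.Struct("<II")


class TinyLogStore:
    """Append-only data file plus an in-memory key directory."""

    def __init__(self, path):
        self.keydir = {}  # key -> (offset of value, length of value)
        # "a+b" creates the file if missing; writes always go to the end.
        self.file = open(path, "a+b")
        self._rebuild_keydir()

    def _rebuild_keydir(self):
        # Replay the log from the start; later records win over earlier ones.
        self.file.seek(0)
        offset = 0
        while True:
            header = self.file.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            key_len, value_len = HEADER.unpack(header)
            key = self.file.read(key_len)
            value_offset = offset + HEADER.size + key_len
            self.keydir[key] = (value_offset, value_len)
            self.file.seek(value_len, os.SEEK_CUR)  # skip the value bytes
            offset = value_offset + value_len

    def put(self, key, value):
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(HEADER.pack(len(key), len(value)) + key + value)
        self.file.flush()
        self.keydir[key] = (offset + HEADER.size + len(key), len(value))

    def get(self, key):
        entry = self.keydir.get(key)
        if entry is None:
            return None
        value_offset, value_len = entry
        self.file.seek(value_offset)
        return self.file.read(value_len)

    def close(self):
        self.file.close()
```

A read costs one dictionary lookup and one seek, while an overwrite simply appends a newer record; the stale record stays on disk until some compaction step, which this sketch omits, rewrites the file.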

Development key-value database

sandboxfs: A Failed Attempt to Speed Up Bazel's macOS Sandboxing

2025-06-13

A Google engineer attempted to improve Bazel's sandboxing performance on macOS with the sandboxfs project. sandboxfs used a user-space file system to create virtual file hierarchies more efficiently, replacing Bazel's original symlink approach. However, symlink performance on macOS turned out not to be the main bottleneck, and together with implementation issues and changes in the macOS ecosystem, this led to sandboxfs eventually being abandoned. Despite this, the author believes its core idea, efficient sandbox creation, still holds promise for solving Bazel's sandboxing performance problems on macOS.

Development

Implementing Datalog in Python: A Relational Database Language More Powerful Than SQL

2025-06-13

This article demonstrates how to implement Datalog, a relational database language more powerful than SQL, using Python. Datalog, a subset of Prolog, isn't Turing-complete but excels at modeling relationships. The article thoroughly explains Datalog's core concepts, including predicates, facts, rules, and variables, and provides a straightforward Python implementation featuring the Naïve Evaluation algorithm. With this implementation, you can create and query Datalog programs, experiencing the elegance and power of this relational modeling approach.
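To make naive evaluation concrete, here is a minimal sketch that is not taken from the article. It assumes a hypothetical encoding in which facts and atoms are tuples such as `("parent", "alice", "bob")`, variables are strings starting with an uppercase letter, and a rule is a `(head, body)` pair. Every rule is applied to the whole fact set repeatedly until a round derives nothing new.

```python
def is_variable(term):
    # Assumed convention: variables start with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def unify(atom, fact, bindings):
    """Match one body atom against one fact; return extended bindings or None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    result = dict(bindings)
    for term, value in zip(atom[1:], fact[1:]):
        if is_variable(term):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def match_body(body, facts, bindings):
    """Yield every set of bindings that satisfies all atoms in a rule body."""
    if not body:
        yield bindings
        return
    first, rest = body[0], body[1:]
    for fact in facts:
        extended = unify(first, fact, bindings)
        if extended is not None:
            yield from match_body(rest, facts, extended)


def naive_evaluate(facts, rules):
    """Apply every rule to all known facts until no new fact appears."""
    known = set(facts)
    while True:
        derived = set()
        for head, body in rules:
            for bindings in match_body(body, known, {}):
                derived.add(tuple(bindings.get(term, term) for term in head))
        if derived <= known:
            return known
        known |= derived


facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}
rules = [
    (("ancestor", "X", "Y"), [("parent", "X", "Y")]),
    (("ancestor", "X", "Z"), [("parent", "X", "Y"), ("ancestor", "Y", "Z")]),
]
closure = naive_evaluate(facts, rules)
```

Each round recomputes everything from scratch, which is why the approach is called naive; it stops because the set of derivable facts is finite.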

Development

MUMPS: The Unsung Hero of Healthcare Databases

2025-06-13

MUMPS, a programming language born in the 1960s, was initially developed to manage patient medical records at Massachusetts General Hospital. Its unique integrated database capabilities have made it the dominant database for health information systems and electronic health records in the US, serving over 78% of patients. The history of MUMPS is a story of innovation and adaptation, from its early versions on the PDP-7 to today's open-source implementations and commercial products. It has witnessed the rapid evolution of computing technology and continues to provide critical support for the healthcare industry.

Development healthcare IT

arXivLabs: Experimental Projects with Community Collaboration

2025-06-13

arXivLabs is a framework for collaborators to develop and share new arXiv features directly on the website. Individuals and organizations working with arXivLabs embrace our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners who adhere to them. Have an idea to improve the arXiv community? Learn more about arXivLabs.

Development

Bloxi: An AI Copilot for Simulink

2025-06-13

A second-year aero-engineering student at Imperial College London built Bloxi, an AI copilot that translates plain-English prompts into working Simulink control-system models. Leveraging multimodal LLMs, Bloxi builds models step-by-step, allowing for real-time debugging and a more intuitive workflow. The student, who shares his work to help other engineers be more productive, released the code in the hope that others will improve on it.

Development Model Building

The Surprisingly Fast Way to Find Vowels in Strings

2025-06-13

This article benchmarks eleven different methods for detecting vowels in strings, from simple loops to regular expressions and even a prime number-based approach. Surprisingly, regular expressions consistently outperform other methods, even simple loops, across various string lengths. A deep dive into Python bytecode and the CPython regex engine reveals the reason for regex's speed. The author concludes that while regex is fastest for most cases, simpler methods suffice unless dealing with millions of strings.
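For a sense of what is being compared, here are two of the simplest contenders written out as a sketch. The function names and the `time_both` helper are illustrative rather than the article's own code, and any timings it reports depend on the machine, the Python version, and the input.

```python
import re
import timeit

VOWELS = "aeiouAEIOU"
VOWEL_PATTERN = re.compile(r"[aeiouAEIOU]")


def has_vowel_loop(text):
    # Plain Python loop: stop at the first vowel found.
    for char in text:
        if char in VOWELS:
            return True
    return False


def has_vowel_regex(text):
    # Precompiled character class; the scan runs inside the regex engine.
    return VOWEL_PATTERN.search(text) is not None


def time_both(text, repeats=1000):
    """Return the seconds each approach takes for `repeats` calls."""
    return {
        "loop": timeit.timeit(lambda: has_vowel_loop(text), number=repeats),
        "regex": timeit.timeit(lambda: has_vowel_regex(text), number=repeats),
    }
```

Calling `time_both("bcdfg" * 200 + "a")` exercises the case where the only vowel sits at the very end of a long string.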

Development string processing

Escaping the Software Goliaths: Towards Freer and Safer Computing

2025-06-13

Frustrated with the expense, unreliability, and slowness of modern software, the author proposes an alternative: favor software with fewer users, infrequent updates, easy modification, and a thriving fork culture. Using his own journey with Lua and the LÖVE game engine as a case study, he details how to build a small, self-sufficient software ecosystem. He encourages readers to fork and modify existing software to meet their needs, ultimately achieving a more free and secure computing experience. This approach champions simplicity and practicality, challenging the drawbacks of traditional software development.

Development

Beyond Hindley-Milner: A Tutorial on the Cubiml Compiler with Algebraic Subtyping

2025-06-13

This blog post series introduces Cubiml, a compiler tutorial built around a novel type inference system called "cubic biunification," an improvement on Algebraic Subtyping. It addresses the Hindley-Milner system's lack of subtyping support, providing more powerful and intuitive type inference. The tutorial walks through the implementation of Cubiml with detailed code examples, covering booleans, conditionals, records, functions, let bindings, recursive let bindings, mutual recursion, and case type matching. The ultimate goal is a compiler that type-checks programs without requiring manual type annotations.

Development

arXivLabs: Experimenting with Community Collaboration

2025-06-13

arXivLabs is a framework enabling collaborators to develop and share new arXiv features directly on the website. Individuals and organizations involved share arXiv's values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners adhering to them. Got an idea to enhance the arXiv community? Explore arXivLabs!

Development

Tattoy: Level Up Your Terminal with GPU-Powered Effects

2025-06-13

Tattoy is a terminal enhancement framework that renders graphics using UTF-8 half-blocks, supporting GPU shaders and ShaderToy shaders, and provides a live-updating minimap of the terminal scrollback. It automatically adjusts text contrast, is compatible with existing shells and themes, and allows running commands in the background, such as audio visualizations or system monitors. Plus, Tattoy features a plugin system enabling developers to extend functionality using any language.

OxCaml: Supercharging OCaml for Performance

2025-06-13

OxCaml is a high-performance extension to the OCaml programming language developed by Jane Street. Serving as both their production compiler and an experimental platform, OxCaml aims to improve OCaml's suitability for performance-oriented programming. It offers safe, convenient, and predictable control over performance-critical aspects, focusing on fearless concurrency, memory layout control, and allocation management. While aiming for eventual upstream contribution, some OxCaml extensions are currently non-portable, resulting in libraries exclusive to OxCaml. Open-source and actively seeking experimental users, OxCaml enhances OCaml with quality-of-life improvements like polymorphic parameters and immutable arrays.

Development