Category: Development

Pluto: A Supercharged Lua Dialect

2025-07-01

Pluto is a powerful dialect of Lua designed for general-purpose programming. It boasts accelerated development through an enhanced standard library and new syntax features like switch statements, compound operators, and ternary expressions. While largely compatible with Lua 5.4, a compatibility mode resolves potential conflicts from new keywords. Pluto executes Lua bytecode and most Pluto features generate Lua-compatible bytecode. Comprehensive documentation, tooling, and details on improvements are available on its open-source website. Try it out in the interactive browser playground or download pre-built binaries.

Development

Elevating Rust CLIs: Type-Driven Design for Robustness and Maintainability

2025-07-01

This article champions a type-driven approach to building command-line interfaces (CLIs) in Rust using the clap crate. Instead of relying on string parsing, the author advocates for defining the CLI interface using Rust's type system. This offers several key advantages: improved code maintainability and readability, reduced test surface area and better mock support for unit tests, and easier semantic versioning. The article details clap's derive and env features, showcasing how to define command-line arguments and environment variables using types, resulting in more robust and maintainable CLIs.

Development

Claude Code Hooks: Extending Functionality with User-Defined Shell Commands

2025-07-01

Claude Code introduces hooks, user-defined shell commands that execute at various stages of its lifecycle. This allows for deterministic control over Claude Code's behavior, ensuring actions like automatic code formatting, logging, and custom permission checks always occur. Hooks transform suggestions into reliable application-level code, enhancing functionality and simplifying workflows. While powerful, users must prioritize security and ensure commands are safe and reliable.

Development Hooks Shell Commands

Qualcomm Open-Sources EUD: In-Circuit Debugging Over USB

2025-07-01

Qualcomm quietly released the source code for its Embedded USB Debug (EUD) interface, enabling developers to perform SWD debugging directly over USB without external JTAG tools. EUD, integrated into nearly every Qualcomm SoC since ~2018, provides debugging access to CPUs and Hexagon co-processors. While the initial open-source code had some compilation issues, the community quickly addressed them. Currently supporting chips like Snapdragon 845, 855, and 865, it simplifies debugging U-Boot and the secure world, but kernel debugging support is limited, and SMP support is incomplete.

Development

arXivLabs: Community Collaboration on arXiv Features

2025-07-01

arXivLabs is a framework enabling collaborative development and sharing of new arXiv features directly on the website. Participants must embrace arXiv's values of openness, community, excellence, and user data privacy. Got an idea to enhance the arXiv community? Learn more about arXivLabs.

Development

Nimtable: The Control Plane for Apache Iceberg™

2025-07-01

Nimtable is a lightweight, user-friendly platform for monitoring, optimizing, and governing your Iceberg-based lakehouse. Its web-based interface simplifies browsing tables, running queries, analyzing file distributions, and optimizing storage layouts. Supporting multiple catalogs (REST Catalog, AWS Glue, AWS S3 Tables, and PostgreSQL) and seamless integration with object stores like S3, Nimtable offers interactive querying, AI assistance (including AI-generated table summaries and intelligent suggestions), file distribution analysis, and table optimization features (such as file compaction and snapshot expiration management).

Development Data Management

Type-Safe Generics in C: A Clever Use of Unions

2025-07-01

This article presents a technique for implementing type-safe generic data structures in C using unions to associate type information with a generic data structure. The author illustrates the approach with a linked list, showing how macros and unions enable compile-time type checking, avoiding the type-unsafety and code bloat of traditional generic methods. Comparisons are made with `void*` and flexible array member approaches, culminating in a solution that provides compile-time type safety, resulting in compiler errors when incorrect types are added.

Development

AI-Assisted LLVM Compiler Optimization: An ASN.1 Serialization Tale

2025-07-01

While maintaining a Rust library for ASN.1 DER serialization, the author discovered inefficient code in integer length calculation. He experimented with Claude AI to optimize the code and used the Alive2 formal verification tool to validate the results. Surprisingly, Claude AI even helped generate a patch for an LLVM compiler optimization, which passed code review and was ultimately submitted to the LLVM project. This demonstrates the immense potential of AI in software development, particularly in compiler optimization, while also highlighting the importance of manual review when using AI tools.
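The summary does not reproduce the code in question, so the following is only a rough illustration of what a DER integer length calculation involves, written in Python rather than Rust. It assumes non-negative integers; the function names are hypothetical and not taken from the library. A byte-by-byte loop is compared with a closed form based on the bit length.

```python
def der_int_len_loop(n: int) -> int:
    """Content length in bytes of a DER INTEGER for non-negative n (loop form)."""
    length = 1
    # A value fits in the current byte count only if its top bit is clear,
    # because DER integers are two's complement and a set top bit means negative.
    while n > 0x7F:
        n >>= 8
        length += 1
    return length


def der_int_len_closed(n: int) -> int:
    """Same result without a loop: one extra bit is reserved for the sign."""
    return n.bit_length() // 8 + 1


for value in (0, 127, 128, 255, 256, 32767, 32768):
    print(value, der_int_len_loop(value), der_int_len_closed(value))
```

The closed form is the kind of rewrite an optimizing compiler could in principle derive from the loop, which is the sort of transformation a tool like Alive2 can check for equivalence.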

Development

Reverse Engineering Vercel's BotID: A Surprisingly Basic Bot Protection System?

2025-06-30

This post delves into Vercel's newly released BotID anti-bot system, focusing on its free Basic mode. The author reveals that the Basic mode's current detection mechanisms are surprisingly rudimentary and easily bypassed by manipulating browser properties. While BotID collects various signals including browser fingerprints and GPU information, its processing of these signals is basic, failing to effectively identify sophisticated bots. The author speculates that Vercel is using Basic mode to quietly gather data for training future, more robust anti-bot models. The paid Deep Analysis mode, utilizing Kasada's anti-bot scripts, is significantly more complex than Basic mode.

Development

TokenDagger: A Blazing Fast TikToken Implementation

2025-06-30

TokenDagger offers a high-performance alternative to OpenAI's TikToken, optimized for large-scale text processing. Benchmarks show TokenDagger achieving over 4x speedup on code tokenization and a 2x throughput increase compared to TikToken. Leveraging an optimized PCRE2 regex engine and a simplified BPE algorithm to mitigate the performance impact of large special token vocabularies, TokenDagger provides a drop-in replacement. Installation and performance testing are straightforward with a few simple commands.

Development

Ensō (Occult Vampire Keanu) Public Beta Released

2025-06-30

The new Ensō version, codenamed "Occult Vampire Keanu," is now available for public testing! This release focuses on a simplified UI, improved accessibility, and enhanced privacy. New features include a "Coffeeshop Mode" to conceal text, multiple accessibility-focused themes, and a refined text rendering engine. Future updates will include RTL support and more, but this version significantly improves the user experience.

Development UI update

C Pointer Aliasing and Compiler Optimization: A Game of Source Code Safety

2025-06-30

This article examines how pointer aliasing affects program optimization in C. Pointer aliasing occurs when two pointers refer to the same memory object. During optimization, compilers perform alias analysis to determine whether two pointers may alias; a wrong answer can produce incorrect programs, while an overly cautious one costs performance. The article uses a reciprocal calculation example to show that when two pointers may alias, the compiler cannot perform certain optimizations, because doing so might change the program's behavior. The author also discusses mechanisms in C that aid alias analysis, such as the restrict pointer qualifier and the volatile qualifier, along with advanced techniques like type-based and flow-based alias analysis. Finally, the author proposes a novel pointer aliasing analysis model that considers the pointer's lifetime and information flow, aiming to improve compiler optimization efficiency and program safety.

Development Pointer Aliasing

Modeling API Rate Limits as Diophantine Inequalities

2025-06-30

This article explores a mathematical approach, specifically using Diophantine inequalities, to solve API rate limiting problems. The author uses a scenario with a 10-requests-per-hour limit and three retry attempts per task as an example, demonstrating how to transform the task scheduling problem into an integer feasibility problem. By analyzing the task retry pattern and time windows, the author establishes an inequality model and uses Go to write a program that determines whether a new task can be safely scheduled without exceeding the rate limit. The article also mentions optimizing the algorithm to reduce time complexity from O(n^2) to O(n*log(n)).
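The feasibility check described above can be sketched roughly as follows. This is in Python rather than the article's Go, and it is not the author's model: the retry offsets (0, 10, and 30 minutes after the start) are hypothetical values chosen for illustration, times are whole minutes, and the check pessimistically assumes every task uses all three attempts.

```python
LIMIT = 10                   # requests allowed per window (from the summary)
WINDOW = 60                  # window length in minutes
RETRY_OFFSETS = (0, 10, 30)  # hypothetical: minutes after start for each attempt


def request_times(starts):
    """All worst-case request times, sorted, for tasks starting at `starts`."""
    return sorted(s + off for s in starts for off in RETRY_OFFSETS)


def can_schedule(existing_starts, new_start):
    """True if adding a task at `new_start` keeps every 60-minute window within LIMIT."""
    times = request_times(list(existing_starts) + [new_start])
    left = 0
    for right, t in enumerate(times):
        # Shrink the window until it spans less than WINDOW minutes.
        while t - times[left] >= WINDOW:
            left += 1
        if right - left + 1 > LIMIT:
            return False
    return True


print(can_schedule([0, 1, 2], 3))   # a fourth task would put 12 requests in one hour
print(can_schedule([0, 1, 2], 60))  # starting an hour later spreads the load
```

Sorting dominates the cost here, so this sketch runs in O(n log n) in the number of requests; the sliding-window pass itself is linear.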

Cross-Compiling Raylib Lisp Bindings and Games for Windows from Linux

2025-06-30

This article details the process of cross-compiling C code and an SBCL Lisp program for Windows from Linux, using Wine to run a Windows SBCL within a Linux-based Emacs, and loading .dll files into the Lisp image to produce a .exe executable. The author outlines cross-compiling C code using mingw-w64-toolchain, configuring the Raylib library for cross-compilation to generate .dll files, installing and using SBCL within Wine, leveraging vend for dependency management, and finally using sb-ext:save-lisp-and-die to create the Windows executable.

Development

arXivLabs: Experimenting with Community Collaboration

2025-06-30

arXivLabs is a framework enabling collaborators to develop and share new arXiv features directly on the website. Individuals and organizations involved share arXiv's values of openness, community, excellence, and user data privacy. arXiv is committed to these principles and only partners with those who adhere to them. Have an idea to improve the arXiv community? Learn more about arXivLabs.

Development

Rust Error Handling: Evolving from Monolithic Enums to Elegant Error Sets

2025-06-30

Rust's error handling has been a point of contention. The traditional approach of defining massive error enums per module or crate leads to bloated and hard-to-maintain code. This article explores alternatives: representing individual errors with structs and managing error sets using tools like the `error_set` crate. `error_set` simplifies error enum definition and conversion via macros, supporting composition and subset relationships between error sets for cleaner, more efficient error handling. While extra work is still needed for complex errors requiring additional information, `error_set` provides a more elegant and maintainable approach to Rust error handling.

Development

Blazing Fast In-Process Event Dispatcher for Go

2025-06-30

This Go package delivers a high-performance, in-process event dispatcher for decoupling modules and enabling asynchronous event handling. It claims speeds 4-10x faster than channels (processing millions of events per second) and supports both synchronous and asynchronous operation with a focus on simplicity. It is well suited to intra-process module decoupling, lightweight pub/sub, and high-throughput scenarios, but not to inter-process communication, event persistence, or advanced routing.

Development Event Dispatcher

Scaling Customer Container Builds with the Depot API

2025-06-30

Many SaaS platforms need to run code on behalf of their customers, which makes container building a challenge. This post demonstrates building tools with the Depot API to create isolated build environments for a multi-tenant SaaS platform. Using a Go client, you can create projects, manage project caches, and retrieve build metrics and logs. The Depot API leverages Buf.build, offering client libraries for various languages, which eases integration into existing infrastructure. The article details creating, deleting, and resetting project caches, and fetching build metrics and step details, ultimately enabling scalable and secure customer container infrastructure.

Development container builds

Python Dataclasses: `kw_only=True` for Maintainability and Extensibility

2025-06-30

Python's dataclasses offer a convenient way to define data-holding classes, but the generated `__init__` method accepts fields as positional arguments by default, which can make classes hard to maintain and extend. This article introduces the `kw_only=True` parameter, which makes every field keyword-only, preventing bugs caused by changes in field order and allowing subclasses to add required fields freely. Because this parameter was introduced in Python 3.10, the article also provides a solution for compatibility with older versions.

Development

Knuth's 'Premature Optimization is the Root of All Evil' Misunderstood?

2025-06-30

This article examines what Donald Knuth actually meant by his famous quote, "Premature optimization is the root of all evil." By analyzing examples from Knuth's paper on using goto statements and implementing multisets, the author shows that the quote does not discourage small optimizations across the board. Experiments comparing different implementations reveal that even minor optimizations (like loop unrolling) can yield significant performance gains in critical code and frequently used library functions, provided benchmarks confirm the gain. The author ultimately advocates using well-optimized standard library functions to avoid unnecessary optimization effort and to benefit from modern compiler optimizations.

Development

The Book of Shaders: A Gentle Introduction to Fragment Shaders

2025-06-30

The Book of Shaders, authored by Patricio Gonzalez Vivo and Jen Lowe, is a step-by-step guide to understanding fragment shaders that gently leads the reader through this abstract and complex topic. The book also includes author bios and acknowledges the many contributors and translators who made versions in multiple languages possible.

Development

Bypassing Malware VM Detection: Spoofing a CPU Fan via Custom SMBIOS

2025-06-30

Malware often checks for the absence of hardware components typically not emulated in virtual machines (like a CPU fan) to evade analysis. This post details how to bypass this detection by modifying the virtual machine's SMBIOS data to spoof a CPU fan. The author thoroughly explains the steps for Xen and QEMU/KVM environments, including obtaining SMBIOS data, creating a custom SMBIOS file, and configuring the VM. The post also highlights the need to additionally handle SMBIOS Type 28 (temperature probe) data in Xen for successful WMI deception.

Development

NativeJIT: A High-Performance JIT Compiler for Bing

2025-06-30

NativeJIT is an open-source, cross-platform library for high-performance just-in-time compilation of expressions involving C data structures. Developed by the Bing team for use in the Bing search engine, it's crucial for scoring documents based on keyword matches and user intent. Lightweight and fast, it relies only on the standard C++ runtime and runs on Linux, OSX, and Windows. Its optimized code, particularly its register allocation, enables efficient processing of large-scale queries.

Development

Budget Ampere Altra Dev Machine Build

2025-06-30

Needing a development machine with 64k page size support, the author built a system based on Ampere Altra. He chose an AsrockRack ALTRA8BUD-1L2T motherboard, a used Q80-30 processor (80 cores, 3.0 GHz), an Arctic Freezer 4U-M cooler, and eight 16GB SK Hynix HMA82GR7CJR8N-XN RAM sticks. After some troubleshooting, the system booted successfully. He also selected a suitable case and power supply, adding NVME storage and a graphics card. The total cost was around €1800, slightly over budget. Future plans include installing Fedora 42, creating RHEL and CentOS Stream VMs, experimenting with different GPUs, and desktop usage.

Development Development Machine

LLVM-MCA Performance Analysis: Pitfalls of Vectorization Optimization

2025-06-29

The author encountered a performance degradation issue when vectorizing code using ARM NEON. The initial code used five load instructions (5L), while the optimized version used two loads and three extensions (2L3E) to reduce memory accesses. Surprisingly, the 2L3E version was slower. Using LLVM-MCA for performance analysis revealed that 2L3E caused bottlenecks in CPU execution units, unbalanced resource utilization, and stronger instruction dependencies, leading to performance regression. The 5L version performed better due to its more balanced resource usage and independent load instructions. This case study highlights how seemingly sound optimizations can result in performance degradation if CPU resource contention and instruction dependencies aren't considered; LLVM-MCA proves a valuable tool for analyzing such issues.

Development

Bloom Filters: A Probabilistic Data Structure for Efficient Set Membership

2025-06-29

Bloom filters are probabilistic data structures designed for rapid and memory-efficient set membership testing. They use multiple hash functions to map elements to bits in a bit vector. If all corresponding bits are 1, the element *may* be present; otherwise, it's definitely absent. While prone to false positives, their speed and space efficiency make them ideal for large datasets. This article details Bloom filter principles, hash function selection, sizing, applications, and implementation examples across various systems.
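The mechanism described above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the class name, the sizes, and the choice of deriving the k bit positions from a single SHA-256 digest via double hashing are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # bit vector, all zeros

    def _positions(self, item: str):
        # Assumption: simulate k hash functions with double hashing,
        # position_i = (h1 + i * h2) mod m, using two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so it is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bf = BloomFilter(num_bits=1024, num_hashes=3)
print(bf.might_contain("apple"))  # nothing added yet, so no bit is set
bf.add("apple")
print(bf.might_contain("apple"))  # an added item is never reported absent
```

The method is named `might_contain` rather than `contains` to reflect that a positive answer can be a false positive, while a negative answer is always reliable.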

Development

Octelium: A Revolutionary Zero Trust Access Platform

2025-06-29

Octelium is a free and open-source, self-hosted, unified platform for zero trust resource access, designed as a modern alternative to VPNs and similar tools. It's incredibly versatile, functioning as a zero-config VPN, ZTNA platform, secure tunnel infrastructure, API gateway, AI gateway, PaaS for secure and anonymous containerized application hosting, Kubernetes gateway, and even a homelab infrastructure. Octelium offers a scalable zero trust architecture (ZTA) for identity-based, application-layer (L7) aware, secret-less secure access via WireGuard/QUIC tunnels and public clientless access.

Development VPN alternative

The Hidden Copyright War Behind Windows 95's Plug and Play

2025-06-29

Implementing Plug and Play in Windows 95 wasn't easy. To make older hardware work with the new feature, engineers employed ingenious workarounds. One amusing example involved manufacturers adding the string "Not Copyright Fabrikam Computer" to their BIOS. This was a clever trick to fool LitWare Word Processor's licensing check, unlocking the full version without actually being a licensed Fabrikam PC. This highlights the challenges of early PC compatibility and the lengths manufacturers went to for software licensing.

Development Plug and Play

IPv4 Down? Linux, WireGuard, and Hetzner Saved My Internet!

2025-06-29

A power outage knocked out my IPv4 internet connectivity, leaving only IPv6, and many websites became inaccessible. I fixed this with a Hetzner VPS, WireGuard, and Linux network namespaces. By setting up a WireGuard server on the VPS, I tunneled IPv4 traffic over my working IPv6 connection to restore IPv4 access. Network namespaces allowed me to run my work VPN and Docker without interfering with WireGuard. I also solved WireGuard MTU issues. The whole process highlighted the flexibility and problem-solving power of Linux.

Development

Two Enigmatic Mathematica Programs

2025-06-29

This code showcases two Mathematica programs that generate numerical sequences. The first employs `Do` and `While` loops to iteratively build a sequence whose growth pattern depends on the position of preceding elements. The second program extends the sequence by cumulatively adding prior differences, continuing until the length surpasses 50. Both programs highlight Mathematica's capability in generating intricate sequences, with algorithms warranting further investigation.

Development Sequence Generation