Radix Sort Beats Hash Tables: A Performance Showdown for Counting Unique Values

2025-09-11

When counting unique values in a large array of mostly-unique uint64s, a well-tuned radix sort is typically faster than a hash table. By using memory bandwidth efficiently and fusing hashing with the sorting passes, radix sort runs up to 1.5x faster than tuned hash tables on datasets larger than 1MB, and up to 4x faster than Rust's excellent Swiss Table hash tables. Radix sort's performance degrades on non-uniform data distributions, so the input is first passed through an invertible hash function to maintain efficiency. The article benchmarks both approaches under varying data sizes and access frequencies, and discusses how to choose between them in real-world applications.
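The approach summarized above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the function names are hypothetical, the mixer is assumed to be the 64-bit finalizer from MurmurHash3 (chosen here only because it is a bijection on 64-bit integers), and hashing is done as a separate pass rather than fused into the sort.

```python
MASK64 = (1 << 64) - 1


def mix64(x: int) -> int:
    """Invertible 64-bit mixer (assumed choice: the MurmurHash3 finalizer).

    Every step (xor-shift, multiply by an odd constant mod 2**64) is a
    bijection, so distinct inputs always map to distinct outputs.
    """
    x ^= x >> 33
    x = (x * 0xFF51AFD7ED558CCD) & MASK64
    x ^= x >> 33
    x = (x * 0xC4CEB9FE1A85EC53) & MASK64
    x ^= x >> 33
    return x


def radix_sort_u64(values, bits: int = 8) -> list:
    """LSD radix sort for unsigned 64-bit integers, `bits` bits per pass."""
    n_buckets = 1 << bits
    digit_mask = n_buckets - 1
    src = list(values)
    for shift in range(0, 64, bits):
        # Histogram of the current digit.
        counts = [0] * n_buckets
        for v in src:
            counts[(v >> shift) & digit_mask] += 1
        # Exclusive prefix sum turns counts into starting offsets.
        total = 0
        for i in range(n_buckets):
            counts[i], total = total, total + counts[i]
        # Stable scatter into the destination buffer.
        dst = [0] * len(src)
        for v in src:
            d = (v >> shift) & digit_mask
            dst[counts[d]] = v
            counts[d] += 1
        src = dst
    return src


def count_unique(values) -> int:
    """Count distinct uint64 values by hashing, sorting, and scanning."""
    hashed = radix_sort_u64(mix64(v) for v in values)
    if not hashed:
        return 0
    # After sorting, equal values are adjacent, so count the boundaries.
    return 1 + sum(1 for a, b in zip(hashed, hashed[1:]) if a != b)
```

Because the mixer is invertible, the number of distinct hashed values equals the number of distinct inputs, so no collision handling is needed. Pure Python will not show the speedups the article reports; the sketch only illustrates the structure of the technique.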

Development

Chipmakers' Software Ecosystem Anxiety

2025-02-27

Chipmakers often worry about others writing software that interfaces with their chips, fearing that poorly written software will reflect badly on their products. This fear stems partly from the close relationship between hardware and software, and partly from an undervaluation of external engineers' capabilities. However, Joy's Law states that "No matter who you are, most of the smartest people work for someone else." Chipmakers need to acknowledge this and actively embrace external engineers to build successful software ecosystems.

Smooth Transition: Getting Started with Linux from Windows

2025-07-18

For users switching from Windows to Linux, Linux Mint and Zorin OS are excellent choices. Volunteers should assist users in familiarizing themselves with the Linux environment and finding Linux equivalents to their Windows software. Demonstrations, such as using a live USB or dedicated Linux demo machines, can help users experience Linux firsthand. Dual-booting is an option if users want to keep both Windows 10 and Linux, but volunteers should advise that Windows 10 will become outdated and insecure, and should be used only for specific applications, while Linux should be used for daily tasks.

Development

SignalGate Continues: 410GB of TeleMessage Data Dumped

2025-05-20

Security researcher Micah Lee revealed a massive 410GB data breach from TeleMessage, an Israeli firm providing archiving services for encrypted messaging apps like Signal and WhatsApp. TeleMessage's software was used by US government officials, leading to the 'SignalGate' scandal. The leaked data includes sensitive information, such as plaintext messages and metadata, highlighting vulnerabilities in TeleMessage's products and the risks associated with government reliance on encrypted message archiving services. The release comes from Distributed Denial of Secrets.

Tech

Math Academy: Effective Drill or Conceptual Roadblock?

2025-04-13

Math Academy is a popular online math learning platform praised for its gamified approach. However, reviews from math educators are mixed. The author explores its strengths and weaknesses through personal experience, highlighting its effectiveness in procedural fluency (mastering steps) but its shortcomings in conceptual understanding. Math Academy is best used as a supplement to deepen understanding gained from textbooks or lectures, not as the sole learning method. The author advocates prioritizing conceptual understanding, using tools like Math Academy for targeted practice.

Education

Reliving the Dawn of Space Exploration: Restored Mercury and Gemini Photos

2025-09-16

Celebrating the 60th anniversary of the Gemini missions, a new book, *Gemini & Mercury Remastered*, vividly brings to life the thrilling early days of American space exploration. Featuring 300 meticulously restored NASA photographs from the Mercury and Gemini programs, the book delves into the stories behind the images, showcasing the courage and pioneering spirit of America's first astronauts. Author Andy Saunders discusses his inspiration and favorite stories in an interview, taking us back to the very beginning of human spaceflight and the momentous first escape from Earth.

OpenBenches' Address Conundrum: Geolocating 40,000 Benches Elegantly

2025-04-27

OpenBenches, a crowdsourced database of nearly 40,000 memorial benches, faces a challenge: converting latitude/longitude coordinates into human-readable addresses. Many benches lack formal addresses, residing in parks, etc. Existing geocoding APIs provide overly detailed or irrelevant information. The author explores using multiple APIs and Points of Interest (POIs) for automated address generation, but encounters issues with language localization, address formatting inconsistencies, and POI accuracy. Balancing address precision with user-friendliness and internationalization remains a key challenge.

NYC Congestion Pricing Tracker: Real-time Data Visualization

2025-01-06

Benjamin and Joshua Moshes have created a website, the "Congestion Pricing Tracker," that provides real-time data on New York City's congestion pricing. The site features an interactive map and data visualizations, allowing users to easily see congestion pricing rates and traffic conditions in different areas. This is not only useful for individuals planning their commutes, but also provides valuable data for researchers and urban planners to optimize traffic management and policy. It showcases the power of civic tech in addressing urban challenges.

Rebuilding the C++ Standard Library from Scratch: The Pystd Project

2025-03-25

Tired of the C++ Standard Library's (STL) abysmal compile times and unreadability, a self-employed open-source developer decided to build a replacement from scratch: Pystd. Taking inspiration from the Python standard library, he incrementally implemented file handling, string manipulation, UTF-8 validation, hash maps, vectors, and sorting. The result? A functional application in under 1000 lines of code, comparable to the STL version. Pystd boasts significantly faster compilation and smaller executable sizes. A unique versioning scheme (e.g., pystd2025) ensures perfect ABI stability, easing future upgrades and maintenance.

Development Standard Library

Century-Old Problem Solved: Mathematicians Unify Three Theories of Fluid Physics

2025-04-26

Mathematicians from the University of Chicago and the University of Michigan have posted a paper to arXiv claiming to have solved a subgoal of Hilbert's sixth problem: unifying three physical theories describing fluid motion—Newton's laws of motion, the Boltzmann equation, and the Euler-Navier-Stokes equations. The achievement bridges the microscopic, mesoscopic, and macroscopic levels by proving that, in the limit of infinitely many particles with vanishing size, the statistical behavior of Newton's equations converges to the solution of the Boltzmann equation. This strengthens the mathematical foundations of physics.

NASA's X-59 Quiet Supersonic Jet Completes First Taxi Tests

2025-07-22

NASA's X-59 experimental quiet supersonic aircraft successfully completed its first low-speed taxi tests on July 10th at U.S. Air Force Plant 42 in Palmdale, California. This marks a significant step towards the aircraft's first flight, with further high-speed taxi tests planned in the coming weeks. The tests focused on validating critical systems like steering and braking, ensuring the aircraft's stability and control. The X-59 is part of NASA's Quesst mission to demonstrate quieter supersonic flight, aiming to replace the sonic boom with a softer 'thump'. Data collected will inform the development of new noise regulations for supersonic commercial flights.

Tech

Democrats' Failing Strategy of Mildness: A Game Without Rules

2025-07-22

This article criticizes the Democrats' weak and compromising response to the Republicans' aggressive political tactics. Examples cited include the passive acceptance of DeJoy as Postmaster General, the ineffective response to the rejection of Obama's Supreme Court nominee, and the inaction regarding Trump's incitement of the January 6th insurrection. The author argues that Democrats cling to the illusion of cooperation while Republicans disregard rules and solely pursue victory. This strategic disparity leads to repeated setbacks for the Democrats, ultimately harming their own interests.

Misc Democrats

Running Fennel from Emacs: A Powerful Extension

2025-07-23

This article introduces `require-fennel.el`, an Emacs extension that enables running Fennel (a Lua dialect) within Emacs. It achieves this by communicating with a Fennel REPL, allowing data conversion and function calls between Emacs Lisp and Fennel. The author demonstrates loading Fennel modules, calling Fennel functions, and using Fennel data structures in Emacs Lisp. Furthermore, the extension supports calling Emacs Lisp functions from Fennel, enabling two-way interaction. This allows developers to leverage Fennel's conciseness and Emacs's power for a more robust Emacs environment.

Development

DeepSeek Infrastructure Profiling Data Released

2025-02-27

DeepSeek is publicly sharing profiling data from its training and inference framework to help the community understand its communication-computation overlap strategies and low-level implementation details. The data, captured using the PyTorch Profiler, can be visualized directly in Chrome or Edge browsers. The analysis simulates a perfectly balanced MoE routing strategy and covers training, prefilling, and decoding phases. Different configurations (e.g., EP64/TP1, EP32/TP1, EP128/TP1) and micro-batching strategies are optimized for computation and communication overlap to improve efficiency.

Development Profiling

Mountain Biking Spinal Cord Injuries Surpass Hockey and Other High-Risk Sports

2025-01-08

New research from UBC's Faculty of Medicine reveals a shockingly high number of spinal cord injuries from mountain biking, exceeding those from hockey and other high-risk sports. Between 2008 and 2022, 58 people in British Columbia sustained spinal cord injuries while mountain biking, compared to only 3 from ice hockey. In recent years, mountain biking-related injuries have been seven times higher than those from skiing and snowboarding. The annual number in BC rivals or surpasses those from amateur football across the entire US. The study, published in *Neurotrauma Reports*, found most injured were healthy young men (93% male, average age 35.5). 77.5% were injured after going over their handlebars. While most wore helmets (86.3%), this didn't eliminate risk. The estimated lifetime cost of these injuries to BC is $195.4 million. The study calls for increased awareness and a discussion on safety improvements.

Relational Graph Transformers: Unleashing AI's Potential in Relational Databases

2025-04-28

Traditional machine learning struggles to fully capture the valuable insights hidden in the complex relationships between tables within enterprise data. Relational Graph Transformers (RGTs) represent a breakthrough, treating relational databases as interconnected graphs, eliminating the need for extensive feature engineering and complex data pipelines. RGTs significantly improve the efficiency and accuracy of AI in extracting intelligence from business data, showing immense potential in applications like customer analytics, recommendation systems, fraud detection, and demand forecasting. They offer a powerful new tool for both data scientists and business leaders.

Lost Nicknames and the Origins of Surnames

2025-02-10

Many English surnames derive from patronyms, often nicknames. For example, "Jackson" comes from "Jack" (a nickname for John). This article explores numerous now-obscure nicknames and their resulting surnames, such as "Wat" (a nickname for Walter) yielding "Watts," "Watson," "Watkins"; "Gib" (a nickname for Gilbert) yielding "Gibbs," "Gibson"; and "Hob" (a nickname for Robert) yielding "Hobbs," "Hobson," "Hobkins." The author invites further examples and adds the nickname "Hick" (for Richard) and its derivatives, and speculates on "-mott" possibly indicating an in-law.

Conquering Offline App Sync Nightmares: Hybrid Logical Clocks and CRDTs to the Rescue

2025-09-22

Many offline-first apps fail to deliver on their offline support promises, with data synchronization being a major hurdle. This article presents two solutions: Hybrid Logical Clocks (HLCs) solve event ordering, keeping event sequences consistent across multiple devices even while offline; Conflict-Free Replicated Data Types (CRDTs) resolve data conflicts, for example with a Last-Write-Wins (LWW) strategy, and guarantee eventual consistency. The author also recommends SQLite as the local database and introduces their own SQLite-Sync extension for building simple and reliable cross-platform offline-first applications.
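To make the two ideas concrete, here is a minimal sketch of a hybrid logical clock paired with a Last-Write-Wins register. It follows the commonly described HLC update rules and is not taken from the article or from SQLite-Sync; the class names, the `(wall_ms, counter, node_id)` timestamp layout, and the injected `now_ms` clock are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

# Hypothetical timestamp layout: (wall-clock ms, logical counter, node id).
# Tuples compare lexicographically, so the node id only breaks exact ties.
Timestamp = Tuple[int, int, str]


class HLC:
    """Hybrid logical clock; `now_ms` is injected so it can be faked in tests."""

    def __init__(self, node_id: str, now_ms: Callable[[], int]):
        self.node_id = node_id
        self.now_ms = now_ms
        self.wall = 0
        self.counter = 0

    def tick(self) -> Timestamp:
        """Timestamp a local event or an outgoing message."""
        pt = self.now_ms()
        if pt > self.wall:
            self.wall, self.counter = pt, 0
        else:
            # Physical clock did not advance: bump the logical counter.
            self.counter += 1
        return (self.wall, self.counter, self.node_id)

    def receive(self, remote: Timestamp) -> Timestamp:
        """Merge a remote timestamp so later local events sort after it."""
        pt = self.now_ms()
        r_wall, r_counter, _ = remote
        new_wall = max(self.wall, r_wall, pt)
        if new_wall == self.wall and new_wall == r_wall:
            self.counter = max(self.counter, r_counter) + 1
        elif new_wall == self.wall:
            self.counter += 1
        elif new_wall == r_wall:
            self.counter = r_counter + 1
        else:
            self.counter = 0
        self.wall = new_wall
        return (self.wall, self.counter, self.node_id)


@dataclass
class LWWRegister:
    """Last-Write-Wins register: the write with the largest timestamp wins."""

    value: Any = None
    stamp: Timestamp = (0, 0, "")

    def set(self, value: Any, stamp: Timestamp) -> None:
        self.merge(LWWRegister(value, stamp))

    def merge(self, other: "LWWRegister") -> None:
        # Commutative, associative, and idempotent, so replicas converge.
        if other.stamp > self.stamp:
            self.value, self.stamp = other.value, other.stamp
```

In this sketch a device with a lagging wall clock still produces timestamps that sort after any event it has already seen, which is what lets the register pick the same winner on every replica regardless of merge order.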

Development

China Investigates Apple's App Store: Tech Giant Faces New Scrutiny

2025-02-05

China's market regulator is investigating Apple's App Store policies and fees, potentially adding fuel to the US-China trade war. The probe focuses on Apple's up to 30% commission on in-app purchases and its restriction of external payment services and app stores. This stems from long-standing disputes between Apple and developers like Tencent and ByteDance over iOS App Store policies. While not yet a formal investigation, further action could be taken if Apple fails to address concerns. Apple faces intense competition from domestic rivals like Huawei in China, adding pressure amid this regulatory scrutiny.

Building Cost-Effective AI Production Systems: A Taco Bell Approach to Cloud Computing

2025-03-03

This article explores building cost-effective AI production systems. Drawing parallels to Taco Bell's simplified menu, the author advocates for constructing complex systems using simple, industry-standard components (like S3, Postgres, HTTP). The focus is on minimizing cloud computing costs, particularly network egress fees. By using object storage with zero egress fees (like Tigris) and dynamically scaling compute instances up and down based on demand, costs are dramatically reduced. The importance of choosing dependencies to minimize vendor lock-in is stressed, with an example architecture provided using HTTP requests, DNS lookup, Postgres or object storage, and Kubernetes, allowing for portability across cloud providers.

AI

Archaeologists Use Lewis & Clark's Laxatives to Find Lost Campsites

2025-09-01

The Lewis and Clark expedition carried 600 giant laxative pills, nicknamed "thunder-clappers," that contained mercury, which does not break down in soil. Traces of these pills are helping archaeologists pinpoint the expedition's campsites. High mercury levels in soil indicate old latrine pits, and military manuals help reconstruct the camp layouts. This discovery highlights the limitations of early 19th-century medical practices, where "heroic medicine," while sometimes effective, often did more harm than good.

Tech

Real-time 3D Human Motion Detection and Visualization using WiFi CSI

2025-08-26

WiFi-3D-Fusion is an open-source project that leverages Channel State Information (CSI) from local Wi-Fi to perform real-time human motion detection and 3D visualization. Supporting both ESP32-CSI and Nexmon data acquisition, it employs advanced CNNs for person detection and tracking, including multi-person identification and re-identification. A continuous learning pipeline allows the model to automatically improve during operation. Visualization is offered through both a web interface and a terminal-based pipeline. Optional integrations with Person-in-WiFi-3D, NeRF², and 3D Wi-Fi Scanner are also provided.

Anthropic to Train AI Models on User Data, Opt-Out Required

2025-08-29

Anthropic will begin training its AI models, including Claude, on user chat transcripts and coding sessions unless users opt out by September 28th. This affects all consumer tiers and extends data retention to five years. A prominent 'Accept' button in the update notification risks users agreeing without fully understanding the implications. Anthropic says it applies data protection measures, and users who accept inadvertently can change their preference in settings, though data already used for training cannot be withdrawn.

arXivLabs: Experimental Projects with Community Collaborators

2025-05-27

arXivLabs is a framework that enables collaborators to develop and share new arXiv features directly on the arXiv website. Individuals and organizations working with arXivLabs embrace and adhere to our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners who share them. Have an idea for a project that will benefit the arXiv community? Learn more about arXivLabs.

Development

DOJ Greenlit to Sell $6.5B in Seized Silk Road Bitcoin

2025-01-10

The US Department of Justice (DOJ) has received court approval to sell approximately 69,000 Bitcoin seized from the Silk Road darknet marketplace, currently valued at around $6.5 billion. This decision concludes a long-standing legal battle. Despite objections and attempts to block the sale by Battle Born Investments, the DOJ prevailed. Citing Bitcoin's price volatility, the DOJ argued for a swift sale to mitigate potential losses. The sale, managed by the US Marshals Service, will be one of the largest sales of seized cryptocurrency in history.

Tech Silk Road DOJ

Disney Data Breach: 25-Year-Old Pleads Guilty to Stealing 1TB of Confidential Data

2025-05-03

A 25-year-old California man, Ryan Mitchell Kramer, pleaded guilty to hacking a Disney employee's computer and stealing over 1 terabyte of confidential data. He disguised malware as an AI art generator, gaining access to the victim's computer and subsequently stealing data from numerous Disney Slack channels. This included employee personal information, internal communications, and recruitment data. Kramer then threatened the victim and publicly released the stolen information. Disney and the FBI are investigating the incident.

Tech

Xbox Cloud Gaming Goes Cross-Device: Seamless Play Across Consoles and PCs

2025-07-22

Microsoft is testing updates to the Xbox PC app and consoles enabling seamless cloud gaming across devices. A new play history section will track cloud games played across Xbox consoles, PCs, and handhelds. This means even console-exclusive games, playable via cloud, will appear in your recent games list on PC and be accessible via Xbox Cloud Gaming. The update enhances cross-device continuity, letting players resume games from where they left off, regardless of platform.

CodeScientist: An AI-Powered Tool for Automated Scientific Discovery – Costs and Risks

2025-04-09

CodeScientist is an autonomous agent leveraging LLMs for automated scientific discovery. It generates, debugs, and runs experiments, but costs vary depending on debugging iterations, prompt size, etc., averaging around $4 per experiment. Users must carefully manage API keys and monitor usage to avoid high costs. The generated code might contain API keys; exclusion patterns are recommended to prevent accidental commits.

Development Cost Management

AP5 Reference Manual: A Logic-Based Extension to Common Lisp

2024-12-21

AP5 is an extension to Common Lisp that allows users to "program" at a more "specificational" level, focusing on what the machine should do rather than how. It combines aspects of Lisp and the Gist specification language, incorporating compilable parts of Gist and offering annotation mechanisms for performance tuning. AP5 uses a relational model to represent data and supports a first-order logic language for data access and manipulation. Programmers define relations, rules, and constraints, and tune performance through annotations. The manual details AP5's syntax, database operations, rules, types, equivalence, and implementation specifics, providing numerous examples and explanations.

The Forecasting Company: Seeking Founding Software Engineer

2025-08-28

A startup building the ultimate forecasting foundation model is seeking a founding software engineer. This full-stack role involves developing customer-facing APIs, robust data pipelines, and a web application. Ideal candidates will be proficient in Python and TypeScript, comfortable with React, and have experience building projects from scratch. Benefits include generous equity, daily lunch vouchers, an on-site gym, a mobility pass, full health insurance, and more.

Development Forecasting Model