Offline Reinforcement Learning Boosts Multi-Step Reasoning in LLMs

2024-12-23

Researchers introduce OREO, an offline reinforcement learning method designed to enhance the multi-step reasoning capabilities of large language models (LLMs). Building upon maximum entropy reinforcement learning, OREO jointly learns a policy model and value function by optimizing the soft Bellman equation. This addresses limitations of Direct Preference Optimization (DPO) in multi-step reasoning, specifically the need for extensive paired preference data and the challenge of effective credit assignment. Experiments demonstrate OREO's superiority over existing offline learning methods on benchmarks involving mathematical reasoning and embodied agent control.
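
For readers unfamiliar with the soft Bellman equation, a common textbook form in KL-regularized maximum entropy RL is sketched below. The notation is an assumption rather than something taken from the summary: a reference policy \pi_{\mathrm{ref}}, a temperature \beta, a per-step reward r, and deterministic transitions from s_t to s_{t+1}. The paper's exact objective may differ.

```latex
\begin{align}
Q^*(s_t, a_t) &= r(s_t, a_t) + V^*(s_{t+1}) \\
\pi^*(a_t \mid s_t) &= \pi_{\mathrm{ref}}(a_t \mid s_t)\,
  \exp\!\left(\frac{Q^*(s_t, a_t) - V^*(s_t)}{\beta}\right) \\
\Longrightarrow\quad
V^*(s_t) - V^*(s_{t+1}) &= r(s_t, a_t)
  - \beta \log \frac{\pi^*(a_t \mid s_t)}{\pi_{\mathrm{ref}}(a_t \mid s_t)}
\end{align}
```

The last line ties the policy and the value function together at every step, which is why a method built on it can learn both jointly and assign credit per step without paired preference data.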

io_uring Gains New Process Creation Functionality

2024-12-20

LWN.net reports on the development of a new process creation feature for the io_uring subsystem. This functionality is implemented via two new io_uring operations: IORING_OP_CLONE, which creates a new process, and IORING_OP_EXEC, which performs an execveat() system call to load a new program. This promises increased efficiency and allows for more complex logic, such as path searching, to be executed asynchronously within the kernel. However, the feature is still in its early stages and has limitations, such as requiring synchronous execution of io_uring operations within the new process. Future development aims to increase flexibility and eventually merge the feature into the mainline Linux kernel.

Lessons Learned in Long-Term Software Development

2024-12-22

This article summarizes lessons learned in long-term software development, emphasizing the importance of keeping code simple, carefully choosing dependencies, thorough testing, and strong teamwork. Drawing on interactions with Mastodon users and experiences at the Dutch Electoral Board, the author highlights the significant risks of excessive dependencies, complex code, and frequent team turnover in long-term projects. He advises developers to periodically review dependencies, write extensive test cases, and meticulously document code philosophy and design decisions to address the challenges of long-term maintenance and technological change. The article also underscores the benefits of open source and cautions developers against blindly chasing new technologies, recommending time-tested solutions instead.

Open-Source Game Engine boardgame.io Simplifies Turn-Based Game Development

2024-12-20

boardgame.io is an open-source JavaScript game engine designed to simplify the development of turn-based games. It automatically handles complex aspects like state management, multiplayer networking, and AI opponents, so developers can focus on writing game logic. The engine supports multiple game phases, lobbies for matchmaking, prototyping capabilities, and various view layer technologies (such as React and React Native). Its plugin system and traceable game logs further improve development efficiency and player experience.

Development turn-based game

AP5 Reference Manual: A Logic-Based Extension to Common Lisp

2024-12-21

AP5 is an extension to Common Lisp that allows users to "program" at a more "specificational" level, focusing on what the machine should do rather than how. It combines aspects of Lisp and the Gist specification language, incorporating compilable parts of Gist and offering annotation mechanisms for performance tuning. AP5 uses a relational model to represent data and supports a first-order logic language for data access and manipulation. Programmers define relations, rules, and constraints, optimizing performance through annotations. The manual details AP5's syntax, database operations, rules, types, equivalence, and implementation specifics, providing numerous examples and explanations.

Fastmail: Why We Stick With Our Own Hardware

2024-12-22

Fastmail, with 25 years of experience running its own hardware, details why it chooses this approach over cloud services. Through careful hardware planning, in-house operational expertise, and maximizing hardware lifespan, the company achieves significant cost savings. From its initial SAS and SATA drives to current NVMe SSDs and the ZFS filesystem, Fastmail has continually upgraded, using Zstandard compression for increased efficiency and reliability. A cost comparison of cloud storage, HDD upgrades, and building NVMe SSD servers led the company to choose the last option for its reliability, performance, cost-effectiveness, and ability to fully utilize the internal network.

Tech hardware

Google Proposes Remedies in DOJ Search Distribution Case

2024-12-21

Google strongly disagrees with and will appeal the court's ruling in the Department of Justice's (DOJ) search distribution lawsuit. Ahead of an April 2025 hearing, Google submitted its own remedies proposal, focusing on contracts with browser and Android device makers. The proposal aims to give browser companies and device makers more flexibility in choosing default search engines, while ensuring compliance with the court's order and avoiding harm to consumer privacy and US tech leadership. Google characterizes the DOJ's proposal, by contrast, as overly interventionist and potentially harmful to consumers and US tech competitiveness.

Tech

Early Bronze Age Massacre Unearthed in Somerset, UK

2024-12-18

Excavations at Charterhouse Warren in Somerset, UK, have revealed a shocking Early Bronze Age massacre. At least 37 men, women, and children were brutally killed and butchered, their dismembered remains discarded in a 15-meter-deep natural shaft. Cut marks and blunt force trauma on the bones indicate a deliberate act of extreme violence, possibly including cannibalism. This discovery offers a unique insight into prehistoric violence in Britain, challenging previous understandings of social stability during this period and prompting further investigation into the motivations and social context of the event.

Why HNSW Isn't the Universal Solution for Vector Databases: The Rise of IVF

2024-12-23

HNSW, while popular for its speed and accuracy in vector similarity search, faces limitations in large-scale applications due to its memory-intensive nature. This article argues that disk-based alternatives like IVF (Inverted File Index), especially when combined with quantization techniques (RaBitQ, PQ, SQ, ScaNN), offer superior speed and scalability for massive datasets. IVF, by quantizing and compressing vectors, reduces memory footprint and leverages efficient prefetching and sequential scans for significantly faster search. Insertion and deletion costs are also lower. While HNSW excels in smaller-scale applications, IVF with quantization emerges as the more advantageous choice for massive datasets.
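
The core IVF idea can be illustrated with a minimal Python sketch. It is not taken from the article: the centroids are hand-picked (a real index would typically learn them with k-means), no quantization is applied, and the names `build_ivf`, `ivf_search`, and `nprobe` are illustrative assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        nearest = min(
            range(len(centroids)),
            key=lambda c: squared_distance(vec, centroids[c]),
        )
        lists[nearest].append(vec_id)
    return lists


def ivf_search(query, vectors, centroids, lists, nprobe=1, k=1):
    """Scan only the nprobe inverted lists closest to the query."""
    probe = sorted(
        range(len(centroids)),
        key=lambda c: squared_distance(query, centroids[c]),
    )[:nprobe]
    candidates = [vid for c in probe for vid in lists[c]]
    candidates.sort(key=lambda vid: squared_distance(query, vectors[vid]))
    return candidates[:k]


# Toy data: three obvious clusters with hand-picked (assumed) centroids.
vectors = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (9.9, 0.1)]
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
lists = build_ivf(vectors, centroids)
result = ivf_search((5.0, 5.0), vectors, centroids, lists, nprobe=1, k=2)
```

Because each inverted list is a contiguous run of ids, a search touches only a few lists and scans them sequentially, which is the access pattern the article credits for disk-friendliness. Raising `nprobe` trades speed for recall.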

Development vector database

The Age of Average: Design Homogenization in the Modern World

2024-12-13

From interior design to automobiles and movie posters, a striking homogeneity pervades modern design. The article uses the example of Komar and Melamid's 'People's Choice' paintings to illustrate the surprising uniformity of aesthetic preferences. The sameness of Airbnb interiors, fast-casual architecture, car designs, and brand logos and advertising all exemplify this trend. The author argues this 'Age of Average' isn't accidental but a result of factors like technological constraints, cost pressures, and market convergence. However, this also presents an opportunity; bold brands and courageous companies that dare to be different and distinctive can thrive.

Programmer Creates Pseudo-3D Game in Bash

2024-12-20

A programmer, izabera, has developed a surprisingly impressive pseudo-3D game using the Bash scripting language. This project, a homage to the classic game Wolfenstein 3D, is open-source on GitHub. The code is concise yet the result is stunning, showcasing the power of Bash and the programmer's ingenuity. The repository includes the game source code and demonstration videos. Developers interested in learning more can check it out on GitHub.

Development Game Development

Lifelike Raven Animatronic: A Maker's Journey

2024-12-20

This blog chronicles the creation of a highly realistic raven animatronic. The author details the process from initial design and construction to programming intricate movements like beak synchronization with sound and realistic eye blinking. Challenges encountered and solutions implemented are shared, offering valuable insights for aspiring roboticists and anyone interested in the intersection of technology and art. The blog showcases a fascinating blend of creativity and engineering.

Hardware animatronics

@celine/bibhtml v3.0.3: A Web Components-Based Referencing System

2024-12-21

@celine/bibhtml, a Web Components-based referencing system for HTML documents, has released version 3.0.3. It aims to provide a user experience similar to LaTeX/BibTeX referencing, using Citation.js under the hood and gracefully degrading when citations and references are malformed or JavaScript is disabled. Supporting BibTeX, unstructured text, DOI, and Wikidata formats, it offers three custom elements that simplify reference management in HTML.

Development Reference Management

Groundbreaking Advance: Safely Compiling C to Rust

2024-12-21

Researchers have developed a novel method for safely compiling C code into Rust. This technique uses static analysis and type-directed translation to avoid reliance on Rust's `unsafe` blocks, thus guaranteeing memory safety. The method has been successfully applied to code from the HACL* cryptographic library and the EverParse libraries, resulting in an 80,000-line, pure Rust, verified modern cryptographic library, the first of its kind.

Development C compilation

NASA's GUARDIAN System Uses GNSS Data to Enhance Tsunami Early Warning

2024-12-20

NASA has developed GUARDIAN, a near real-time ionospheric monitoring software system that detects natural hazards using Global Navigation Satellite System (GNSS) data from the Global Differential GPS (GDGPS) network of NASA's Jet Propulsion Laboratory (JPL). By analyzing ionospheric perturbations, GUARDIAN supplements existing early warning systems, particularly for tsunamis. It is currently the only system publicly providing multi-GNSS near real-time total electron content (TEC) time series data over the Pacific, contributing to more accurate and timely tsunami warnings.

Website Requires JavaScript

2024-12-23

The website displays a message indicating that JavaScript needs to be enabled to run the application. This prompts users to check their browser settings and ensure that JavaScript is enabled to access and use the website's features properly.

Misc

Polyamory Doesn't Liberate; Monogamy Doesn't Protect: A Bay Area Dating Retrospective

2024-12-19

This essay reflects on a decade of dating in the Bay Area, challenging the notion that polyamory is inherently liberating or monogamy inherently protective. Drawing on personal experiences and anecdotes from friends, the author argues that neither relationship style guarantees emotional fulfillment or prevents heartbreak. Statistical data on polyamory is analyzed, revealing complexities and contradictions. The author concludes that the key to successful relationships lies in self-awareness, communication, and addressing personal attachment issues, rather than solely relying on a specific relationship structure.

Cancer Risk Decreases with Age: Study Unveils Key Protein NUPR1

2024-12-22

A new study sheds light on why cancer risk declines after age 80. Researchers found that elevated levels of a protein called NUPR1 in older mice caused cells to behave as if iron-deficient, limiting cell regeneration and thus suppressing both healthy and cancerous growth. The same mechanism was observed in human cells. Lowering NUPR1 or increasing iron levels boosted cell growth. This discovery could lead to new cancer therapies targeting iron metabolism, particularly in older individuals, and may improve lung function in those with long-term COVID-19 effects. The study also suggests that ferroptosis-based cancer treatments are less effective in older cells due to their functional iron deficiency, highlighting the importance of early intervention. Preventing carcinogenic exposures in younger individuals is even more crucial than previously thought.

OwlEars Launches OwlBrain AI for Unfiltered Customer Feedback

2024-12-19

OwlEars, the creator of the world-famous feedback platform Sarahah, has launched OwlBrain AI. This new platform allows businesses to collect pure, raw feedback directly from their customers' minds. Unlike lengthy surveys, customers can easily share their thoughts via link, QR code, or website widget. OwlBrain AI provides AI-powered insights to help businesses improve their products and services. A 15-day free trial is available, no credit card required.

A Wall Conversation Changed My Programming Career

2024-12-21

In 1983, a programmer working at a large defense contractor planned to pursue a Ph.D. in Chemistry. A chance conversation over a wall with the manager of the neighboring "Microcomputer Group" (a tinkerer) led to an invitation to a meeting about the Apple II. There, he was tasked with building a VT-100 terminal emulator in 6502 assembly language within a week so that the company president could read email at home. This experience redirected his career path: he joined the Microcomputer Group, became the company's sole PC programmer, and ultimately started his own company. Years later, he reflected on how chance encounters and interpersonal connections significantly shaped his life.

Development career opportunity

Retro Revival: Bringing a Tandy Coco Back Online with FujiNet

2024-12-20

This article details the author's journey in connecting an old Tandy Coco computer to the internet using the FujiNet project, an ambitious open-source initiative aiming to be the only peripheral needed for vintage computers. The author faced challenges during the assembly process, including soldering difficulties, hardware bugs, and software compatibility issues. Despite these hurdles, they successfully connected to the internet and ran various applications, including an ISS tracker and games. The experience highlights the vibrancy of the open-source community and the potential of retrocomputing, showcasing the fun of hardware repair and software development.

DataFuel API: Turn Websites into LLM-Ready Data

2024-12-13

DataFuel is a powerful API that transforms websites and knowledge bases into LLM-ready data with a single query. It effortlessly scrapes entire websites, delivering clean, markdown-structured data perfect for RAG systems and AI model training. No complex scraping code is needed. DataFuel offers multiple output formats, including GPT-4 powered extraction for highly accurate results, and a free tier to get started. Trusted by industry leaders, DataFuel simplifies the data preparation process for building powerful AI applications.

Rust Compiler: A Query-Based Incremental Compilation Architecture

2024-12-13

To address the efficiency issues of traditional pipeline-based compilation, the Rust compiler employs a query-based incremental compilation architecture. This architecture breaks the compilation process into a series of interdependent queries and uses a compilation database to cache intermediate results, so that only the affected parts of the code are recompiled after a change. Much like a build system's dependency management, this significantly improves compilation speed and is especially beneficial in scenarios like IDE integration. Although it introduces complexity, this approach offers a more stable and efficient incremental compilation experience for Rust than gradual improvements to traditional methods would, and it is now the default for development builds.
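
The query-and-cache idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not rustc's actual implementation: the `QueryEngine` class, the revision counter, and the `parse` and `count_tokens` queries are all hypothetical names chosen for illustration.

```python
class QueryEngine:
    """Memoizes query results; re-runs a query only if an input it read changed."""

    def __init__(self):
        self.inputs = {}      # input name -> (value, revision when set)
        self.cache = {}       # query key -> (result, {input name: revision seen})
        self.revision = 0
        self.executions = []  # log of queries that actually ran
        self._stack = []      # dependency maps of in-flight queries

    def set_input(self, name, value):
        self.revision += 1
        self.inputs[name] = (value, self.revision)

    def read_input(self, name):
        value, rev = self.inputs[name]
        for deps in self._stack:  # record the read for every in-flight query
            deps[name] = rev
        return value

    def query(self, fn, arg):
        key = (fn.__name__, arg)
        cached = self.cache.get(key)
        if cached is not None:
            result, deps = cached
            if all(self.inputs[n][1] == rev for n, rev in deps.items()):
                for outer in self._stack:  # callers inherit cached dependencies
                    outer.update(deps)
                return result
        self._stack.append({})
        result = fn(self, arg)
        deps = self._stack.pop()
        self.cache[key] = (result, deps)
        self.executions.append(key)
        return result


def parse(engine, filename):
    """Hypothetical query: split a source file into whitespace-separated tokens."""
    return engine.read_input(filename).split()


def count_tokens(engine, filename):
    """Hypothetical query that depends on the parse query."""
    return len(engine.query(parse, filename))


engine = QueryEngine()
engine.set_input("a.rs", "fn main ( ) { }")
engine.set_input("b.rs", "struct S ;")
first = (engine.query(count_tokens, "a.rs"), engine.query(count_tokens, "b.rs"))

engine.set_input("b.rs", "struct S { x : u8 }")  # edit only b.rs
second = (engine.query(count_tokens, "a.rs"), engine.query(count_tokens, "b.rs"))
```

After the edit to `b.rs`, only the two queries that read `b.rs` run again; the results for `a.rs` come straight from the cache, which is the behavior the summary describes.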

Cultural Evolution of Cooperation Among LLM Agents

2024-12-18

Researchers investigated whether a 'society' of Large Language Model (LLM) agents can learn mutually beneficial social norms despite incentives to defect. Experiments revealed significant differences in the evolution of cooperation across base models, with Claude 3.5 Sonnet significantly outperforming Gemini 1.5 Flash and GPT-4o. Furthermore, Claude 3.5 Sonnet leveraged a costly punishment mechanism to achieve even higher scores, a feat not replicated by the other models. This study proposes a new benchmark for LLMs focused on the societal implications of LLM agent deployment, offering insights into building more robust and cooperative AI agents.

Hugging Face Open-Sources 'Search and Learn'

2024-12-20

Hugging Face has open-sourced a project called 'Search and Learn,' focusing on the scalability of search and learning methods with massive computation. The project includes replicable experimental results with provided code and configuration files. The research highlights the power of general-purpose methods in scaling with increased computation, emphasizing search and learning as two methods that demonstrate excellent scalability.

Spotify's Shady Secret: Fake Artists and Inflated Play Counts Exposed

2024-12-21

A year-long investigation reveals Spotify's deceptive practices. A program called "Perfect Fit Content" (PFC) involves partnerships with production companies to create and promote fake artists and tracks, artificially inflating play counts to reduce royalty costs and boost profits. These fake tracks, often ambient, classical, electronic, jazz, or lo-fi, are strategically placed in playlists designed for background listening. The Spotify CEO's significant stock sales around the time of the revelations further fueled controversy. This scandal raises serious concerns about transparency and fairness in the music industry, prompting calls for congressional investigation and a more transparent music streaming ecosystem.

Lightweight Safety Classification Using Pruned Language Models

2024-12-19

Researchers introduce Layer Enhanced Classification (LEC), a novel lightweight technique for content safety and prompt injection classification in Large Language Models (LLMs). LEC trains a streamlined Penalized Logistic Regression (PLR) classifier on the hidden state of an LLM's optimal intermediate transformer layer. Combining the efficiency of PLR with the sophisticated language understanding of LLMs, LEC outperforms GPT-4o and specialized models. Small general-purpose models like Qwen 2.5 and architectures such as DeBERTa v3 prove robust feature extractors, effectively training with fewer than 100 high-quality examples. Crucially, intermediate transformer layers often outperform the final layer. A single general-purpose LLM can classify content safety, detect prompt injections, and generate output, or smaller LLMs can be pruned to their optimal intermediate layer for feature extraction. Consistent results across architectures suggest robust feature extraction is inherent to many LLMs.
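
The classifier side of this idea can be sketched in plain Python. The example below is an illustration under loud assumptions: the two-dimensional "hidden state" vectors are made up and stand in for real intermediate-layer activations, the penalty is a plain L2 term, and `train_plr` and `predict` are hypothetical names rather than the authors' code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_plr(features, labels, l2=0.1, lr=0.5, epochs=500):
    """L2-penalized logistic regression trained by batch gradient descent."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [l2 * w for w in weights]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = sigmoid(z) - y
            for j in range(dim):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(weights, bias, x):
    """Return True when the example is classified as unsafe (label 1)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z) >= 0.5


# Made-up stand-ins for intermediate-layer hidden states (assumption).
safe = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.0)]
unsafe = [(0.1, 0.9), (0.2, 0.8), (0.0, 0.7)]
features = safe + unsafe
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_plr(features, labels)
flagged = predict(weights, bias, (0.05, 0.95))
allowed = predict(weights, bias, (0.95, 0.05))
```

In a real pipeline the feature vectors would come from a chosen transformer layer of the model; the small number of trainable parameters is what makes training on fewer than 100 examples plausible.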
