Evaluating LLMs in Text Adventures: A Novel Approach

2025-08-12

This article proposes a method for comparing how well large language models (LLMs) play text adventure games. Each model is given a set of in-game achievements as goals and a fixed number of turns; its score is the number of achievements it completes within that limit. Because text adventures offer so much freedom and branching, the score is not meant as an absolute measure of performance. Even powerful LLMs cannot explore every branch within the turn limit, so the result reflects relative capability between models rather than absolute gaming skill.
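
The scoring rule described above can be sketched in a few lines of Haskell. The event log format and the names `Turn`, `Achievement`, and `score` are hypothetical, chosen only for illustration; the article's actual harness may record progress differently.

```haskell
import Data.List (nub)

-- Hypothetical types: a turn number and the name of an achievement.
type Turn = Int
type Achievement = String

-- Count the distinct goal achievements unlocked within the turn limit.
-- Events after the limit, repeats, and achievements outside the goal
-- list do not count towards the score.
score :: Turn -> [Achievement] -> [(Turn, Achievement)] -> Int
score limit goals events =
  length (nub [a | (t, a) <- events, t <= limit, a `elem` goals])
```

Running the same goals and turn limit against logs from two different models gives two scores that are only meaningful relative to each other, which matches the comparison the article is after.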


Text Adventure Development: Balancing Scope and Detail

2025-07-07

Developing a text adventure requires careful scope management. The author recounts three attempts, starting with overly ambitious goals and scaling down each time until a game was finally completed. The article examines the trade-off between 'breadth' and 'detail' in text adventure design, comparing the detail-focused Lockout with the breadth-focused The Plot of the Phantom and weighing the advantages and disadvantages of each style; modern players, it notes, tend to prefer detailed experiences. It concludes by discussing the cost and time commitment of text adventure development and why managing scope is crucial for creating a fun game.


FizzBuzz in Monads: A Functional Approach

2025-05-26

This article presents a functional approach to the FizzBuzz problem using monads. The core idea uses the guard-sequence pattern to check divisibility by 3, 5, and 7, producing 'fizz', 'buzz', and 'zork' respectively, or Nothing when the number is not divisible. `mconcat` combines the results, and `fromMaybe` falls back to the number itself when the combined result is Nothing, yielding the correct FizzBuzz output. The result is a compact solution built from standard library combinators.
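
A minimal sketch of that idea follows. It assumes "guard-sequence" means `guard` followed by the sequencing operator `*>` in the `Maybe` monad; the helper name `word` is made up here, and the original article's code may be written differently.

```haskell
import Control.Monad (guard)
import Data.Maybe (fromMaybe)

fizzbuzz :: Int -> String
fizzbuzz n =
  fromMaybe (show n) (mconcat [word 3 "fizz", word 5 "buzz", word 7 "zork"])
  where
    -- guard yields Nothing when the divisor does not match, which
    -- short-circuits the sequence; otherwise the word is returned.
    word d s = guard (n `mod` d == 0) *> Just s
```

Because `Maybe String` is a monoid, `mconcat` concatenates the words that are present and returns `Nothing` only when none are, so `fizzbuzz 15` gives `"fizzbuzz"`, `fizzbuzz 21` gives `"fizzzork"`, and `fizzbuzz 2` gives `"2"`.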

Development

Advent of Code: Elegant Solution to a Stateful Parsing Problem

2025-04-09

The latest Advent of Code puzzle involves interpreting `do()` and `don't()` instructions that enable or disable the contribution of subsequent `mul` instructions to a sum. Regular expressions handle this statefulness poorly, since they recognize regular languages and carry no state from one match to the next. The author instead writes a parser and lifts it into a state transformer to obtain a stateful parser. This parser handles `do()`, `don't()`, and `mul` instructions, processing roughly 1 MB of input in 0.12 seconds, a significant improvement over a regex-based approach.
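
The sketch below shows one way the lifting could look. It is not the author's code: the tiny hand-rolled `Parser` (a `StateT String Maybe`), all helper names, and the choice of the `transformers` package are assumptions made so the example stays self-contained. It also accepts any run of digits as a number and starts with `mul` enabled, both of which are assumptions.

```haskell
import Control.Applicative (many, (<|>))
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT (..), evalStateT, get, put)
import Data.Char (isDigit)
import Data.List (stripPrefix)

-- A minimal parser: a state transformer over the remaining input.
type Parser = StateT String Maybe

-- The same parser lifted into a second state layer holding the enabled flag.
type StatefulParser = StateT Bool Parser

-- Match a literal string at the current position.
string :: String -> Parser ()
string s = StateT $ \input -> (,) () <$> stripPrefix s input

-- Parse one or more decimal digits.
number :: Parser Int
number = StateT $ \input -> case span isDigit input of
  (ds, rest) | not (null ds) -> Just (read ds, rest)
  _ -> Nothing

-- Consume any single character; used to skip over noise.
anyChar :: Parser ()
anyChar = StateT $ \input -> case input of
  (_ : rest) -> Just ((), rest)
  [] -> Nothing

-- mul(a,b) yields the product a * b.
mul :: Parser Int
mul = (*) <$> (string "mul(" *> number) <*> (string "," *> number <* string ")")

-- A mul instruction contributes its product only while enabled.
gatedMul :: StatefulParser Int
gatedMul = do
  p <- lift mul
  enabled <- get
  pure (if enabled then p else 0)

-- One step: toggle the flag, evaluate a mul, or skip one character.
step :: StatefulParser Int
step =
  (0 <$ (lift (string "do()") *> put True))
    <|> (0 <$ (lift (string "don't()") *> put False))
    <|> gatedMul
    <|> (0 <$ lift anyChar)

-- Sum every contribution in the input, starting in the enabled state.
total :: String -> Int
total input = case runStateT (evalStateT (many step) True) input of
  Just (values, _) -> sum values
  Nothing -> 0
```

For example, `total "mul(2,3)don't()mul(4,5)do()mul(1,7)"` evaluates to 13: the first and last products count, while the one between `don't()` and `do()` is ignored. When an alternative fails partway through, `<|>` restores both the input and the flag, which is what makes the layered-state design convenient.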


Haskell: Surprisingly Procedural?

2025-01-19

This article challenges common misconceptions about Haskell, arguing that it excels as a procedural language. It examines Haskell's treatment of side effects as first-class values, explains the mechanics underlying `do` blocks, and demonstrates functions such as `pure`, `fmap`, and `liftA2` for manipulating those values. The author showcases `sequenceA` and `traverse` for handling collections of side effects and illustrates how these features enable efficient metaprogramming. A complex example demonstrates Haskell's strengths in managing state and caching, contrasting them with the limitations of other languages. The article also explores advanced concepts such as the `State` monad for improved control and streaming results.
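
As a small illustration of effects as ordinary values, the sketch below builds one action and then combines it with the functions named above. The counter and the names `tick` and `demo` are invented for this example and do not come from the article.

```haskell
import Control.Applicative (liftA2)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- An action that increments a counter and returns its new value.
tick :: IORef Int -> IO Int
tick counter = modifyIORef' counter (+ 1) *> readIORef counter

demo :: IO (Int, Int, [Int], [Int])
demo = do
  counter <- newIORef 0
  -- Binding the action to a name runs nothing; it is just a value.
  let next = tick counter
  -- fmap transforms the result the action will produce when run.
  a <- fmap (* 10) next
  -- liftA2 runs two actions in order and combines their results.
  b <- liftA2 (+) next next
  -- sequenceA turns a list of actions into one action returning a list.
  cs <- sequenceA [next, next]
  -- traverse builds an action per element and runs them in order.
  ds <- traverse (\k -> fmap (+ k) next) [100, 200]
  pure (a, b, cs, ds)
```

Running `demo` returns `(10, 5, [4, 5], [106, 207])`: the same `next` value is executed seven times in total, and each execution observes the counter left behind by the previous one.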

Development Side Effects

The Kelly Criterion: A Mathematical Approach to Insurance Decisions

2024-12-21

This article explores how the Kelly criterion can be used to make rational decisions about insurance. The author debunks common misconceptions about insurance, arguing that whether to buy it is a mathematical problem, not a philosophical one. The core idea is that insurance prevents large drawdowns in wealth, which speeds up compound growth. A formula is presented for the value (V) of insurance as a function of current wealth, the premium, the probability of an accident, and its cost. Motorcycle and helicopter insurance examples illustrate the calculations and the impact of the deductible. The author also explains how insurance companies profit and why costs are relative rather than absolute.
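
The summary does not reproduce the article's formula, so the sketch below uses a common simplification instead: the value of insurance is taken to be the difference in expected log wealth between buying and not buying. It assumes a single possible accident, no deductible, and an accident cost smaller than current wealth; the function and parameter names are illustrative only.

```haskell
-- Expected log-wealth gain from buying insurance under the simplified
-- model above (an assumption, not necessarily the article's formula).
-- A positive result favours insuring; a negative one favours not insuring.
insuranceValue
  :: Double -- current wealth
  -> Double -- premium
  -> Double -- probability of the accident
  -> Double -- cost of the accident
  -> Double
insuranceValue wealth premium probability cost =
  log (wealth - premium)
    - (probability * log (wealth - cost) + (1 - probability) * log wealth)
```

With a premium of 100 against a 1% chance of a 9,000 loss, the result is positive for a wealth of 10,000 and negative for a wealth of 1,000,000. The same policy is worth buying for one person and not for another, even though its expected payout (90) is below the premium in both cases.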
