LLM Benchmark: Pelican on a Bicycle

2024-12-16

Simon Willison created an unusual LLM benchmark: ask each model to generate an SVG image of a pelican riding a bicycle. The prompt is odd enough that models are unlikely to have matching examples in their training data, so it tests creative ability rather than recall. He ran it against 16 models from OpenAI, Anthropic, Google (Gemini), and Meta (Llama on Cerebras), and the quality of the generated SVGs varied significantly. Some models produced surprisingly good results, while others struggled.


Storing Times for Human Events: Best Practices and Challenges

2024-12-12

This blog post covers best practices for storing event times on websites that list events. The author argues that storing only the UTC time loses crucial information, namely the user's original intent and the event's location. A better approach is to store the time the user intended along with the event location, then derive the UTC time from those two values. User error, international timezone adjustments, and the 2007 Microsoft Exchange DST update all illustrate why the intended time must be kept: if the timezone rules change or the original input was wrong, the UTC value can be recomputed. The author also recommends a clear, user-friendly interface that helps users set event times and locations accurately.
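The approach described above can be sketched in Python. This is only a minimal illustration, not the author's implementation: the `StoredEvent` class, its field names, and the sample event are hypothetical, and it assumes the event location has already been resolved to an IANA timezone name and that timezone data is available to the standard `zoneinfo` module.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


@dataclass(frozen=True)
class StoredEvent:
    """Hypothetical record: keeps what the user entered, not a UTC value."""
    name: str
    local_time: datetime  # naive wall-clock time exactly as the user typed it
    tz_name: str          # IANA timezone of the event location (assumed known)

    def utc_time(self) -> datetime:
        # Derive UTC on demand, so updated timezone rules are picked up
        # without rewriting the stored record.
        aware = self.local_time.replace(tzinfo=ZoneInfo(self.tz_name))
        return aware.astimezone(timezone.utc)


# Sample data for illustration only.
event = StoredEvent(
    name="Evening meetup",
    local_time=datetime(2024, 12, 16, 18, 0),
    tz_name="America/Los_Angeles",
)
print(event.utc_time().isoformat())  # 2024-12-17T02:00:00+00:00
```

Because the record holds the wall-clock time and the zone name, a later change to that zone's rules only changes the derived UTC value; the user's intent stays intact. A real system would likely cache the derived UTC time for sorting and querying and recompute it when timezone data is updated.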
