Bayes, Bits & Brains: A Probability and Information Theory Adventure

2025-09-01

This website delves into probability and information theory, showing how they illuminate machine learning and the world around us. Intriguing riddles, such as predicting the next letter in a Wikipedia snippet and comparing your performance to that of a neural network, lead into explorations of information content, KL divergence, entropy, cross-entropy, and more. The course also covers maximum likelihood estimation, the maximum entropy principle, logits, softmax, Gaussian distributions, and how loss functions are set up, ultimately revealing the connection between compression algorithms and large language models. Ready to dive down the rabbit hole?
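
As a small taste of what's ahead, here is a minimal Python sketch of two of the ideas named above, softmax and cross-entropy. All logits and letters here are illustrative placeholders, not material from the course itself:

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the next letter being 'a', 'b', or 'c'.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)

# Cross-entropy if the true next letter is 'a' (index 0):
# the fewer bits needed to encode the truth, the better the prediction.
true_index = 0
loss_bits = -math.log2(probs[true_index])
print(f"P(next letter): {[round(p, 3) for p in probs]}")
print(f"Cross-entropy: {loss_bits:.3f} bits")
```

The same bits-per-symbol score works for a human guesser and a neural network alike, which is what makes the next-letter riddle a fair comparison.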

AI