BQN Matrix Multiplication Performance Optimization: Cache Blocking and Divide and Conquer

2025-06-27

This article explores how to speed up large matrix multiplication in BQN. The author starts by partitioning the matrices into square blocks to make better use of the cache, which yields a speedup of about 6x. The author then introduces Strassen's algorithm, a divide-and-conquer method, and shows experimentally that it reaches up to a 9x speedup on large matrices. The article also compares the effect of different block sizes and nested tiling strategies, and concludes that a pure, single-threaded BQN implementation has essentially reached its performance limit.
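The square-partitioning idea can be illustrated with a minimal sketch. It is written in Python rather than BQN and is not the author's code: the function names `matmul_naive` and `matmul_blocked` and the default block size of 64 are assumptions made for illustration. It only shows the loop structure of cache blocking; pure Python will not reproduce the speedups reported in the article.

```python
def matmul_naive(a, b):
    """Reference triple loop: C = A x B for lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            aik = a[i][k]
            for j in range(p):
                c[i][j] += aik * b[k][j]
    return c


def matmul_blocked(a, b, block=64):
    """Same product, computed one square tile at a time.

    `block` is a hypothetical tuning parameter; a good value depends on
    the cache sizes of the machine and has to be found by measurement.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # The outer three loops walk over tiles; the inner three loops
    # multiply one tile of A by one tile of B and accumulate into C.
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                for i in range(i0, min(i0 + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + block, m)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(j0, min(j0 + block, p)):
                            row_c[j] += aik * row_b[j]
    return c
```

Both functions return the same matrix. The blocked version only reorders the work so that each tile is reused while it is still likely to be in cache, and the `min(...)` bounds handle matrices whose dimensions are not multiples of the block size.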

Development

A Concise Scheme Interpreter in BQN: A Minimalist Approach

2025-05-26

This article describes an attempt to implement a Scheme interpreter in the BQN programming language. Using BQN's concise syntax and array operations, the author builds a working interpreter for a subset of Scheme that covers basic arithmetic, list manipulation, and metaprogramming. The implementation is not fully R5RS compliant and lacks robust error handling, but it is notably brief for what it does. It demonstrates BQN applied to interpreter construction and highlights the elegance of functional programming.

Development