Tenstorrent: Stop Stacking Abstractions, Focus on Core AI Compute

2025-05-25

This post sharply criticizes Tenstorrent's AI compute architecture, arguing that its over-reliance on abstraction layers such as LLK (its low-level kernel library) leads to inefficiency and prevents it from competing with giants like Nvidia. The author advises Tenstorrent to focus on three core modules: a frontend (PyTorch, ONNX, etc.), a compiler (MLIR/LLVM, etc.), and a runtime. The runtime should be hardware-agnostic, while the compiler should concentrate on memory placement, op scheduling, and kernel fusion, rather than special-casing rarely used activation functions like ELU. The author's conclusion is that only by simplifying the architecture and improving the performance of these core components can Tenstorrent succeed in the AI compute field.
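To illustrate why kernel fusion is one of the compiler duties singled out above, here is a minimal sketch (not Tenstorrent's API, just a generic NumPy analogy) of fusing a matmul, bias-add, and ReLU: the unfused version materializes two intermediate tensors in memory, while a fused kernel would keep the result in local registers/SRAM through all three ops.

```python
import numpy as np

def unfused(x, w, b):
    # Three separate "kernels": each step writes an
    # intermediate tensor back to memory.
    y = x @ w           # intermediate 1
    y = y + b           # intermediate 2
    return np.maximum(y, 0.0)

def fused(x, w, b):
    # What a fusing compiler would emit conceptually:
    # one pass, no intermediates round-tripped through DRAM.
    return np.maximum(x @ w + b, 0.0)

x = np.random.rand(4, 8)
w = np.random.rand(8, 3)
b = np.random.rand(3)
assert np.allclose(unfused(x, w, b), fused(x, w, b))
```

The two functions compute the same result; the payoff of fusion is purely in memory traffic and kernel-launch overhead, which is exactly the kind of optimization the post argues the compiler layer should own.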

Hardware AI compute