AI Inference Costs: Not as Expensive as You Think

2025-08-28

This article challenges the narrative that AI inference is prohibitively expensive and unsustainable. By calculating the costs of running AI inference on H100 GPUs, the author demonstrates that input processing is incredibly cheap (fractions of a cent per million tokens), while output generation is significantly more expensive (dollars per million tokens). This cost asymmetry explains the profitability of some applications (like coding assistants) and the high cost of others (like video generation). The author argues that this cost disparity is often overlooked, leading to an overestimation of AI inference costs, which may benefit incumbents and stifle competition and innovation.
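The article's conclusion follows from a simple back-of-the-envelope calculation. A minimal sketch of that style of estimate is below; the node price and throughput figures are illustrative assumptions chosen here, not the author's numbers, so treat the outputs as orders of magnitude only.

```python
# Back-of-the-envelope estimate of input vs. output token costs.
# All figures are illustrative assumptions: an 8xH100 node rented at ~$16/hour,
# aggregate prefill (input) throughput of ~2,000,000 tokens/s across batched
# requests, and aggregate decode (output) throughput of ~3,000 tokens/s.

NODE_COST_PER_HOUR = 16.00            # assumed 8xH100 rental price, USD/hour
PREFILL_TOKENS_PER_SEC = 2_000_000    # assumed input-processing throughput
DECODE_TOKENS_PER_SEC = 3_000         # assumed output-generation throughput


def cost_per_million_tokens(tokens_per_sec: float, cost_per_hour: float) -> float:
    """USD cost to process one million tokens at the given throughput."""
    seconds_per_million = 1_000_000 / tokens_per_sec
    return cost_per_hour * seconds_per_million / 3600


input_cost = cost_per_million_tokens(PREFILL_TOKENS_PER_SEC, NODE_COST_PER_HOUR)
output_cost = cost_per_million_tokens(DECODE_TOKENS_PER_SEC, NODE_COST_PER_HOUR)

print(f"Input:  ${input_cost:.4f} per million tokens")   # roughly $0.002
print(f"Output: ${output_cost:.2f} per million tokens")  # roughly $1.48
```

The asymmetry falls out of the workload shape: prefill processes all input tokens in parallel and is largely compute-bound, while decode emits output tokens one at a time per sequence and is limited by memory bandwidth, so each output token occupies the hardware for far longer than each input token.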
