Critique of Misleading Benchmarks in Formal Methods

2025-05-22

A paper on applying formal methods to verify operating system code relies on misleading statistics. The author of this critique objects to the methodology of simply comparing "proof-to-code ratios", because that comparison ignores how complete and how complex the specifications are. The article argues that proof size grows approximately quadratically with specification size, so specification complexity matters far more than code size. Drawing on several verified systems, the author presents more comprehensive data covering code size, specification size, and proof size. The article also highlights the role of modularity in reducing verification costs, while noting that complex systems such as seL4 are difficult to modularize. The author concludes by calling on the research community to stop using the "proof-to-code ratio" metric, which the article regards as meaningless.
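To make the argument concrete, here is a minimal Python sketch of why a proof-to-code ratio can mislead when proof size scales roughly with the square of specification size. The two systems and all of their line counts are hypothetical, invented purely for illustration; they are not figures from the article or from any real verified system. The names `VerifiedSystem`, `proof_to_code_ratio`, and `quadratic_coefficient` are likewise assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedSystem:
    """Line counts for one verified system (all values here are hypothetical)."""
    name: str
    code_loc: int   # lines of implementation code
    spec_loc: int   # lines of specification
    proof_loc: int  # lines of proof


def proof_to_code_ratio(system: VerifiedSystem) -> float:
    """The metric the article criticizes: proof lines per line of code."""
    return system.proof_loc / system.code_loc


def quadratic_coefficient(system: VerifiedSystem) -> float:
    """Proof size normalized by spec size squared, assuming proof ~ k * spec^2."""
    return system.proof_loc / system.spec_loc ** 2


# Hypothetical systems: identical code size, very different specification size.
systems = [
    VerifiedSystem("system_a", code_loc=10_000, spec_loc=500, proof_loc=50_000),
    VerifiedSystem("system_b", code_loc=10_000, spec_loc=2_000, proof_loc=800_000),
]

for s in systems:
    print(
        f"{s.name}: proof/code = {proof_to_code_ratio(s):.1f}, "
        f"proof/spec^2 = {quadratic_coefficient(s):.2f}"
    )
```

In this made-up data the proof-to-code ratios differ by a factor of 16, which suggests one effort was far less efficient than the other, yet both systems have the same proof cost once specification size is accounted for under the assumed quadratic model. That is the kind of distortion the article attributes to the ratio.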

Development