The Stack Unwinding Conundrum in Perf

2025-01-31

Perf, a powerful performance analysis tool, profiles programs by using PMU counter overflow interrupts to capture the state of the running thread. Turning each sample into a call stack is the hard part. Optimizing compilers typically omit frame pointers by default, so a backtrace cannot simply follow a chain of saved frame pointers. Recompiling with -fno-omit-frame-pointer is possible, but rebuilding everything is expensive and can lead to incompatibilities with system libraries. DWARF unwind information offers an alternative, but its complexity and performance overhead are substantial, which led Linus Torvalds to reject its use for stack unwinding inside the kernel. Perf therefore settles on a compromise: at each sample, only the top portion of the stack is copied out, and the unwinding itself is done in userspace. The copied portion is capped at 65,528 bytes, so frames deeper than that cannot be recovered, but the approach balances performance and practicality.
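
The 65,528-byte figure looks arbitrary until it is worked out. The sketch below assumes the cap comes from two constraints: the dump size must fit in a 16-bit field, and it must be a multiple of 8 bytes. The names `MAX_STACK_DUMP` and `validate_dump_size` are hypothetical and exist only for this illustration; they are not perf's own identifiers.

```python
# Sketch of where the 65,528-byte cap plausibly comes from.
# Assumption: the stack dump size is stored in a 16-bit field and must be
# aligned to 8 bytes (the size of a u64). All names here are illustrative.

U16_MAX = 0xFFFF  # largest value a 16-bit unsigned field can hold (65,535)
ALIGN = 8         # assumed alignment of the dump size, in bytes


def round_down(value: int, align: int) -> int:
    """Largest multiple of `align` that is <= value."""
    return value - (value % align)


def round_up(value: int, align: int) -> int:
    """Smallest multiple of `align` that is >= value."""
    return -(-value // align) * align


# Largest 8-byte-aligned size that still fits in 16 bits.
MAX_STACK_DUMP = round_down(U16_MAX, ALIGN)


def validate_dump_size(requested: int) -> int:
    """Align a requested dump size and reject it if it exceeds the cap."""
    size = round_up(requested, ALIGN)
    if size <= 0 or size > MAX_STACK_DUMP:
        raise ValueError(
            f"dump size must be in 1..{MAX_STACK_DUMP} bytes, got {requested}"
        )
    return size


if __name__ == "__main__":
    print(MAX_STACK_DUMP)            # 65528
    print(validate_dump_size(8192))  # 8192
    print(validate_dump_size(100))   # 104
```

On the command line, the dump size is passed as the second field of the call-graph option, for example `perf record --call-graph dwarf,65528` to request the maximum. Larger dumps recover deeper stacks at the cost of more data copied per sample.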