One Year Debugging Sleep-Wake Hangs on Linux with AMD GPUs
2025-02-17
The author encountered a persistent issue where their Linux system, equipped with an AMD RX 570 GPU, would crash or hang after attempting to sleep, often resulting in a black screen upon waking. After over a year of intense debugging, involving journal analysis, systemd configuration tweaks, a debug shell, even Ghidra reverse engineering, the root cause was identified as an amdgpu driver bug related to VRAM backup during high memory usage. The solution, finally implemented, leverages the power management notifier API to preemptively back up VRAM before sleep, preventing memory exhaustion errors. This fix is expected in the stable Linux kernel 6.14 release.
Development
GPU driver