A Linux Kernel Thread Lifecycle Gotcha: The Case of the Randomly Dying Chromium Process
While optimizing startup latency for Recall.ai's Output Media, an engineer hit a perplexing bug: the Chromium process would randomly terminate after launch. The root cause was traced to Bubblewrap's `--die-with-parent` flag, which relies on the Linux kernel's `PR_SET_PDEATHSIG` mechanism. With the flag set, the child process receives SIGKILL when its parent thread terminates, not when its parent process does. Chromium was launched from a thread managed by Tokio, so when Tokio reaped that thread, the kernel killed Chromium even though the parent process was still running. Removing the flag fixed the issue and exposed a little-known quirk of the Linux kernel: parent-death signals follow the lifetime of the spawning thread, so thread lifecycles need care whenever they interact with process isolation.
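
The thread-versus-process distinction can be reproduced without Bubblewrap or Tokio. The following is a minimal sketch, assuming Linux and Python 3; it is not the code from the incident. A short-lived Python thread stands in for the Tokio-managed thread, and a small child script stands in for a sandboxed Chromium, arming `PR_SET_PDEATHSIG` with SIGKILL itself. The names `CHILD_CODE`, `spawn_and_exit`, and `run_demo` are hypothetical.

```python
import signal
import subprocess
import sys
import threading

# Child program (a stand-in for the sandboxed process): arm the
# parent-death signal, report readiness, then sleep longer than the demo.
CHILD_CODE = """
import ctypes, signal, time
libc = ctypes.CDLL(None, use_errno=True)
# PR_SET_PDEATHSIG is option 1 in <linux/prctl.h>.
rc = libc.prctl(1, ctypes.c_ulong(signal.SIGKILL))
assert rc == 0, ctypes.get_errno()
print("armed", flush=True)
time.sleep(60)
"""


def spawn_and_exit(result):
    # Runs on a short-lived thread, standing in for a runtime-managed thread.
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        text=True,
    )
    # Wait until the child has armed the signal, so the demo has no race.
    assert proc.stdout.readline().strip() == "armed"
    result.append(proc)
    # Returning ends this thread only; the parent process keeps running.


def run_demo():
    result = []
    worker = threading.Thread(target=spawn_and_exit, args=(result,))
    worker.start()
    worker.join()
    proc = result[0]
    try:
        # The main thread is still alive, yet the child gets killed.
        return proc.wait(timeout=10)
    finally:
        if proc.poll() is None:
            proc.kill()  # clean up if the child unexpectedly survived
        proc.stdout.close()


if __name__ == "__main__":
    print("child return code:", run_demo())
```

On Linux the return code should be the negative of SIGKILL, showing that the child was killed while the process that spawned it was still running. Spawning the child from a thread that lives as long as the process, or leaving the parent-death signal unset as in the fix described above, avoids the kill.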