Watchdog Timers: A Necessary Evil (or Essential Good)?
This article explores the critical role of watchdog timers in embedded systems. The author draws on two examples: the failure of the Clementine spacecraft mission, attributed to a poorly implemented watchdog, and a kitchen exhaust fan that had to be rebooted. Both illustrate why a reliable watchdog timer is needed to recover from software failures. The article describes several watchdog designs, both internal and external, and offers strategies for making them highly reliable: windowed watchdogs, external watchdogs that are independent of the CPU, and monitoring the state of every task in a multitasking system. The author argues that even simple systems should include a watchdog timer, and advocates complementary techniques such as periodically resetting data structures to enhance reliability.
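As a rough illustration of two of the strategies named above (a windowed watchdog and per-task monitoring), the following is a minimal sketch in C. It is a host-side simulation, not the article's implementation: the names `wdt_monitor_t`, `wdt_task_checkin`, and `wdt_service`, the three-task count, and the 50 to 100 ms window are all hypothetical choices made for this example. On real hardware the kick would be a register write and an expired window would cause a reset rather than a return value.

```c
#include <assert.h>
#include <stdint.h>

/* All names and values below are hypothetical, chosen for illustration. */
#define NUM_TASKS      3u
#define ALL_TASKS_MASK ((1u << NUM_TASKS) - 1u)
#define WINDOW_MIN_MS  50u   /* kicking earlier than this is treated as a fault */
#define WINDOW_MAX_MS  100u  /* kicking later than this means the watchdog expired */

typedef enum { WDT_WAITING, WDT_KICKED, WDT_EXPIRED } wdt_status_t;

typedef struct {
    uint32_t checkins;      /* bit i is set once task i has reported in */
    uint32_t last_kick_ms;  /* time of the last successful kick */
} wdt_monitor_t;

void wdt_init(wdt_monitor_t *m, uint32_t now_ms)
{
    m->checkins = 0u;
    m->last_kick_ms = now_ms;
}

/* Each task calls this from its main loop to show it is still running. */
void wdt_task_checkin(wdt_monitor_t *m, unsigned task_id)
{
    assert(task_id < NUM_TASKS);
    m->checkins |= (1u << task_id);
}

/* Called periodically by a monitor task. The watchdog is kicked only when
 * the window is open AND every task has checked in since the last kick. */
wdt_status_t wdt_service(wdt_monitor_t *m, uint32_t now_ms)
{
    uint32_t elapsed = now_ms - m->last_kick_ms;  /* unsigned math survives wraparound */

    if (elapsed > WINDOW_MAX_MS) {
        return WDT_EXPIRED;   /* real hardware would have reset the CPU by now */
    }
    if (elapsed < WINDOW_MIN_MS) {
        return WDT_WAITING;   /* window not open yet: hold the kick */
    }
    if (m->checkins != ALL_TASKS_MASK) {
        return WDT_WAITING;   /* at least one task is silent: withhold the kick */
    }

    m->checkins = 0u;         /* require fresh check-ins for the next period */
    m->last_kick_ms = now_ms;
    return WDT_KICKED;
}
```

The design point is that no single task can keep the system alive on its own. If any one task hangs, its bit is never set, the kick is withheld, and the window eventually expires.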