A Nasty Postgres Bug in Logical Replication Slot Creation, and How We Fixed It
2025-07-15

The ClickPipes team encountered a perplexing bug while creating logical replication slots in PostgreSQL: a query that should have taken seconds was taking hours and couldn't be terminated. Investigation revealed a Postgres bug: on read replicas, creating a logical replication slot could get stuck in a long sleep loop while waiting for transactions on the primary to finish, and because that loop never checked for interrupts, the query could not be cancelled. The team submitted a patch to the Postgres community that adds an interrupt check to the loop, resolving the issue. This case highlights how even mature database systems can harbor unexpected edge cases, and the vital role of open-source community collaboration in resolving them.
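To make the failure mode concrete, here is a minimal Python sketch of a polling wait loop with and without an interrupt check. It is not Postgres source code: `wait_for_transactions`, `QueryCancelled`, and the `check_interrupts` flag are hypothetical names chosen for illustration, and a `threading.Event` stands in for a pending cancel request.

```python
import threading
import time


class QueryCancelled(Exception):
    """Raised when the wait loop notices a pending cancel request."""


def wait_for_transactions(is_finished, cancel_requested,
                          poll_interval=0.01, check_interrupts=True):
    """Poll until is_finished() returns True and return the number of sleeps.

    Hypothetical stand-in for a backend waiting on older transactions.
    With check_interrupts=False the loop ignores cancel requests, which
    mirrors the stuck behaviour described above. With check_interrupts=True
    it looks at the cancel flag on every iteration and bails out promptly.
    """
    polls = 0
    while not is_finished():
        if check_interrupts and cancel_requested.is_set():
            raise QueryCancelled(f"cancelled after {polls} polls")
        time.sleep(poll_interval)
        polls += 1
    return polls


if __name__ == "__main__":
    cancel = threading.Event()
    cancel.set()  # the user has already asked to cancel the query

    try:
        # The transactions never finish, but the cancel request is honoured.
        wait_for_transactions(lambda: False, cancel)
    except QueryCancelled as exc:
        print("interruptible loop:", exc)
```

Without the check, the same call with `lambda: False` would never return, no matter how often cancellation was requested. The only exit from such a loop is the condition it is waiting for, which is why the original query ran for hours.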
Development
Logical Replication