Marginalia Search Engine Upgrades: Online Status and Ownership Change Detection
2025-06-19
The Marginalia Search Engine team has implemented a new system, 'ping-process', that detects whether servers are online and flags significant website changes such as ownership transfers and domain parking. It relies primarily on HTTP HEAD requests and DNS queries, and analyzes certificate details, security posture, and server headers to identify changes. Data is stored in 'snapshot' and 'event' tables: the former holds the current information for each domain, the latter a history of events. After working through challenges with scheduling and certificate validation, the team reports early success in identifying parked domains. Future plans include refining the ownership change detection model and integrating it into crawler strategies for improved efficiency.
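
To make the snapshot/event split concrete, here is a minimal Python sketch of the general idea. It is not Marginalia's actual code or schema: the `Snapshot` and `Event` types, their field names, and the `probe` and `diff_snapshots` helpers are all hypothetical, chosen only to illustrate probing a host with a HEAD request and recording what changed between two observations.

```python
import hashlib
import http.client
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical 'current state' record for one domain."""
    domain: str
    online: bool
    server_header: Optional[str] = None
    cert_fingerprint: Optional[str] = None


@dataclass(frozen=True)
class Event:
    """Hypothetical historical record: one field changed between snapshots."""
    domain: str
    field: str
    old: object
    new: object


def probe(domain: str, timeout: float = 5.0) -> Snapshot:
    """Send one HTTPS HEAD request and summarize what was observed.

    Assumption: any connection, TLS, or protocol error is treated as
    'offline'. A real system would likely distinguish these cases.
    """
    conn = http.client.HTTPSConnection(domain, timeout=timeout)
    try:
        conn.connect()
        # Read the certificate before the request, since the socket may be
        # released once the response has been fetched.
        der = conn.sock.getpeercert(binary_form=True)
        fingerprint = hashlib.sha256(der).hexdigest() if der else None
        conn.request("HEAD", "/")
        response = conn.getresponse()
        return Snapshot(domain, True, response.getheader("Server"), fingerprint)
    except (OSError, http.client.HTTPException):
        return Snapshot(domain, False)
    finally:
        conn.close()


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Event]:
    """Return one Event per field that differs between two snapshots."""
    if old.domain != new.domain:
        raise ValueError("snapshots must describe the same domain")
    old_fields, new_fields = asdict(old), asdict(new)
    return [
        Event(old.domain, name, old_fields[name], new_fields[name])
        for name in old_fields
        if name != "domain" and old_fields[name] != new_fields[name]
    ]
```

In this sketch, the latest `Snapshot` would overwrite the stored row for a domain, while each `Event` returned by `diff_snapshots` would be appended to a history table. Deciding whether a given set of changes amounts to an ownership transfer or parking is a separate modeling step that this example does not attempt.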