Spatially Enabling a Millennial Global City Population Dataset

2025-06-18

This paper describes the creation of a global city population dataset that integrates the work of Chandler and Modelski and spans 3700 BC to AD 2000. The original data existed in print books and disparate digital formats, which made both digitization and spatialization (geocoding) difficult. OCR attempts failed because of font and page-quality problems, so the data had to be transcribed manually. Geocoding drew on CartoDB, GeoNames, the Ancient Locations database, and the Getty Thesaurus of Geographic Names, and manual verification of the results proved crucial for accuracy. The final dataset contains 1599 city locations with broad global and temporal coverage. Three limitations remain: sparse data, ambiguous definitions of what constitutes a city, and uncertain locations for ancient cities. Even so, the digitized and spatialized dataset gives researchers such as historians, geographers, and ecologists readily accessible data for analyzing global urbanization trends.
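
The geocode-then-verify workflow described above might be sketched as follows. This is an illustration only, not the authors' pipeline: the `Place` record, the three-entry `GAZETTEER`, and the `normalize` and `geocode` helpers are all hypothetical, and the coordinates are rounded, illustrative values rather than entries from the dataset. The sketch matches a transcribed city name against a gazetteer and labels each result so that ambiguous or missing matches can be routed to manual review.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    country: str
    lat: float
    lon: float


# Hypothetical miniature gazetteer with rounded, illustrative coordinates.
# A real workflow would query sources such as GeoNames instead.
GAZETTEER = [
    Place("Rome", "Italy", 41.9, 12.5),
    Place("Alexandria", "Egypt", 31.2, 29.9),
    Place("Alexandria", "United States", 38.8, -77.0),
]


def normalize(name):
    """Strip accents, case, and extra whitespace so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def geocode(city, gazetteer=GAZETTEER):
    """Return (status, candidates) for a transcribed city name.

    status is "matched" for exactly one candidate, "ambiguous" for several,
    and "unmatched" for none. Only "matched" results could be accepted
    automatically; the other two need a human decision.
    """
    key = normalize(city)
    candidates = [p for p in gazetteer if normalize(p.name) == key]
    if len(candidates) == 1:
        status = "matched"
    elif candidates:
        status = "ambiguous"
    else:
        status = "unmatched"
    return status, candidates


if __name__ == "__main__":
    for city in ["  ROME ", "Alexandria", "Uruk"]:
        status, candidates = geocode(city)
        print(f"{city.strip()}: {status} ({len(candidates)} candidate(s))")
```

Returning every candidate rather than silently picking the first one keeps the ambiguity visible, which is the point at which the manual verification step would take over.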