16TB Archive of US Federal Public Datasets Released

2025-02-07

Harvard Law School researchers have released a 16TB archive of more than 311,000 datasets, a complete copy of data.gov captured in 2024 and 2025. To preserve the integrity and authenticity of the data, the project maintains detailed metadata and digital signatures, which also make the datasets easier for researchers and the public to cite and access over time. The team has also released open-source software and documentation so that others can replicate the work and build similar repositories. The project is supported by the Filecoin Foundation and the Rockefeller Brothers Fund.
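As a rough illustration of how stored metadata can help confirm that a downloaded dataset is unaltered, the sketch below compares a file's SHA-256 checksum against a value recorded in a metadata record. This is a simpler stand-in for a full digital signature, and it is not the archive's actual tooling: the `sha256` metadata field and the function names are assumptions made for this example.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(path, metadata):
    """Check a file against the checksum stored in its metadata record.

    `metadata` is a hypothetical dict with a "sha256" key holding a hex digest;
    a real archive may use a different schema.
    """
    expected = metadata["sha256"].lower()
    actual = sha256_of_file(path)
    # compare_digest performs a constant-time comparison of the two hex strings
    return hmac.compare_digest(actual, expected)
```

A checksum only shows that the file matches the recorded value. Establishing who recorded that value requires a signature over the metadata itself, which this sketch does not attempt.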

Century-Scale Digital Storage: A Race Against Time

2024-12-14

This article explores the challenge of storing digital data for 100 years. Tracing the history from RAMAC, IBM's first computer equipped with a hard drive, to today's ubiquitous cloud storage, the author weighs the advantages and disadvantages of hard drives, cloud storage, removable media, and physical imprinting or printing. The article highlights the threats to long-term preservation: physical damage to hardware, software updates, institutional changes, and market fluctuations. Ultimately, the author argues that century-scale storage depends on a culture that values maintenance and preservation, and that safeguarding humanity's digital heritage will take a collective effort across all sectors of society.
