“I’ve got government video of how to wash your hands or prep for nuclear war,” says Mark Graham, director of the Wayback Machine at the Internet Archive.
“We could easily make a list of .ppt files in all the websites from .mil, the Military Industrial PowerPoint Complex.”
And the immediate takeaway is that the scale of the Internet Archive today may be as hard to fathom as the scale of the Internet itself.
The archive also maintains a nearby warehouse for storing physical media—not just books, but things like vinyl records, too.
In total, Graham says the Internet Archive adds four petabytes of information per year (that's four million gigabytes, for context).
Graham shares that the Wayback Machine had, in fact, captured 835 instances of the Google homepage that day in January 2018.
To improve these kinds of efforts, Graham says the Wayback Machine has been subtly working on improving its user-facing tools.