Researchers are increasingly aware that content published on the web can change or disappear at any time. This guide identifies the leading archives of web content and provides context about these projects, and about web archiving in general, so that researchers can make informed use of the available resources.
The Internet Archive's Wayback Machine is the largest archive of the World Wide Web, covering more than 279 billion pages, dating back as far as 1996. Pages in the Wayback Machine are captured repeatedly over time. Many pages are available in hundreds or thousands of versions, corresponding to the content of the page on different dates.
The Internet Archive is developing a search engine through a grant-funded project, and a beta version of that search is currently available. It is also possible to access archived sites directly if you know the site's URL.
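For researchers who want to reach archived pages programmatically, the sketch below builds two kinds of addresses from a known URL. It assumes the Wayback Machine's publicly visible address pattern (`https://web.archive.org/web/<timestamp>/<url>`) and its availability lookup endpoint (`https://archive.org/wayback/available`). The function names `wayback_url` and `availability_query` are illustrative, not part of any official library, and the code only constructs addresses without contacting the archive.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed address patterns; check the Internet Archive's own
# documentation before relying on them.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def wayback_url(page_url: str, on: date) -> str:
    """Build an address for the capture of page_url closest to a date.

    The timestamp is written as YYYYMMDD. The archive is expected to
    redirect to the nearest capture it actually holds.
    """
    timestamp = on.strftime("%Y%m%d")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


def availability_query(page_url: str, on: date) -> str:
    """Build a lookup address asking whether a capture exists near a date."""
    params = urlencode({"url": page_url, "timestamp": on.strftime("%Y%m%d")})
    return f"{AVAILABILITY_ENDPOINT}?{params}"
```

For example, `wayback_url("example.com", date(2006, 1, 15))` produces `https://web.archive.org/web/20060115/example.com`, which can be pasted into a browser. Because pages are captured repeatedly, the date you request and the date of the capture you receive may differ.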
Save Page Now
It's easy to miss, but the Wayback Machine allows users to archive a page on demand. Look in the bottom right corner of its homepage or, on a device with a small screen, scroll to the bottom of the page. If you enter a URL there, the page will be captured immediately, provided the site owner allows automated archiving.
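The same request can also be expressed as a web address. The sketch below assumes the commonly used `https://web.archive.org/save/<url>` pattern. The helper name `save_page_now_url` is hypothetical, and the choice to default to `http://` when no scheme is given is an assumption of this example. The code only builds the address; opening it in a browser is what would trigger a capture, subject to the same site-owner restrictions described above.

```python
# Assumed address pattern for Save Page Now.
SAVE_PREFIX = "https://web.archive.org/save"


def save_page_now_url(page_url: str) -> str:
    """Build an address that asks the Wayback Machine to capture page_url."""
    page_url = page_url.strip()
    if not page_url:
        raise ValueError("page_url must not be empty")
    # Add a scheme if the caller left it off (an assumption of this sketch).
    if "://" not in page_url:
        page_url = "http://" + page_url
    return f"{SAVE_PREFIX}/{page_url}"
```

Calling `save_page_now_url("example.com/report.html")` yields `https://web.archive.org/save/http://example.com/report.html`.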
You may find that a page you are interested in is not available in the Wayback Machine, or that the content seems incomplete. There are a few reasons why this might be the case.