Nominations sought for the U.S. Federal Government Domain End of Term 2020 Web Archive

This is a guest blog post by Abbie Grotke, Assistant Head, Digital Content Management Section

Washington, D.C. – Crowd in front of Capitol – Cleveland’s 2nd inauguration. Washington, D.C., 1893. Photograph. https://www.loc.gov/item/00650948/.

You may have noticed that it is presidential election season in the United States, which means it’s also time for web archivists to gather once again to archive United States Federal Government websites during the end of the presidential term. Since 2008, the Library of Congress has participated in a collaborative project to document changes in government websites from one term to the next.

To see the results of our 2008, 2012, and 2016 efforts, and to learn more about what we’re doing in 2020, visit https://end-of-term.github.io/eotarchive/ and follow us on Twitter @eotarchive.

While the project team has access to a number of bulk lists of government domains that help create the list of URLs that we will archive, the public is invited (and encouraged!) to identify priority content by nominating specific URLs, whether they are websites, documents, or datasets. Crawlers don’t always archive as deeply into a website as you might imagine, so even if we already know about a particular government domain, it is still important for you to let us know about specific pages or documents so we don’t miss them. Any federal government website, including governmental social media accounts, is within scope for the collection. Election-related content, such as campaign websites or news websites discussing the election and end of term, is out of scope for this project.

Do you know of content that we should be sure to archive? Please submit your URLs here: https://digital2.library.unt.edu/nomination/eth2020/. The first crawls began in early October, but we’ll continue to crawl through, and just after, Inauguration Day in 2021.

Help us preserve the .gov domain for posterity, public access, and long-term preservation. Only YOU can help prevent…link rot!

Gina Jones and 20 Years of Web Archiving at the Library of Congress

Today’s guest blog post is from Gina Jones and Abbie Grotke, both of the Web Archiving Team. As a part of our series looking back at some of the people and stories around our 20th Anniversary of Web Archiving, I wanted to share with you an interview with a person who has been working on […]

In a Web Archives Frame of Mind: Improving Access and Describing the Collections

This is a guest post by Lauren Baker, a Librarian-in-Residence on the Library of Congress Web Archiving Team (a part of the Digital Collections Management & Services Division). The Librarians-in-Residence Program offers early career librarians an opportunity to contribute to Library projects while learning from professionals in the field. In 2018, the Library of Congress […]

Introducing the Computing Cultural Heritage in the Cloud Project

With support from the Andrew W. Mellon Foundation, the LC Labs team will pilot ways to combine cutting-edge technology with the collections of the largest library in the world to support creative new uses of those collections. This project will explore service models to support researchers accessing Library of Congress collections in the cloud, with findings shared throughout the two-year project.

In the Library’s Web Archives: 1,000 U.S. Government PowerPoint Slide Decks

The Digital Content Management section has been working to extract and make available sets of files from the Library’s significant Web Archives holdings. The outcome of the project is a series of web archive file datasets, each containing 1,000 files of related media types selected from .gov domains. You can read more about this series […]

In the Library’s Web Archives: Dig If You Will the Pictures

The Digital Content Management section has been working on a project to extract and make available sets of files from the Library’s significant Web Archives holdings. This is another step to explore the Web Archives and make them more widely accessible and usable. Our aim in creating these sets is to identify reusable, “real world” […]

In the Library’s Web Archives: Totally Tabular Data

The Digital Content Management section has been working on a project to extract and make available sets of files from the Library’s significant Web Archives holdings. This is another step to explore the Web Archives and make them more widely accessible and usable. Our aim in creating these sets is to identify reusable, “real world” content in the Library’s […]

In the Library’s Web Archives: US Government Audio on Shuffle

The Digital Content Management section has been working on a project to extract and make available sets of files from the Library’s significant web archives holdings. This is another step to explore the web archives and make them more widely accessible and usable. Our aim in creating these sets is to identify reusable, “real world” content in the Library’s […]

In the Library’s Web Archives: Sorting through a Set of US Government PDFs

The Digital Content Management section has been working on a project to extract and make available sets of files from the Library’s significant web archives holdings. This is another step to explore the web archives and make them more widely accessible and usable. Our aim in creating these sets is to identify reusable, “real world” content in the Library’s […]

The Library of Congress Web Archives: Dipping a Toe in a Lake of Data

Today’s guest post is from Chase Dooley and Grace Thomas, Digital Collections Specialists on the Library of Congress Web Archiving Team. Over the last two decades, the Library of Congress Web Archiving Program has acquired and made available over 16,000 web archives, as part of more than 114 event and thematic collections. Each Web Archive […]