The Europeana Newspapers Project is improving access to millions of digitised historical newspapers from 22 European countries. Who is improving access to digitised newspapers from other parts of the world? For this series of articles we want to dive into newspaper archives. This time we interviewed Deborah Thomas from Chronicling America, the website hosted by the Library of Congress that gives you access to newspapers from the US.
Can you briefly describe Chronicling America and the National Digital Newspaper Program?
Chronicling America: Historic American Newspapers, 1836-1922 is a website built and hosted by the Library of Congress that provides open access to historic newspapers reflecting the development of American culture and communities. The site currently includes more than 7.8 million pages selected and digitized by state cultural heritage institutions across the United States. Chronicling America is sponsored by the National Digital Newspaper Program, a joint partnership between the Library of Congress (LC) and the U.S. National Endowment for the Humanities (NEH) to build a national database of selected historic newspaper content free and open to the public.
What newspaper content can you find in Chronicling America? Do you have a favorite article?
Currently the site provides access to about 7.8 million pages published in 33 states and the District of Columbia, but the site is updated with additional newspapers as we receive them from partners, so those numbers increase frequently. In the next few months we’ll be adding hundreds of thousands of pages from many of the existing partners as well as new partners in the states of Connecticut, Idaho and Mississippi and in Puerto Rico, a U.S. territory. (Users can see what’s been added recently via the RSS feed.) Eventually, we plan to include material from all U.S. states and territories. The digitized newspapers in Chronicling America were all published between 1836 and 1922 and are freely available to the public. In the last year we’ve also added numerous newspapers representing different immigrant and ethnic groups, including those from French, German, Italian and Spanish-speaking communities, and we plan to add more in the future. For a full list of what’s available now, users can see the All Digitized Newspapers list here.
In addition to the digitized newspapers, we include another important resource called the U.S. Newspaper Directory. The Directory contains descriptions of most newspapers ever published in the U.S., about 150,000 titles, including the 1400 or so available from Chronicling America currently. These searchable descriptions help researchers know where to go for more information if they can’t find the newspaper they want online.
Some of my favorite articles are the outrageous and hyperbolic feature articles describing scientific theories of the time such as “What’s the Matter with the Earth?” from the Washington Times (Washington, DC) in January 1907. Not only is it an interesting glimpse into the preoccupations of the time (i.e., recent destructive earthquakes in Kingston, Jamaica and San Francisco), but it also reveals something about the evolution of scientific thinking.
Searching Chronicling America newspapers couldn’t be easier. The simple search, available from every screen, gives users the ability to select a state for geographic coverage, a set of years if needed, and one or more keywords that can be found near each other (within 5 words). Results are then displayed in a grid of small page images with red highlights indicating where and how many results appear on each page. Pages with clusters of red highlights indicate an article that probably covers the subject of interest. In addition to the simple search Chronicling America provides an advanced search that can narrow down results right away. With the Advanced Search options users can select a specific location, title and/or date to search alone or with one or more keywords and Boolean operators. In addition to single criteria, each search facet can also have multiple values (using the Ctrl key to select multiple criteria). These limits would allow a user to search, for example, for South Carolina and Pennsylvania newspapers published between 12 April 1861 and 25 April 1861 that mention the phrase “Fort Sumter.” Such a search would quickly provide interesting page results representing very different reactions (North and South) to the events that began the American Civil War.
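As an illustration only, the kind of filtering described above (state, date range, and keywords near each other) can be sketched in a few lines of Python. This is not the site's actual code: the `Page` class, the `near_each_other` and `search` functions, and the reading of "within 5 words" as a position difference of at most 5 are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from datetime import date
from itertools import product


@dataclass
class Page:
    """Hypothetical stand-in for one digitized newspaper page."""
    state: str
    issued: date
    text: str


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def near_each_other(text, terms, max_distance=5):
    """True if every term occurs and some set of occurrences spans
    at most max_distance word positions (assumed meaning of 'within 5 words')."""
    tokens = tokenize(text)
    positions = [
        [i for i, tok in enumerate(tokens) if tok == term.lower()]
        for term in terms
    ]
    if not positions or any(not p for p in positions):
        return False
    # Try one occurrence of each term and measure the span they cover.
    return any(max(c) - min(c) <= max_distance for c in product(*positions))


def search(pages, terms=(), states=None, start=None, end=None):
    """Apply the optional state, date and keyword limits in turn."""
    results = []
    for page in pages:
        if states and page.state not in states:
            continue
        if start and page.issued < start:
            continue
        if end and page.issued > end:
            continue
        if terms and not near_each_other(page.text, terms):
            continue
        results.append(page)
    return results
```

With a list of `Page` objects, the Fort Sumter example would correspond to calling `search(pages, ["fort", "sumter"], states={"South Carolina", "Pennsylvania"}, start=date(1861, 4, 12), end=date(1861, 4, 25))`. A real search service would use a full-text index rather than scanning every page, so this only shows the logic of combining the limits.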
How does Chronicling America exploit its newspaper content?
The members of the NDNP (LC, NEH and our state partners) use many different channels to connect users to the wonderful material found in Chronicling America. Both the Library and the NEH post regular blog articles that draw on interesting news and vignettes from Chronicling America, reaching a wide variety of users, including education communities, genealogists, and casual readers. Many of our state participants do the same, combining the Chronicling America newspaper content with content from their other online historical collections to delve into historical events and themes.
Some state partners attend local and regional conferences, presenting papers and displays describing their involvement and the overall program. They participate in state history fairs, exhibitions and “road show” tours, showcasing the local and regional history to be found in newspapers published across the U.S., not just their own state’s papers. Both the Library and NEH sponsor educational programs, as do many of our partners, that include newspapers from Chronicling America as part of the lesson plans and primary source guides available to teachers and students. The NEH sponsors special prizes in the annual U.S. National History Day competition for students who use Chronicling America as a primary resource. National History Day is a year-long academic program focused on historical research for primary and secondary-level students, with more than 500,000 students participating annually. In addition, our more than 150 “Recommended Topic” guides help users explore the newspapers, providing terms and dates for particular subjects and links to relevant articles (e.g., the sinking of the RMS Titanic, events of World War I, or the rise of “flapper” culture in the 1920s).
From a technical point of view, we have designed the Chronicling America software to expose the data for indexing and harvesting externally to the LC interface. This allows search engines and other kinds of users to take advantage of the data itself and incorporate it into their own products. We also have available Web-based bulk downloads for certain parts of the content. Opening the site this way allows the newspapers to appear in search engine results, specialized historical content datasets, interactive tools, and other humanities research providing new analyses of historical events and trends.
Could you say something about the users of Chronicling America?
Based on self-identified survey responders and what we find repurposed on the Web, it appears our users are a combination of genealogists, academic researchers, and local historians, along with teachers, students and life-long learners. Family names and genealogical terms (such as “obituary”) are frequently the top search terms used on the site. Other common searches focus on historic people or things (such as “Butch Cassidy” or “League of Nations”), or even simple cultural terms used often in a particular period (“secession” or “base ball”[sic]). And, as with today’s news, our ongoing fascination with crime, disasters and sports is well-represented in Chronicling America search interests too.
In online magazines, personal blogs, Twitter feeds, Flickr sets, Pinterest projects, and elsewhere, we find many types of users using different tools to reveal interesting bits of history found in the newspapers. Some users focus on specific types of information to be found in the newspapers and re-use what they find in their own way: from a New York blogger using the site to identify word origins (when certain terms came into use) to a local Washington, DC historian re-using illustrations from the papers of buildings that no longer exist to create image sets in Flickr and combining them with Google Maps to “mash-up” a novel overview of urban architecture.
How do researchers use Chronicling America?
As I mentioned earlier, there are so many wide-ranging uses of the content found in Chronicling America. The combination of breadth and depth of news coverage made available in the site plus full-text searchability has allowed scholars and researchers to exploit what they find in the collection as never before. They are able to refine their research techniques in both scope and efficiency, identifying more and more details of history with less effort. In addition, providing access to large amounts of data via machine harvesting and analysis has encouraged some researchers (historians, literature specialists) to work across disciplines (with computer scientists, linguists, epidemiologists) to explore historic events. Examples include an interactive visualization of the progression of newspaper publishing across the U.S., the use of data-mining and sentiment analysis to identify the effects of the press on the spread of disease during the 1918 influenza epidemic, and other kinds of analysis that reveal how influential articles were transmitted “virally” across the U.S. through newspapers. Here we listed the links to these projects and more.
Which barriers does Chronicling America have to address when enabling access to digitised newspapers? And how do you address them?
Many of the challenges to making digitized historical newspapers available are also the benefits of making digitized historical newspapers available. By nature, newspapers are prolific, variant in subject and style and play different roles in the communities that generated them. In the U.S. there are many more newspapers in the historical record than anyone is ever likely to be able to provide access to digitally. Determining how to select from what is significant and what is available is a critical aspect of building the resource. In Chronicling America, this selection is left to those who know the community history best, the state organizations that maintain the physical material. (For this reason, each digitized newspaper is accompanied by a brief essay describing its historical significance to the place and time that produced it.) Newspapers are also complex information objects in how the content is presented, arranged, augmented and published. This makes them extremely useful for many disciplines of study and purposes. However, it can be a challenge to design an online experience that both incorporates masses of documents and allows users to understand meaningful variations in arrangement, organization or publishing history (e.g., multiple editions, sections, or publishing frequencies, as well as unusual events).
Then there are the physical traits of the original material, either in paper or, as is more often the case in the U.S., in microfilm. Printed on poor quality paper with speed of production and distribution paramount, newspapers were often meant to spread information fast and for profit, putting the finer points of reproduction quality aside. Saving copies for the historic record was mainly an afterthought. Ink coverage could be incomplete, typographic equipment endured impressive wear and tear that resulted in broken type, and rushed printers made mistakes in layout or arrangement. In addition, once distributed, newspapers were, if kept at all, often stored in basements or attics, over-exposed to light or damp, causing the paper quality to degrade rapidly. Compounding this, once their value to the historic record became apparent and preservation copies were made, the early decades of microfilming often saw poor filming techniques, under- or over-exposed film, etc. All of these factors can reduce our ability to make use of the content today and diminish readability of the digital image and accuracy of the searchable text. Our presentation in Chronicling America attempts to compensate for these challenges by providing viewers with visual search results and full-page images at high resolution that can be magnified for additional readability. Further refining of the content is left to the user, who can impose additional search limits or extract the material for use outside the Web site.
What is the value of international collaboration in enabling access to digitised newspapers?
While historic newspapers may have been published for a local community, they often included articles about events, people and places around the world, encouraging research interests beyond national borders. The more institutions involved in making newspapers available are able to collaborate, through open access, common principles, and shared experiences, the more researchers will benefit.
What are the plans for the future?
More newspapers, more state partners, more access!