The Europeana Newspapers Project Will…

1. Make Digital Newspapers Easier To Search

Refined newspaper: a newspaper separated into articles through techniques such as Optical Layout Recognition (OLR).

When newspapers are digitised, the result is often simply an image of each page. Such images cannot always be searched effectively for pictures, articles or individual terms within the text.

Europeana Newspapers aims to change that. It will create full-text versions of about 10 million newspaper pages. It will also detect and tag millions of individual articles with related metadata and named entities (information identifying people, locations and so on). Compared with earlier digital newspaper projects, this will dramatically improve the experience for users.

2. Put Digital Newspapers Within Everyone’s Reach

Brainstorming what a content browser might look like for digital newspapers.

Many of the newspaper pages assembled by Europeana Newspapers will be dedicated to the public domain. All titles will be freely searchable through The European Library (which is also creating a special content browser for the project’s newspaper content) and Europeana.

3. Create Tools That Help Experts To Assess Quality

Since the process for converting paper newspapers to digital versions is not 100% accurate, the quality of digitised newspapers must be continually assessed.

The Europeana Newspapers project will help by developing an evaluation and quality-assessment infrastructure for newspaper digitisation. It will establish accepted baselines for accuracy in relation to the level of detail, speed of digitisation and costs. This will in turn help experts to assess different methods of newspaper digitisation and pick the one that gives the best result.
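To make the idea of an accuracy baseline concrete, here is a minimal sketch of one common way to score OCR output: character accuracy derived from the edit distance between a hand-checked transcription and the OCR text. This is an illustration only, not the project's evaluation infrastructure; the function names and the sample strings are assumptions.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # drop a character from the reference
                current[j - 1] + 1,      # add a character from the hypothesis
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Share of reference characters recognised correctly, clamped to [0, 1]."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical sample: a hand-checked line and a noisy OCR reading of it.
ground_truth = "Europeana Newspapers"
ocr_output = "Europcana Newspapcrs"
print(f"Character accuracy: {character_accuracy(ground_truth, ocr_output):.2%}")
```

Scoring every digitisation method against the same hand-checked pages gives comparable figures that can then be weighed against speed and cost.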

4. Assemble An Overview Of Newspaper Digitisation In Europe

Our 2012 survey (PDF) aimed to identify and analyse all newspaper collections digitised by national, research and public libraries in Europe. It highlighted how difficult it is to make 20th-century content available, and showed that many libraries do not use any form of Optical Character Recognition (OCR) when they scan their newspaper content. The survey is being repeated in 2013 to give an even more complete picture.

5. Create Best-Practice Recommendations For Newspaper Metadata

We are working to design and release a comprehensive metadata model based on de facto standards such as METS and ALTO. Partners will share the model with stakeholders in order to reach a common agreement and to make it a best-practice example for newspaper digitisation in Europe.
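As an illustration of why ALTO is useful for full-text search, here is a minimal sketch that reads the recognised words out of an ALTO file using Python's standard library. The sample document is a heavily simplified assumption (it omits the coordinates and description blocks a real ALTO file carries and is not schema-valid), and it does not represent the project's metadata model. Element names are matched without their namespace because the namespace differs between ALTO versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified ALTO fragment used only for this example.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="Europeana"/>
            <SP/>
            <String CONTENT="Newspapers"/>
          </TextLine>
          <TextLine>
            <String CONTENT="full-text"/>
            <SP/>
            <String CONTENT="search"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(alto_xml: str) -> list[str]:
    """Return the text of each TextLine, built from its String elements."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        words = [
            child.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return lines


for line in alto_lines(SAMPLE_ALTO):
    print(line)
```

Because each word sits in its own `String` element, the same traversal could also collect positions or confidence values where a file provides them.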

6. Raise Awareness Through Workshops and Information Days

Anyone interested in the digitisation of newspaper content can learn more through our workshops and information days. These are organised by project partners, and more information is available on our Events page. Topics will include the project's technical challenges as well as the content and policy issues it addresses.

 
