Within the Europeana Newspapers project, we often speak of the value of historic newspapers for the academic community, but how exactly might a researcher use the material that we’re gathering?
This month we’re interviewing Amélia Del Rosario Sanz Cabrerizo, a researcher and professor at Universidad Complutense de Madrid. She teaches comparative literature and cyber culture (general understanding of challenges of cyber developments; digital means, transformation of mentalities in daily lives, etc.), as well as an introduction to using databases for undergraduates.
1. Can you briefly describe yourself: your background and the research you’ve done using historic newspapers?
I have worked on comparative literature from my PhD up to now, mainly on Spanish and French cultural relationships. At the very beginning I worked on influences and reception, but I have been moving towards the notions of circulation and transfer, I mean from a national (or nationalist) approach to a more local and global one. At that time, I used to look for one type of item in a single magazine: for example, serialised French novels with German characters in one particular French journal in 1917. So I was very familiar with research on just one newspaper, looking mainly for texts.
2. Your work highlights the importance of newspapers as an information source. How did you discover the link between your research topic and the information recorded by the press?
From the 18th century up to now, newspapers have been a primary source for researchers working on how public opinion is built. I read Marc Angenot’s works, particularly 1889, as an example of systematic analysis based on newspapers.
Curiously, journalese (the language used in newspapers) is considered a marginal corpus in linguistics, but not by us, as researchers of cultural phenomena. It is one of our main corpora:
- Because it is highly edited, and therefore an elaborated discursive construction;
- Because it has very specific restrictions, I mean those imposed by the medium;
- Because it is a plural discourse: it involves many voices and is shaped not by one single author but by a huge number of them.
3. Was this the first time that you had used newspapers as a source of information for research, and did it change the way that you perceive newspapers? In other words, were you surprised by what you could research using newspapers?
For my PhD, I worked on the reception of Spanish novels in the late 17th century, when newspapers were neither so numerous nor so important. But when I began to work on the 18th and 19th centuries, and after reading Marc Angenot, as I said, it was obvious that I had to use newspapers. The problem was which ones (because it was a very time-consuming task) and where (because I had to go to the National Library in Paris or in Madrid, so it was a very hard and expensive task).
4. How would you compare newspapers to other sources of information such as books and journals? Are there certain aspects of newspapers that just can’t be replicated anywhere else?
We know that newspapers are the most important format and the most important formula for the circulation of ideas, not only in a local context but in a much broader one. If you are interested in circulation, you have to work with newspapers. An example: in the Spanish context, some newspapers are very, very important because they were read in South America as much as in Spain. Mapping this circulation should be a priority for the future; it was not feasible before mass digitisation and virtual libraries.
5. What specific types of information did the newspapers contain that you found valuable, and why were these important for you?
At present, I am using digitised newspapers available in virtual libraries to look for allusions to, quotations from, publications by and translations of certain women writers. I am trying to prove that these women writers were read all over Europe but were banished by critics in the 19th and 20th centuries. I can provide quantitative analysis to prove the importance of some women in their own day, or the insignificance of others who were consecrated only a long time afterwards.
7. In terms of your work process, did you use digital or paper copies of newspapers, and what kind of techniques did you use (e.g. simple keyword search, text mining)?
I am using digital copies exclusively for research on term extraction from magazines and newspapers for a very specific period. We use a simple keyword search, even if I have some spelling problems due to the OCR systems (separation of letters, more so towards the end of the 18th century or in the early 19th century). It is never an exhaustive analysis, but it is a very interesting experience for the students with whom I am working within the framework of the two main Spanish digital newspaper libraries: Hemeroteca Digital at the National Library, and the Virtual Library of Historical Newspapers.
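As an editorial illustration of the kind of keyword search described above, here is a minimal Python sketch that tolerates the "separation of letters" problem by allowing a little whitespace between the letters of the search term. It is not the method used by the interviewee or by either library: the function names, the tolerance of two whitespace characters and the sample text are all assumptions made for the example.

```python
import re


def ocr_tolerant_pattern(keyword: str) -> re.Pattern:
    """Compile a regex matching `keyword` even when OCR has split its letters apart."""
    # Hypothetical helper: escape each letter, ignoring spaces in the keyword itself.
    letters = [re.escape(ch) for ch in keyword if not ch.isspace()]
    # Assumption: at most two whitespace characters between consecutive letters.
    return re.compile(r"\s{0,2}".join(letters), re.IGNORECASE)


def count_mentions(pages: dict[str, str], keyword: str) -> dict[str, int]:
    """Count matches of `keyword` per page; `pages` maps a page id to its OCR text."""
    pattern = ocr_tolerant_pattern(keyword)
    return {page_id: len(pattern.findall(text)) for page_id, text in pages.items()}


# Invented sample text, for illustration only (not taken from any real newspaper).
sample_pages = {
    "page-1": "La novela de Mme Cot tin fue traducida; Cottin es leída en Madrid.",
    "page-2": "Noticias de la corte y del comercio.",
}

print(count_mentions(sample_pages, "Cottin"))
```

Because the pattern ignores word boundaries, it can also match across two adjacent words, so the counts should be treated as candidates to check by hand rather than as final figures.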
8. How was your research affected by the format of newspapers that you used, in both a positive and a negative sense?
Browsing through paper copies might accidentally lead you to other information (e.g. notes on the side of the newspaper, stories in an edition that is located in close physical proximity to the issue you originally were looking through), while being able to text mine might allow you to make connections across a far larger corpus of work than would be possible with paper copies.
You cannot browse through paper copies from page to page, from newspaper to newspaper, looking for allusions to one particular author or work. Of course you will find hundreds of papers in books and reviews and hundreds of communications on the presence of one single author or one very specific question in a certain newspaper, but is that representative enough? For the old scientific paradigm, perhaps; but not for the new one, because our experimental field is larger and larger. To speak as Popper does: nowadays it is a question of falsifiability.
9. If you could choose today between using a digital or a paper archive, which would you choose and why?
Of course, I would choose a digital archive, because:
- I am saving time and costs;
- I can handle a huge amount of data understandable by machines.
I would prefer a paper archive if, and only if, the newspapers are not digitised and I consider them very important for my research. This raises a central question: it is a fact that 30% of heritage collections will not be digitised (I am quoting the second survey prepared and published by Enumerate). Why? What about them? What are the criteria used to select the archive (and I am quoting Michel Foucault), to create the new digital archives? Who is making these (informed) decisions?
Researchers will not fully trust digital libraries, because 30% of the content will not be digitised.
10. Looking forward, how would you improve access to historic newspapers? Are there specific tools that need to be provided, or needs that should be met by libraries and digital archives?
First of all, I advocate an organisational change: a more user-oriented scope, in order to bridge the gap between providers and scholars. Initiatives such as this one are great, but I would also ask for the following:
- Please make the environment more friendly: where is the information desk? Where is the contact in Europeana? Are there any Advisory Boards for scholars, researchers, students? Any menu with a clear list of services to help/teach us?
- A user-friendly means of searching across multiple European newspapers, including a simple annotation tool so that users can add comments or clarifications to items and data.
- A training section with services for research and learning communities, tools available for researchers and learners, even a selection of good practices.
- A kind of virtual Lab for researchers (such as in the British Library in the UK or in the Netherlands).
- Unrestricted access, rather than more money for tools, and ways to feed your raw data into my database.
In addition, you have to improve the quality of the OCR. It could also be useful to have a centralised index of newspapers for every digital newspaper library which has aggregated data and material to Europeana: how many newspapers have been digitised and, most importantly, what were the criteria for choosing them? Why these ones and not those ones?
Without this information, a real pan-European comparison is not possible. We have to be sure that our corpus (the newspapers available, the selected newspapers) is representative enough and balanced enough (with regard to a particular target designated by the researcher) to be analysed automatically.
Finally, it is time to move from conservation to conversation. Please have more conversations such as this one, a nice initiative.
What potential do you see for a pan-European archive such as the one being built by Europeana Newspapers? Could you, for example, extend your thesis by having access to newspapers from across Europe via a single website?
Having access to newspapers from across Europe means we are able to go beyond binarist comparisons (and the logic of antagonisms) in favour of a more plural scope: marginal, peripheral and central; small and big cultures. It allows us to look for circulation, not only for origins: to study routes rather than roots, to work on what we call transliteratures.
Also, the possibility of managing a great amount of data with machines could completely change our perception of the importance of authors, works and concepts: Mme Cottin vs Mme de Grafigny. We can quantify their presence all over Europe and make a qualitative analysis.