Newspapers at Footnote.com
Mar 29th, 2009 by sharbrough
Searching Newspapers
Let’s assume that you already know HOW to search within a title or a group of titles on Footnote. When you search for something in a newspaper, here’s how it works.
Search at Footnote looks at two kinds of information:
- “Metadata”. Fields that were indexed digitally by Footnote can be searched using the way-cool searching and filtering functions. This is generally limited, in the case of newspapers, to the title, date, and place.
- “OCR data”. OCR is short for “optical character recognition.” I wish it produced perfect results – I wish that computers could read better than they do. A lot of OCR data is gibberish. But it does successfully read millions of words, and Footnote’s newspaper OCR is state of the art. They handle old-fashioned fonts, they run several OCR schemes and keep the best results, and for what they have to work with, the results are quite good. That said, the OCR output is just a “text blob” for each page. It’s not broken down into fields like names, places, or headlines, or even by article. You can still search this data, but only by using “keyword” search. Footnote does not routinely identify names in OCR text.
If you are looking for newspaper content and you use any search filter other than keywords, you will find the results quite limited, because those filters only match the indexed metadata. On the other hand, if you want to look for the word “baseball” in the 8-April-1923 Chicago Tribune, you can do that quite easily. When the search results are displayed, the keywords are highlighted on the page – that’s a lot more helpful than having to read the whole page to find your word.
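For the programmers in the audience, here is one way to picture the two kinds of information. This is only a toy sketch in Python, not Footnote’s actual code: the `Page` class, its field names, the `search` function, and the sample text are all made up for illustration. The metadata fields are matched exactly, while the OCR text is one blob per page that can only be checked for a keyword.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Page:
    # Hypothetical model of one newspaper page (not Footnote's real schema).
    title: str        # metadata: newspaper title
    published: date   # metadata: issue date
    place: str        # metadata: place of publication
    ocr_text: str     # OCR output: one unstructured blob for the whole page


def search(pages, keyword=None, title=None, published=None, place=None):
    """Return pages matching every filter that was supplied."""
    results = []
    for page in pages:
        # Metadata filters compare against indexed fields.
        if title is not None and page.title != title:
            continue
        if published is not None and page.published != published:
            continue
        if place is not None and page.place != place:
            continue
        # The keyword is the only filter that looks inside the OCR blob.
        if keyword is not None and keyword.lower() not in page.ocr_text.lower():
            continue
        results.append(page)
    return results


# Invented sample data for the illustration.
pages = [
    Page("Chicago Tribune", date(1923, 4, 8), "Chicago",
         "Opening day baseball crowds fill the park"),
    Page("Chicago Tribune", date(1923, 4, 9), "Chicago",
         "City council meets on transit plan"),
]

hits = search(pages, keyword="baseball", title="Chicago Tribune",
              published=date(1923, 4, 8))
print(len(hits))  # 1
```

Notice that there is no “name” or “headline” argument to `search`. In this sketch, as in the description above, anything that lives only in the OCR text has to be found with a keyword.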
When using Footnote’s viewer, if there is any OCR text for the image, you’ll see the “FIND” button in the viewer’s toolbar. That will show you any occurrences of a word on that page, which is sometimes a really useful feature.
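Finding a word on a single page is the same idea on a smaller scale. The sketch below is, again, only an assumption about how such a feature could work on plain text; the function names and the sample sentence are invented, and the real viewer highlights words on the page image rather than in a string.

```python
import re


def find_occurrences(ocr_text, word):
    """Return the character offset of each whole-word match, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [match.start() for match in pattern.finditer(ocr_text)]


def highlight(ocr_text, word):
    """Wrap each whole-word match in square brackets to mark it."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda match: "[" + match.group(0) + "]", ocr_text)


# Invented sample text standing in for one page's OCR output.
page_text = "Baseball season opens today. Fans of baseball rejoice."

print(find_occurrences(page_text, "baseball"))  # [0, 37]
print(highlight(page_text, "baseball"))
```

Because OCR text can be gibberish in places, a word that was misread on the page will not be found this way, no matter how clearly you can see it in the image.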
Summing it up
There are almost 3 million pages of newspapers on Footnote. They provide great historical context for people and events. The OCR, search, and image quality are pretty impressive. If you haven’t tried them, you owe it to yourself to take 15 minutes and try them. It’ll cost you an hour, but that’s what Footnote does.
