Numbers, part 1
Nov 20th, 2009 by sharbrough
Footnote publishes many numbers. For instance, the home page contains the total number of images – just above 60 million today. In some cases, Footnote also displays how complete a title is, as a percentage. When you view the “title index page” for a title, you will see the number of images published to date for that title.
Today, let’s discuss the percentage of completion.
For the most part, this is just what it sounds like. A title from the National Archives, for example, comprises a specific number of rolls of film. Sometimes the archives also provide Footnote with the total number of images included on those rolls. In other cases, Footnote estimates the total by multiplying the average number of images per roll by the number of rolls. In most cases, this yields a pretty reasonable number. Users can compare the completion percentage with the number of images published to date, and calculate an estimated total number of images.
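To make the arithmetic concrete, here is a minimal Python sketch of both calculations: the roll-based estimate, and the total a user can back out from the published count and the percentage. The function names and the sample title (400 rolls, 1,000 images per roll) are made up for illustration; they are not Footnote's figures or Footnote's code.

```python
def estimate_total_from_rolls(rolls: int, avg_images_per_roll: float) -> int:
    """Estimate a title's total image count when only the roll count is known."""
    return round(rolls * avg_images_per_roll)


def estimate_total_from_progress(published: int, percent_complete: float) -> int:
    """Back out the implied total from a published count and a completion percentage."""
    if not 0 < percent_complete <= 100:
        raise ValueError("percent_complete must be greater than 0 and at most 100")
    return round(published * 100 / percent_complete)


# Hypothetical title: 400 rolls averaging 1,000 images each.
total = estimate_total_from_rolls(400, 1000)

# A user who sees 100,000 images published at 25% complete
# can recover the same implied total.
implied = estimate_total_from_progress(100_000, 25)

print(total, implied)
```

Because the displayed percentage is a whole number, the backed-out total is only as precise as that rounding allows.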
If a user was interested in a given title, and the images that she wanted to find were not yet published, she might watch the number change, from week to week or month to month, waiting for that page to show up. There were internal discussions at Footnote in years past, about whether the completion percentage was enough information for users. Some argued for disclosing the order of publication – is it alphabetical by last name, or numerically by volume number, or alphabetically by county, or chronologically by year of immigration? Then users could more easily assess when the information that they awaited might be ready.
At this point, we have the completion percentage, which I think of as the “Dirty Harry” number. Do you feel lucky, punk? Absent any information about the order of publication, this number represents the odds that you’ll find the record you seek, and nothing more. Knowing the order would tell you more. For example, if you knew that they had already done your county, you could be confident that you had searched that title completely, without having to wait until Footnote finished publishing the other counties.
On the scale of irritations in life, this is a small one. Committed researchers find ways to deal with this, and the completion percentage is better than no information at all about the status of the title.
How reliable is this information, when Footnote publishes it? For the most part, it’s pretty reliable. But let’s view a couple of examples where the number could be more helpful.
The Navy Survivor’s Certificates, M1469, are published by NARA on 80,000 microfiche. Footnote has presently published almost 2 million pages, and says that the title is 12% complete. This would imply that there are about 16 million images. I do not know the exact number. Most often, the number is found in a “descriptive pamphlet” published by NARA, but there is not one for this title. Most fiche have between 30 and 100 images on them, in my experience. The Footnote estimate appears to be based on an average of 200 images per fiche. I’m going to express some skepticism, and say that the title is likely to contain fewer than 16 million images. I have no basis beyond my own limited experience, but it would make more sense to me to learn that the number is 6 million pages. That would significantly change the completion percentage, and the confidence that a user would have about whether they had searched successfully.
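Here is that back-of-the-envelope check as a short Python sketch. It rounds "almost 2 million" up to 2,000,000, so the results are approximate, and the 6 million figure is only my guess from above, not a known total.

```python
published = 2_000_000   # "almost 2 million" pages, rounded up for the sketch
percent_complete = 12   # the figure Footnote displays
fiche_count = 80_000    # microfiche in M1469

# Total implied by the displayed percentage, and the per-fiche average behind it.
implied_total = published * 100 / percent_complete
implied_per_fiche = implied_total / fiche_count

# The same calculation under the guessed total of 6 million pages.
alternative_total = 6_000_000
alternative_per_fiche = alternative_total / fiche_count
alternative_percent = published * 100 / alternative_total

print(f"Implied total: {implied_total:,.0f} images")
print(f"Implied average: {implied_per_fiche:.0f} images per fiche")
print(f"At 6 million: {alternative_per_fiche:.0f} images per fiche, "
      f"{alternative_percent:.0f}% complete")
```

Under the 6 million guess, the per-fiche average falls inside the 30 to 100 range I usually see, and the same 2 million published pages would amount to roughly a third of the title rather than 12%.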
The “Widow’s Certificates” have never been microfilmed. There are about 165 million pages of these. Footnote is publishing them in partnership with NARA and FamilySearch, which provides the volunteers to scan these documents. FamilySearch estimates that this title contains “350 camera-years” of pages. Presently, 500,000 pages have been published. Footnote describes this as 1% complete. This is an unusual case – it’s the largest number of images attempted by Footnote by a wide margin, and rounding to the nearest percent is rounding to the nearest 1.65 million pages or so. A blunt instrument, indeed.
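A quick Python sketch shows how coarse a whole-number percentage is at this scale. It uses only the figures quoted above. How Footnote arrives at the displayed 1% is not something I know; rounding up is just one guess that happens to reproduce it.

```python
import math

total_pages = 165_000_000   # approximate size of the title
published = 500_000         # pages published so far

# Size of a single percentage point for this title.
one_percent = total_pages / 100

# Actual progress, before any rounding.
actual_percent = published * 100 / total_pages

print(f"One percentage point: {one_percent:,.0f} pages")
print(f"Actual progress: {actual_percent:.2f}%")
print(f"Rounded to nearest percent: {round(actual_percent)}%")
print(f"Rounded up (an assumption): {math.ceil(actual_percent)}%")
```

Either way, the displayed figure will not move for a long time, even as hundreds of thousands of new pages are published.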
Here is a high-level look at some of those numbers.
Total titles: 509
100% complete: 95
99% complete: 22 – This generally reflects problems with a single badly-behaved roll of film.
90-98% complete: 13
51-90% complete: 12
12-50% complete: 6
0-12% complete: 6
Hmmm. That’s only 154 titles.
Titles with no completion percentage: 354. These are primarily newspapers and city directories.
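For anyone who wants to check the tally, here is a small Python sketch that adds up the buckets exactly as listed above. The bucket labels are just dictionary keys I chose for the sketch; the counts are the ones in the list.

```python
# Titles per completion bucket, as listed above.
buckets = {
    "100%": 95,
    "99%": 22,
    "90-98%": 13,
    "51-90%": 12,
    "12-50%": 6,
    "0-12%": 6,
}
no_percentage = 354
total_titles = 509

with_percentage = sum(buckets.values())
at_least_90 = buckets["100%"] + buckets["99%"] + buckets["90-98%"]
accounted_for = with_percentage + no_percentage

print(f"Titles with a percentage: {with_percentage}")
print(f"Titles at 90% or more: {at_least_90}")
print(f"Accounted for: {accounted_for} of {total_titles}")
```

The buckets and the no-percentage titles together come to one fewer than the stated total of 509. I can't say which figure is off, so treat the counts as approximate.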
The titles which are 90% complete or greater represent a tremendous and unique collection of records for researchers. Those 130 titles are quite a library.
The titles between 0 and 90% complete represent a future opportunity for research, but are hit-or-miss for researchers in the interim.
Users should treat the titles with no completion percentage as complete. In the case of the newspapers, they are digitised and forwarded to Footnote by the original publisher, Small Town Papers. Footnote has no control over how many papers are sent, for what small towns, and for what years. Footnote might be able to make some helpful estimates on the City Directories.
On the whole, these numbers are as good as Footnote can make them, and can be used to help researchers determine whether the record that they seek is yet to be published, or perhaps not in that collection. Here are the ten titles which are 100% complete and have the most pages.
Top 10 titles at 100 percent complete
| Title | Pages |
|---|---|
| Civil War and Later Veterans Pension Index | 2,983,078 |
| Revolutionary War Service Records | 2,029,629 |
| Navy Widows' Certificates | 1,698,112 |
| Civil War Soldiers - Confederate - MS | 1,281,882 |
| Civil War Soldiers - Confederate - SC | 1,205,379 |
| Civil War Soldiers - Confederate - LA | 786,610 |
| Census - US Federal 1860 | 705,063 |
| Naturalization Index - NY Southern Intentions | 560,232 |
| Naturalization Index - NY Eastern Nov 1925-Dec 1957 | 550,838 |
| Civil War Soldiers - Confederate - Officers | 498,917 |
To view the list of titles, and their completion percentages, point your browser here. [link]
