The Diary of Horatio Nelson Taft

The Diary of Horatio Nelson Taft: Building the Digital Collection


Creating the Digital Images

The Diary of Horatio Nelson Taft was digitized with a Phase I scanner in the Information Technology Services Scan Center at the Library of Congress. The Phase I scanner has an overhead camera especially suited to digitizing fragile bound items. The three volumes were scanned as 300 dpi color images and saved with JPEG compression in the JPEG File Interchange Format (JFIF). Because the JPEG images take considerable time to download, GIF images were also created for convenient access through the National Digital Library Program (NDLP) page-turner feature.

Issues Affecting Scanning

Because efforts were made to preserve the look of the manuscripts, the digital images reflect their original physical condition. Because of their age, the volumes contain pages that are discolored or stained, and the digital images may therefore show discolorations and tonal variations in the paper. Despite its age, however, the diary is in relatively good condition and presented few major scanning concerns. The main difficulty during digitization was the volumes' tight binding. Scanning staff took great care in handling the collection and placed each volume in a Linhof book cradle during digitization. Such a cradle is designed especially for rare bound materials and protects their bindings.

At the very back of Volume 2 is a genealogical essay on the Taft family written by Taft himself. The essay was inscribed in a way designed to separate it from the rest of the volume's contents: Taft turned the volume over before writing, so the essay runs in the opposite direction and upside down relative to the text in the rest of the volume. For readability, these pages were rotated during scanning.

Digitizing Text

The Library of Congress has provided transcriptions for all three volumes of the Diary of Horatio Nelson Taft. An NDLP staff member keyed them in WordPerfect, following formatting guidelines mapped to the Library's American Memory DTD and working under the supervision of John R. Sellers, Manuscript Division subject specialist for the Taft Diary. The word-processing files were saved as HTML and then converted to SGML in a text editor that employed regular expressions for pattern matching.
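
The HTML-to-SGML step can be illustrated with a short sketch. It is not the procedure the NDLP staff used, which was carried out in a text editor. It only shows the general technique of rewriting tags with regular expressions, here in Python. The tag mapping, the SGML element names, and the function name html_to_sgml are assumptions made for illustration and are not taken from the American Memory DTD.

```python
import re

# Hypothetical mapping from HTML tag names to (open, close) SGML tags.
# These element names are illustrative and not taken from the American Memory DTD.
TAG_MAP = {
    "h2": ("<head>", "</head>"),
    "p": ("<p>", "</p>"),
    "i": ('<hi rend="italic">', "</hi>"),
}

# Matches an opening or closing tag, capturing the optional slash and the tag name.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)\b[^>]*>")


def html_to_sgml(text):
    """Rewrite mapped HTML tags as SGML tags; drop unmapped tags, keep their text."""

    def swap(match):
        is_closing = match.group(1) == "/"
        name = match.group(2).lower()
        pair = TAG_MAP.get(name)
        if pair is None:
            return ""  # presentational tag with no SGML equivalent in this sketch
        return pair[1] if is_closing else pair[0]

    return TAG_RE.sub(swap, text)


sample = '<H2>Sample heading</H2><P class="x">A <I>sample</I> entry.</P><FONT size="2">x</FONT>'
converted = html_to_sgml(sample)
print(converted)
```

Keeping the mapping in a single table means a change in the formatting guidelines needs only a new table entry, not a new pattern. A production conversion would also have to handle character entities and validate the result against the DTD, which this sketch does not attempt.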

