Sustainability of Digital Formats
 Planning for Library of Congress Collections


PDF (Portable Document Format)

Table of Contents
Identification and description
Local use
Sustainability factors
Quality and functionality factors
File type signifiers
Notes
Format specifications
Useful references
Format Description Properties
• ID: fdd000030
• Short name: PDF
• Content categories: text, still image
• Format category: file format, bitstream encoding, wrapper
• Last significant update: 2004-06-03

Identification and description

Full name: PDF (Portable Document Format)
Description: PDF (Portable Document Format), developed by Adobe Systems Incorporated, is described by Adobe as a general document representation language. PDF represents formatted, page-oriented documents. These documents may be structured or simple. They may contain text, images, graphics, and other multimedia content, such as video and audio. There is support for annotations, metadata, hypertext links, and bookmarks.
Production phase: In general, a final-state format for delivery to end users.
Relationship to other formats:
  Has subtype: PDF_1_4
  Has subtype: PDF_1_5
  Has subtype: PDF/X
  Has subtype: PDF/A

Local use

LC experience or existing holdings: Used as service format for some digitized historical materials, primarily to support convenient printing.
LC preference: Restricted subtypes are suggested as preferred formats for certain categories of content. The proposed PDF/A format is suggested as preferred when layout and visual characteristics are more significant than logical structure of text documents. PDF/X may be appropriate if used by creator or publisher during production.

Sustainability factors

Disclosure: Fully documented. PDF was developed by Adobe Systems Incorporated, which makes the specification available openly and at no charge. One subtype of this proprietary format has been adopted as an international standard by ISO (PDF/X). A second is in the standardization process (PDF/A).
  Documentation: Adobe provides documentation for versions 1.3 to 1.5 at http://partners.adobe.com/asn/tech/pdf/specifications.jsp.
Adoption: Extremely widely adopted as a platform-independent format for disseminating page-oriented documents. Adobe Reader software for viewing PDF files is freely distributed and bundled with most personal computers.
  Licensing and patent claims: Adobe has a number of patents covering technology that is disclosed in the Portable Document Format (PDF) Specification, version 1.3 and later.

A summary of information on the Adobe Web site in May 2004 (see http://partners.adobe.com/asn/developer/legalnotices.jsp) follows.
To promote the use of PDF for information interchange, the following patents are licensed by Adobe on a royalty-free, non-exclusive basis for the term of each patent for developing software that produces, consumes, and interprets PDF files: 5,634,064 (filed 1996-08-02, granted 1997-05-27); 5,737,599 (filed 1995-12-07, granted 1998-04-07); 5,781,785 (filed 1995-09-26, granted 1998-07-14); 5,819,301 (filed 1997-09-09, granted 1998-10-06); 6,028,583 (filed 1998-01-16, granted 2000-02-22); 6,289,364 (filed 1997-12-22, granted 2001-09-11); 6,421,460 (filed 1999-05-06, granted 2002-07-16). Patent 5,860,074 (filed 1997-08-14, granted 1999-01-12) is similarly licensed on a royalty-free, non-exclusive basis for its term, but only for the purpose of developing software that produces PDF files (specifically excluding software that consumes and/or interprets PDF files).

Adobe Reader displays additional patent numbers on launch.
Transparency: Depends upon compliant software tools to read. Building such tools requires sophistication.
Self-documentation: Later versions of PDF can include XMP metadata packages.
External dependencies: Faithful rendering requires that fonts be embedded.
Technical protection considerations: The PDF format offers several forms of technical protection, including encryption, that would prevent custodians of digital content from ensuring accessibility in future technological environments.
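
To illustrate how a custodian might screen incoming files for technical protection, the following Python sketch applies a simple byte-scan heuristic. It assumes that an encrypted PDF file carries an /Encrypt entry in its trailer, as described in the PDF Reference; the function names has_encrypt_entry and file_has_encrypt_entry are hypothetical. The scan can produce false positives (the string may occur elsewhere in a file) and is no substitute for a full PDF parser.

```python
from pathlib import Path

# Assumption: encrypted PDF files reference an encryption dictionary
# through an /Encrypt key, which is stored uncompressed in the trailer.
ENCRYPT_KEY = b"/Encrypt"


def has_encrypt_entry(data: bytes) -> bool:
    """Heuristic: return True if the /Encrypt key occurs anywhere in the data."""
    return ENCRYPT_KEY in data


def file_has_encrypt_entry(path: str) -> bool:
    """Read a whole file into memory and apply the heuristic above."""
    return has_encrypt_entry(Path(path).read_bytes())
```

A file flagged by this check would need closer inspection with a conforming tool before any decision about its long-term accessibility is made.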

Quality and functionality factors

Normal rendering for text: Good support is possible, but not guaranteed. The PDF format allows creators to disallow printing and extraction of text for quotations. PDF can also be used to create documents from scanned page images; such files do not necessarily support indexing of the document text.
Integrity of structure: The logical structure of a document is only represented in a PDF file if the creator or process during creation takes steps to incorporate structural tagging.
Integrity of layout: PDF is designed to represent the layout of page-oriented documents.
Integrity of rendering of equations, etc.: Can be represented by embedded graphics.
Beyond normal rendering: Supports embedding of media objects (in binary format) and links to external media objects, such as images, audio, or video.

File type signifiers

Tag type | Value | Note
Filename extension | pdf |
Internet Media Type | application/pdf | From LC web server configuration (Apache) of 2004-04-28. Registered with IANA (see Application Media-Types) and described in IETF (Internet Engineering Task Force) RFC 3778. Reported for PDF files by JHOVE PDF-hul module for file identification.
Internet Media Type | application/x-pdf, application/acrobat, application/vnd.pdf, text/pdf, text/x-pdf | Selected media types listed at The File Extension Source.
Magic numbers | Hex: 25 50 44 46; ASCII: %PDF | From Gary Kessler's File Signatures Table.
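
As an illustration of how the magic number can be used for file identification, the following Python sketch checks whether a byte string begins with the signature 25 50 44 46 (%PDF). The function names looks_like_pdf and pdf_header_version are hypothetical. Reading a version number after the signature assumes the %PDF-M.N header form described in the PDF Reference, which is not stated in the table above. A signature match alone does not establish that a file is a valid PDF.

```python
import re
from typing import Optional

# Magic number from the table above: hex 25 50 44 46, ASCII "%PDF".
PDF_MAGIC = bytes.fromhex("25504446")

# Assumption: the signature is followed by "-M.N", e.g. "%PDF-1.4".
_HEADER_RE = re.compile(rb"%PDF-(\d+)\.(\d+)")


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the data begins with the PDF magic number."""
    return data.startswith(PDF_MAGIC)


def pdf_header_version(data: bytes) -> Optional[str]:
    """Return the version string from the header line, or None if absent."""
    match = _HEADER_RE.match(data)
    if match is None:
        return None
    major = match.group(1).decode("ascii")
    minor = match.group(2).decode("ascii")
    return f"{major}.{minor}"
```

In practice only the first few bytes of a file need to be read for this check, for example with open(path, "rb").read(16).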

Notes

General 
History: Adapted from PDF Reference, Third Edition: The origins of PDF and the Adobe Acrobat product family date to early 1990. At that time, the PostScript page description language was rapidly becoming the worldwide standard for the production of the printed page. PDF builds on the PostScript page description language by layering a document structure and interactive navigation features on PostScript's underlying imaging model, providing a convenient, efficient mechanism enabling documents to be reliably viewed and printed anywhere.

Format specifications

URLs
Adobe PDF Specifications (http://partners.adobe.com/public/developer/pdf/index_reference.html)
PDF Reference, Fourth Edition, Version 1.5 (http://partners.adobe.com/public/developer/en/pdf/PDFReference15_v6.pdf)
PDF Reference, Fifth Edition, Version 1.6 (http://partners.adobe.com/public/developer/en/pdf/PDFReference16.pdf)

Print
• Adobe Systems Incorporated. PDF Reference, Third Edition, Version 1.4. Addison-Wesley, 2001. ISBN 0-201-75839-3. Also available online as http://partners.adobe.com/public/developer/en/pdf/PDFReference.pdf.
• Adobe Systems Incorporated. PDF Reference, Second Edition, Version 1.3. Addison-Wesley, 2000. ISBN 0-201-61588-6. Also available online as http://partners.adobe.com/public/developer/en/pdf/PDFReference13.pdf.


Useful references

URLs
JHOVE PDF-hul Module (http://hul.harvard.edu/jhove/pdf-hul.html)
The application/pdf Media Type (http://www.rfc-editor.org/rfc/rfc3778.txt)


Last Updated: 03/07/2007