Sustainability of Digital Formats
 Planning for Library of Congress Collections


NCBIArch_1, NCBI/NLM Journal Archiving and Interchange Document Type Definition, Version 1

Format Description Properties

Identification and description

Full name: NCBI/NLM Journal Archiving and Interchange Document Type Definition (DTD), versions 1.0 and 1.1
Description: Developed by the National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine (NLM). This DTD is intended as a common format in which publishers and archives can exchange final journal content. It is constructed from the November 2003 version of the Archiving and Interchange Tagset (version 1.1) of the modular Journal Archiving and Interchange DTD Suite.
Production phase: Generally used for exchanging works in their final state. A related Publishing DTD (built from the same tag set) is intended as an initial- or middle-state format for authors and publishers.
Relationship to other formats:
    Defined via XML_DTD
    Has later version NCBIArch_2, Journal Archiving and Interchange DTD, version 2

Local use

LC experience or existing holdings: None as of January 2006. Plans for 2006 include a pilot activity with journal publishers on archival deposit; content in at least one version of the NCBI Archiving format will be part of the pilot.
LC preference: Among the proposed preferred XML-based formats for textual works. On April 19, 2006, the Library of Congress and the British Library jointly announced support and advocacy for the NLM Archival DTD. According to the press release, the two institutions "will work with the National Library of Medicine to ensure the open and transparent evolution of the NLM DTD standard by encouraging early adoption by an internationally recognized standards body."

Sustainability factors

Disclosure: Openly documented. All components of the Journal Archiving and Interchange DTD Suite are in the public domain.
    Documentation:
        Archiving and Interchange DTD (http://dtd.nlm.nih.gov/)
        Archiving and Interchange DTD Version 1.0 (http://dtd.nlm.nih.gov/1.0/)
        Archiving and Interchange DTD Version 1.1 (http://dtd.nlm.nih.gov/1.1/)
Adoption: TBD
    Licensing and patents: None
Transparency: Rates highly for transparency. Text content for articles is in XML, and hence viewable in basic editors, web browsers, etc. Elements have understandable tag names, and document instances are in natural reading order.
Self-documentation: The DTD includes a rich set of elements for metadata at the article and journal level. The <article> element is expected to include the article content and full descriptive metadata.
External dependencies: None.
Technical protection considerations: None.
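
The Transparency and Self-documentation entries above can be illustrated with a short sketch. It assumes a simplified document instance: only the <article> element is named in this description, and the nested element names used here (front, journal-meta, article-meta, title-group, body, sec) as well as the journal title, ISSN, and article text are hypothetical placeholders that should be checked against the DTD documentation. The sketch reads descriptive metadata and section titles straight from the XML using only the Python standard library, without validating against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified instance. Element names other than <article>
# are assumptions for illustration, not taken from this description.
SAMPLE = """\
<article>
  <front>
    <journal-meta>
      <journal-title>Journal of Hypothetical Studies</journal-title>
      <issn>0000-0000</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>First paragraph.</p></sec>
    <sec><title>Methods</title><p>Second paragraph.</p></sec>
  </body>
</article>
"""


def describe(xml_text):
    """Collect journal-level and article-level metadata from one instance."""
    root = ET.fromstring(xml_text)
    return {
        "journal": root.findtext("front/journal-meta/journal-title"),
        "issn": root.findtext("front/journal-meta/issn"),
        "title": root.findtext("front/article-meta/title-group/article-title"),
        # findall returns elements in document order, i.e. reading order.
        "section_titles": [t.text for t in root.findall("body/sec/title")],
    }


if __name__ == "__main__":
    for key, value in describe(SAMPLE).items():
        print(f"{key}: {value}")
```

Because metadata and content travel in the same file, a single parse is enough to recover both, which is the practical meaning of a self-documenting format here.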

Quality and functionality factors

Text
Normal rendering: Good support.
Integrity of document structure: The logical structure of a document is an essential feature of the NCBI Archiving DTD.
Integrity of layout and display: The intent is to “preserve the intellectual content of journals independent of the form in which that content was originally delivered”.
Support for mathematics, formulae, etc.: MathML can be embedded. Integrity of rendering is constrained by the capabilities of MathML and rendering tools. Support for embedded CML (for chemical structures) is not incorporated into this version of the DTD, but may be in the future.
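
As a rough illustration of embedded MathML, the sketch below locates MathML islands inside an article instance by their namespace. The namespace URI is the standard W3C MathML namespace. The `mml` prefix, the <inline-formula> wrapper, and the sample sentence are assumptions made for the example and are not stated in this description.

```python
import xml.etree.ElementTree as ET

# Standard W3C MathML namespace URI.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical instance: the mml prefix and the <inline-formula> wrapper
# are assumptions for illustration.
SAMPLE = """\
<article xmlns:mml="http://www.w3.org/1998/Math/MathML">
  <body>
    <p>Energy is given by
      <inline-formula>
        <mml:math>
          <mml:mi>E</mml:mi>
          <mml:mo>=</mml:mo>
          <mml:mi>m</mml:mi>
          <mml:msup><mml:mi>c</mml:mi><mml:mn>2</mml:mn></mml:msup>
        </mml:math>
      </inline-formula>.
    </p>
  </body>
</article>
"""


def mathml_tokens(xml_text):
    """Return, for each embedded <math> element, its leaf tokens in order."""
    root = ET.fromstring(xml_text)
    result = []
    # ElementTree addresses namespaced elements as "{namespace-uri}localname".
    for math in root.iter(f"{{{MATHML_NS}}}math"):
        tokens = [
            el.text.strip()
            for el in math.iter()
            if el.text and el.text.strip()
        ]
        result.append(tokens)
    return result


if __name__ == "__main__":
    print(mathml_tokens(SAMPLE))
```

Finding the markup is the easy part. As noted above, how faithfully a formula is displayed still depends on the rendering tool.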

File type signifiers

Tag | Value | Note
Filename extension | xml | For textual content files.
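
The xml extension is shared by every XML vocabulary, so the extension alone does not identify a file as an instance of this DTD. One possible approach, sketched below, is to inspect the DOCTYPE declaration near the start of the file. The public identifier and system identifier in the sample are assumptions for illustration and should be checked against the DTD documentation. The function names are hypothetical.

```python
import re

# Matches: <!DOCTYPE root PUBLIC "public-id" ...  (single or double quotes)
DOCTYPE_RE = re.compile(
    r"""<!DOCTYPE\s+(?P<root>[^\s>]+)\s+PUBLIC\s+(?P<q>["'])(?P<public_id>.*?)(?P=q)"""
)

# Hypothetical file head. The identifiers below are assumed, not quoted
# from this description.
HEAD = '''<?xml version="1.0"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v1.1 20031101//EN" "archivearticle.dtd">
<article>'''


def sniff_doctype(head):
    """Return (root element name, public identifier), or None if absent."""
    match = DOCTYPE_RE.search(head)
    if match is None:
        return None
    return match.group("root"), match.group("public_id")


def looks_like_nlm_archiving(head):
    """Heuristic check that the public identifier names the Archiving DTD."""
    found = sniff_doctype(head)
    return found is not None and "Journal Archiving and Interchange DTD" in found[1]


if __name__ == "__main__":
    print(sniff_doctype(HEAD))
    print(looks_like_nlm_archiving(HEAD))
```

This is a heuristic only: a file may omit the DOCTYPE declaration or use a SYSTEM identifier instead, in which case the check returns a negative result even for a conforming instance.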

Notes

General: The DTD was created from the Journal Archiving and Interchange DTD Suite, which provides a set of XML modules that define elements and attributes for describing the textual and graphical content of journal articles, as well as some non-article material such as letters, editorials, and book and product reviews. The intent of this DTD Suite is to preserve the intellectual content of journals independent of the form in which that content was originally delivered. The Suite has been written as a set of XML DTD modules, each of which is a separate physical file. No module is an entire DTD by itself, but these modules can be combined into a number of different DTDs.
History: The NLM DTD has its origins in the convergence of two efforts: work NLM was doing to create an archiving DTD for life sciences journals, and a project at Harvard University, funded by the Mellon Foundation, to address the problems of archiving scholarly journals in electronic form (e-journals). See Cover Pages on Harvard University E-Journal Archive Project. The second phase of the Harvard E-journal Archiving project was a feasibility study to investigate the development of a common markup formalism that can be used to "reasonably represent the intellectual content (text, tables, formulas, still images, and links) of archived journal articles." The study was carried out by Inera Corporation, using input from ten publishers who were asked to provide their existing DTDs, documentation, and sample SGML documents for analysis.

Version 1.0 of the NLM DTD was released in December 2002. Version 1.1 was released in November 2003. Some users converting articles tagged according to other DTDs into Archiving and Interchange articles found that they had to lose information (such as semantic identification of some sections) in the transformation. The changes for version 1.1 were largely to allow preservation of such information. Version 2, involving substantial structural changes, was released on 2004-12-30. The changes are fully backwards compatible, but customizations based on 1.0 or 1.1 may not be compatible with 2.0. To keep the DTD relevant to the publishing and archiving communities, NLM has created the XML Interchange Structure Working Group. This group advises NLM on recommended changes in and/or additions to the tagset. NLM has contracted with Mulberry Technologies, Inc. of Rockville, MD, to act as Archiving and Interchange Tagset Secretariat.


Format specifications


Useful references

URLs


Last Updated: 03/09/2009