Sustainability of Digital Formats
Full name | NCBI/NLM Journal Archiving and Interchange Document Type Definition (DTD), version 1.0 and 1.1 |
---|---|
Description | Developed by the National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine (NLM). This DTD, intended as a common format in which publishers and archives can exchange final journal content, is constructed from the November 2003 version of the Archiving and Interchange Tagset (version 1.1) of the modular Journal Archiving and Interchange DTD Suite. |
Production phase | Generally used for exchanging works in their final state. A related Publishing DTD (built from the same tag set) is intended as an initial- or middle-state format for authors and publishers. |
Relationship to other formats | |
Defined via | XML_DTD |
Has later version | NCBIArch_2, Journal Archiving and Interchange DTD, version 2 |
LC experience or existing holdings | None as of January 2006. Plans for 2006 include a pilot activity with journal publishers on archival deposit; content in at least one version of the NCBI Archiving format will be part of the pilot. |
---|---|
LC preference | Among the proposed preferred XML-based formats for textual works. On April 19, 2006, the Library of Congress and the British Library jointly announced support and advocacy for the NLM Archival DTD. To quote from the press release, the two institutions "will work with the National Library of Medicine to ensure the open and transparent evolution of the NLM DTD standard by encouraging early adoption by an internationally recognized standards body." |
Disclosure | Openly documented. All components of the Journal Archiving and Interchange DTD Suite are in the public domain. |
---|---|
Documentation | Archiving and Interchange DTD (http://dtd.nlm.nih.gov/); Archiving and Interchange DTD Version 1.0 (http://dtd.nlm.nih.gov/1.0/); Archiving and Interchange DTD Version 1.1 (http://dtd.nlm.nih.gov/1.1/) |
Adoption | TBD |
Licensing and patents | None |
Transparency | Rates highly for transparency. Text content for articles is in XML, and hence viewable in basic editors, web browsers, etc. Elements have understandable tag-names, and document instances are in natural reading order. |
Self-documentation | The DTD includes a rich set of elements for metadata at the article and journal level. The <article> element is expected to include the article content and full descriptive metadata. |
External dependencies | None. |
Technical protection considerations | None. |
Text | |
---|---|
Normal rendering | Good support. |
Integrity of document structure | The logical structure of a document is an essential feature of the NCBI Archiving DTD. |
Integrity of layout and display | The intent is to “preserve the intellectual content of journals independent of the form in which that content was originally delivered”. |
Support for mathematics, formulae, etc. | MathML can be embedded. Integrity of rendering is constrained by the capabilities of MathML and rendering tools. Support for embedded CML (for chemical structures) is not incorporated into this version of the DTD, but may be in the future. |
Tag | Value | Note |
---|---|---|
Filename extension | xml | For textual content files. |
General | The DTD was created from the Journal Archiving and Interchange DTD Suite, which provides a set of XML modules that define elements and attributes for describing the textual and graphical content of journal articles, as well as some non-article material (such as letters, editorials, and book and product reviews). The intent of this DTD Suite is to preserve the intellectual content of journals independent of the form in which that content was originally delivered. The Suite has been written as a set of XML DTD modules, each of which is a separate physical file. No module is an entire DTD by itself, but these modules can be combined into a number of different DTDs. |
---|---|
History | The NLM DTD has its origins in the convergence of two efforts: work at NLM to create an archiving DTD for life sciences journals, and a project at Harvard University, funded by the Mellon Foundation, to address the problems of archiving scholarly journals in electronic form (e-journals). See Cover Pages on Harvard University E-Journal Archive Project. The second phase of the Harvard E-journal Archiving project was a feasibility study to investigate the development of a common markup formalism that can be used to "reasonably represent the intellectual content (text, tables, formulas, still images, and links) of archived journal articles." The study was carried out by Inera Corporation, using input from ten publishers who were asked to provide their existing DTDs, documentation, and sample SGML documents for analysis. Version 1.0 of the NLM DTD was released in December 2002. Version 1.1 was released in November 2003. Some users converting articles tagged according to other DTDs into Archiving and Interchange articles found that they had to lose information (such as semantic identification of some sections) in the transformation. The changes for version 1.1 were largely to allow preservation of such information. Version 2, involving substantial structural changes, was released on 2004-12-30. The changes are fully backwards compatible, but customizations based on 1.0 or 1.1 may not be compatible with 2.0. To keep the DTD relevant to the publishing and archiving communities, NLM has created the XML Interchange Structure Working Group. This group advises NLM on recommended changes and additions to the tagset. NLM has contracted with Mulberry Technologies, Inc. of Rockville, MD, to act as Archiving and Interchange Tagset Secretariat. |
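The Transparency and Self-documentation rows note that document instances are plain XML with understandable tag names, and that the <article> element carries both the content and its descriptive metadata. A minimal sketch of what that can mean in practice follows, using only the Python standard library. The sample instance is hypothetical and heavily trimmed: its element names are chosen to resemble the tag set, it has not been validated against the DTD, and a real instance would carry far more metadata. The helper name `summarize` is also an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed article instance. Element names are assumed to
# resemble the Archiving and Interchange tag set; this sample has NOT
# been validated against the DTD.
SAMPLE = """<article>
  <front>
    <journal-meta>
      <journal-title>Example Journal of Sketches</journal-title>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>A Hypothetical Article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </sec>
  </body>
</article>"""


def summarize(xml_text):
    """Pull basic metadata and body text out of one article instance."""
    root = ET.fromstring(xml_text)
    return {
        # Journal-level and article-level metadata live under <front>.
        "journal": root.findtext("front/journal-meta/journal-title"),
        "title": root.findtext("front/article-meta/title-group/article-title"),
        # Sections and paragraphs are returned in document (reading) order.
        "sections": [sec.findtext("title") for sec in root.iter("sec")],
        "paragraphs": [p.text for p in root.iter("p")],
    }


summary = summarize(SAMPLE)
print(summary)
```

Because the markup is generic XML, no format-specific library is needed for this kind of inspection; an archive would still validate instances against the DTD itself with a validating parser, which this sketch does not do.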