Introduction

The Journal Archiving and Interchange Tag Set defines elements and attributes that describe the content and metadata of journal articles, including research and non-research articles, letters, editorials, and book and product reviews. The Tag Set allows for descriptions of the full article content or just the article header metadata.
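The difference between a header-only record and a full article can be illustrated with a short Python sketch. The sample documents are deliberately abbreviated and are not expected to validate against the DTD. The `describe` helper, the journal identifier, and the titles are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative sample: header metadata only (no <body>).
# It is NOT a complete, DTD-valid instance.
HEADER_ONLY = """<article article-type="research-article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">EXJ</journal-id>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
    </article-meta>
  </front>
</article>"""

# The same record with a minimal body added after the header.
FULL_TEXT = HEADER_ONLY.replace(
    "</front>",
    "</front>\n  <body><p>First paragraph of the article.</p></body>",
)


def describe(xml_text: str) -> str:
    """Report whether a record carries full content or only the header."""
    root = ET.fromstring(xml_text)
    has_front = root.find("front") is not None
    has_body = root.find("body") is not None
    if has_front and has_body:
        return "full article"
    if has_front:
        return "header metadata only"
    return "unrecognized"


print(describe(HEADER_ONLY))
print(describe(FULL_TEXT))
```

A check like this might be useful when a collection mixes both kinds of records, although a real pipeline would more likely rely on validation against the Tag Set itself.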

The intent of the Archiving Tag Set is to preserve the intellectual content of journals independent of the form in which that content was originally delivered. This Tag Set enables an archive to capture structural and semantic components of existing material without modeling any particular sequence or textual format.

The Archiving Tag Set was planned as a conversion target for a variety of journal source Tag Sets, with the intent of providing a single format.

To describe content from the wide array of publishers, repositories, and aggregators, the Tag Set uses many loose structures, including some elements in which nearly all content is optional. Many attribute values in the Tag Set are declared as character data and so accommodate any source value. Because some article components, such as article metadata, are prescriptive in nature, the Archiving Tag Set includes a few completely generic structures for capturing semantic tagging that is not available natively in the Tag Suite. Although publication order cannot always be preserved, particularly within the metadata, the Archiving Tag Set goes further than any of the other NLM Tag Sets in this Suite to allow almost any publication arrangement, so that conversion can retag by renaming elements without rearranging them.

The Archiving Tag Set has a distinct focus on conversion from multiple sources. That focus has made this Tag Set a large and inclusive one. Many elements have been created explicitly so that information tagged by publishers is not discarded when they convert material from another Tag Set to this one (or to one created from this Suite). Care has also been taken to provide several mechanisms (frequently, information classing attributes) that preserve the intellectual content of a document structure when that structure is converted from another Tag Set or schema to this one, even when there is no exact element equivalent of the structure.

The exact replication of the look and feel of any particular journal has not been a consideration. Therefore, many purely formatting mechanisms have not been included. At the same time, Archiving is intended to preserve observed content, without resorting to stylesheets or generation of textual elements. For that reason, labels, numbers, and symbols of tables, figures, sidebars, and the like can be recorded as elements, as can the punctuation and spaces inside bibliographic references and lists.
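As a rough illustration of recording observed labels and punctuation as elements, the Python sketch below parses a small fragment in which a figure label and the punctuation of a bibliographic reference are tagged explicitly. The `<fragment>` wrapper is a made-up container for this example only, and the element names used (`label`, `mixed-citation`, `x`) reflect one reading of the Tag Set; the Tag Library remains the authority on the actual content models.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; <fragment> is a hypothetical wrapper, not a Tag Set element.
SAMPLE = """<fragment>
  <fig id="F1">
    <label>Figure 1.</label>
    <caption><p>Growth over time.</p></caption>
  </fig>
  <ref id="R1">
    <label>1.</label>
    <mixed-citation><source>Example Journal</source><x>. </x><year>2008</year><x>;</x><volume>12</volume><x>:</x><fpage>34</fpage><x>.</x></mixed-citation>
  </ref>
</fragment>"""

root = ET.fromstring(SAMPLE)

# The label is stored as content, so no stylesheet has to generate it.
fig_label = root.findtext("fig/label")

# Concatenating the text nodes reproduces the reference as it was observed,
# because the punctuation and spaces are recorded in <x> elements.
citation = root.find("ref/mixed-citation")
citation_text = "".join(citation.itertext())

# The preserved punctuation can also be inspected on its own.
generated = [x.text for x in citation.iter("x")]

print(fig_label)
print(citation_text)
print(generated)
```

Because the punctuation lives in the document rather than in a stylesheet, the reference text can be reassembled by simple concatenation.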

Documentation

The complete documentation for this Tag Set is available in the Tag Library: http://dtd.nlm.nih.gov/archiving/tag-library/. The structure and suggested usage of the Tag Library are described in the How to Use (Read Me First) section.

Frequently Asked Questions

A Frequently Asked Questions page is available.

Available Schemas

In addition to the DTD format, the Tag Set is also available as a W3C XML schema and as a RELAX NG schema. Both are generated directly from the DTD and neither is intended for maintenance. See the individual schema pages for more information.

Getting the Files

All of the Tag Set files are available by anonymous FTP: ftp://ftp.ncbi.nih.gov/pub/archive_dtd/archiving/

The files are directly available through the following links:

Each schema is also available through the web at the following stable URIs. Please note that not all browsers will display these files properly, but the files are viewable in XML or text editors.

Versions

The current version of the Archiving and Interchange Tag Set is v3.0.

Version 3.0 was released on November 21, 2008. A detailed explanation of the changes from version 2.3 is available in the v3.0 Change Report.

Version 2.3 is available here: http://dtd.nlm.nih.gov/archiving/2.3/

Version 2.2 is available here: http://dtd.nlm.nih.gov/archiving/2.2/

Version 2.1 is available here: http://dtd.nlm.nih.gov/archiving/2.1/

Version 2.0 is available here: http://dtd.nlm.nih.gov/archiving/2.0/

Version 1.1 is available here: http://dtd.nlm.nih.gov/archiving/1.1/

Version 1.0 is available here: http://dtd.nlm.nih.gov/archiving/1.0/

Feedback

Please submit all questions or comments to archive-dtd@ncbi.nlm.nih.gov.

This is a public mailing list. More information on the list is available: http://www.ncbi.nlm.nih.gov/mailman/listinfo/archive-dtd.

Any suggestions for changes to the Tag Set or documentation should be made through the Journal Article Tag Set Comment Form at the Mulberry Technologies site.

Related Tag Sets

The Journal Publishing Tag Set, created from the Tag Suite, is more prescriptive than the Archiving Tag Set. It is optimized for use by publishers and archives interested in regularizing their data.

The Article Authoring Tag Set, also created from the Tag Suite, is optimized for authoring original journal articles. It is the most limited Tag Set derived from the Suite that is offered by NLM.

The NCBI Book Tag Set was designed to accommodate tagging for books as part of the NCBI Bookshelf project.

Individuals wanting to submit citations and abstracts for inclusion in PubMed/MEDLINE should use the PubMed Journal Article DTD. See the Information for Publishers re: XML Tagged Data on the PubMed web site.

Tools

All of the tools described in this section are available at http://dtd.nlm.nih.gov/tools/.

NLM has created an XSL transform that converts data in any version of the Archiving Tag Set into version 3.0 of the Tag Set. Information about customizing and using the transform is available in the transform’s documentation.
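Before running such a transform over a collection, it can help to identify which documents declare an earlier version. The Python sketch below does only that; it does not run the transform, whose file name and invocation are not given here. It assumes the root element carries a `dtd-version` attribute, and the `needs_upgrade` helper and `TARGET_VERSION` constant are hypothetical names for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: documents declare their Tag Set version in a dtd-version
# attribute on the root element.
TARGET_VERSION = "3.0"


def needs_upgrade(xml_text: str) -> bool:
    """Return True when a document does not declare the target version.

    A missing attribute is treated as "needs review" rather than as current.
    """
    root = ET.fromstring(xml_text)
    declared = root.get("dtd-version")
    return declared != TARGET_VERSION


print(needs_upgrade('<article dtd-version="2.3"/>'))
print(needs_upgrade('<article dtd-version="3.0"/>'))
```

Treating an undeclared version as a candidate for conversion is a conservative choice; a stricter workflow might reject such documents instead.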

Tools for Previous Versions

The following tools work only with versions of the Tag Set prior to 3.0. NLM will release versions of these tools to work with 3.0 as they become available.

XML Information

Links to general information on XML, XSLT, Unicode™, and XLink are available on the XML Resources page.




National Center for Biotechnology Information
U.S. National Library of Medicine
8600 Rockville Pike, Bethesda, MD 20894


Last updated: November 21, 2008