DISCUSSION PAPER NO: 2003-DP01

DATE: Dec. 17, 2002
REVISED:

NAME: Data Elements for Article Level Description

SOURCE: California Digital Library

SUMMARY: This paper explores data elements in the MARC 21 bibliographic format that record citation information for the description of a journal article. Field 773 (Host Item Entry) allows for a link to the record for the journal, but the citation data elements are not parsed into separate subfields; they are carried as free text in subfield $g. The paper discusses ways to facilitate automated linking from MARC 21 records, such as what is intended in the CrossRef project or the developing OpenURL standard.

KEYWORDS: Field 773 (BD); Host Item Entry (BD); Citation information (BD)

RELATED:

STATUS/COMMENTS:
12/17/02 -- Made available to the MARC 21 community for discussion.

01/25/03 - Results of the MARC Advisory Committee discussion - Participants generally felt that more detailed coding was needed to facilitate automated linking to journal article citation information. The inclusion of caption information to accompany chronology and enumeration data was discussed at length; however, participants generally agreed that information from field 773 subfield $g and/or the SICI would provide adequate caption information. Based on a straw vote, participants favored solutions 4.3 (Add One Subfield for Enumeration and First Page to Field 773) and 4.5 (Add a Variable Field that Parses the Data Elements). Another tag number must be chosen for solution 4.5, however, because field 363 has been reserved for "Trade Price" in Proposal 2002-DP14/09. A proposal will be presented during the 2003 annual meeting.


Discussion Paper 2003-DP01: Data Elements for Article Level Description

1. INTRODUCTION

Although few libraries create cataloging records for journal articles, an increasing number of articles are being rendered in MARC format through interaction with library system portal functions, interlibrary loan, and linking to full text. The original cataloging of the articles is done outside of the library cataloging environment and is based neither on AACR2 rules nor on MARC encoding. Translation of these records into MARC 21 generally makes use of the 773 field to store the data elements relating to the host journal, issue, date and pagination. These are stored in subfield $g (Relationship Information) as a free-text string:

 

$gVol. 2, no. 2 (Feb. 1976), p. 195-230

Services making use of records for articles are likely to need this information separated into individual data elements. Many systems include algorithms that parse subfield $g to obtain the individual data elements, but this is error-prone because the subfield is free text. In some cases the original source of the bibliographic record does carry this information in separate data elements, but that precision is lost when the record is translated to MARC 21.
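The fragility of parsing subfield $g can be illustrated with a minimal Python sketch. The pattern and the function name parse_773g are assumptions made for illustration and are not part of MARC 21. The pattern is written only for the "Vol. x, no. y (date), p. a-b" shape shown above, with volume and number both optional, and it gives up on anything else.

```python
import re

# Hypothetical pattern for one common shape of 773 $g (an assumption,
# not a MARC 21 rule): optional "Vol. x,", optional "no. y", "(date)", "p. a-b".
G_PATTERN = re.compile(
    r"^(?:Vol\.\s*(?P<volume>\S+?),\s*)?"
    r"(?:no\.\s*(?P<issue>\S+)\s*)?"
    r"\((?P<date>[^)]*)\),?\s*"
    r"p\.\s*(?P<start_page>[^-,\s]+)",
    re.IGNORECASE,
)


def parse_773g(text):
    """Return a dict of citation elements, or None if the shape is not recognized."""
    match = G_PATTERN.match(text.strip())
    return match.groupdict() if match else None


print(parse_773g("Vol. 2, no. 2 (Feb. 1976), p. 195-230"))
print(parse_773g("Vol. 24, pt. B no. 9 (Sept. 1993), p. 235-48"))
```

The first call yields volume 2, issue 2, date "Feb. 1976" and start page 195. The second string, which carries an extra "pt. B" level, is not recognized and returns None; each new free-text variation requires another rule of this kind.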

The Relationship data elements that are most common to article-level services are volume, issue, start page, and formatted date. This paper explores possible solutions, including modifying field 773 to carry these as machine-processible data elements, using the SICI, or defining a new variable-length field for the data.

2. EXAMPLES OF CURRENT USAGE

Note that the examples often use the terms "volume" and "issue". Other terms may well be used, e.g. equivalents in other languages or other terms denoting parts. More accurate names would be "1st level enumeration" and "2nd level enumeration", but discussions have indicated that many do not find those labels understandable. Thus, "volume" and "issue" should be read not as the terms of enumeration on the piece but as representative of 1st and 2nd level enumeration. Many A&I vendors use "v.", "n", etc. regardless of what is on the piece, even if the original is in another language.

2.1. Records in Abstracting and Indexing Databases

ProQuest Database:

Citation:

  The effect of local exhaust ventilation controls on dust exposures during concrete cutting and grinding activities; Gerry A Croteau; AIHA Journal, Fairfax; Jul/Aug 2002; Vol. 63, Iss. 4; pg. 458, 10 pgs

Download Format:

  The effect of local exhaust ventilation controls on dust exposures during concrete cutting and grinding activities
AIHA Journal
Fairfax
Jul/Aug 2002

Authors: Gerry A Croteau
Authors: Steven E Guffey
Authors: Mary Ellen Flanagan
Authors: Noah S Seixas
Volume: 63
Issue: 4
Pagination: 458-467
Page Count: 10
Text Word Count: 808
Document Type: Feature
Source Type: PERIODICAL
ISSN: 15298663

National Library of Medicine:

Citation:

  Molecular cloning and characterization of Rosa hybrida dihydroflavonol 4-reductase gene.
Tanaka Y, Fukui Y, Fukuchi-Mizutani M, Holton TA, Higgins E, Kusumi T.
Plant Cell Physiol. 1995 Sep;36(6):1023-31

Download Format:

  TI - Molecular cloning and characterization of Rosa hybrida
dihydroflavonol 4-reductase gene.
AU - Tanaka Y
AU - Fukui Y
AU - Fukuchi-Mizutani M
AU - Holton TA
AU - Higgins E
AU - Kusumi T
TA - Plant Cell Physiol
DP - 1995 Sep
VI - 36
IP - 6
PG - 1023-31
AD - Institute for Fundamental Research, Suntory Ltd.,
Osaka, Japan.

[TI=title, AU=author, TA=title abbreviation, DP=date of publication, VI=volume, IP=issue part supplement, PG=pagination, AD=author affiliation]

Web of Science:

Citation:

  Young BA, Lee CE, Daley KM
On a flap and a foot: Aerial locomotion in the "flying" gecko, Ptychozoon kuhli
J HERPETOL 36 (3): 412-418 SEP 2002

Download Format:

  PT Journal
AU Young, BA
Lee, CE
Daley, KM
TI On a flap and a foot: Aerial locomotion in the "flying" gecko, Ptychozoon kuhli
SO JOURNAL OF HERPETOLOGY
DT Article
SN 0022-1511
PU SOC STUDY AMPHIBIANS REPTILES
BP 412
EP 418
PG 7
JI J. Herpetol.
PY 2002
PD SEP
VL 36
IS 3
GA 599HR
PI ST LOUIS

[PT=publication type, AU=author, TI=article title, SO=full source title, DT=document type, SN=ISSN, PU=publisher, BP=beginning page, EP=ending page, PG=page count, JI=ISO source title abbreviation, PY=publication year, PD=publication date, VL=volume, IS=issue, GA=ISI document delivery number, PI=publisher city]

Not all article databases display separate data elements, however:

Ovid Online:

Citation:

  Bhattacharyya, A K. Choudhury, B N. Chintaiah, P. Das, P. Studies on a probable correlation between thermal conductivity, kinetics of devitrification and changes in fiber radius of an aluminosilicate ceramic vitreous fiber on heat treatment [Journal Article] Ceramics International. v 28 n 7 2002. p 711-71

Download Format:

  VN - Ovid Technologies
DB - Compendex
AN - 02357061384
AU - Bhattacharyya, A K
AU - Choudhury, B N
AU - Chintaiah, P
AU - Das, P
IN - Res. and Devmt. Ctr. for Iron/Steel IISCO Burnpur Center, Burnpur 713325, W.B., India
TI - Studies on a probable correlation between thermal conductivity, kinetics of devitrification and changes in fiber radius of an aluminosilicate ceramic vitreous fiber on heat treatment
SO - Ceramics International. v 28 n 7 2002. p 711-717

[VN=vendor, DB=database, AN=accession number, AU=author, IN=institution, TI=title, SO=source]

2.2. Retrieval of Full Text

Providers of full text have often developed retrieval URLs that can be derived algorithmically from bibliographic records. Even though communication from the bibliographic database to the retrieval service may use other techniques, such as the OpenURL, the actual linking to the location of the full text document makes use of these algorithmically derived URLs.

For this article in Current Contents:

Feitkenhauer, H; Meyer, U. Intermediate accumulation and efficiency of anaerobic digestion treatment of surfactant (alcohol sulfate)-rich wastewater at increasing surfactant/biomass ratios.
JOURNAL OF CHEMICAL TECHNOLOGY AND BIOTECHNOLOGY, SEP, 2002,
V77(N9):979-988.

The Wiley article level URL is:

http://mddb.wiley.com/db/mdresolve.cgi?
issn=0268-2575&volume=77&issue=9&first_page=979
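
A minimal Python sketch of how such a URL could be derived once the data elements are available separately. The base address and parameter names are copied from the example above; the function name is hypothetical, and nothing is assumed about whether the resolver still answers at that address.

```python
from urllib.parse import urlencode

# Base address taken from the example above (not verified to be live).
WILEY_BASE = "http://mddb.wiley.com/db/mdresolve.cgi"


def wiley_article_url(issn, volume, issue, first_page):
    """Assemble the query string from four separately held data elements."""
    query = urlencode(
        {"issn": issn, "volume": volume, "issue": issue, "first_page": first_page}
    )
    return f"{WILEY_BASE}?{query}"


print(wiley_article_url("0268-2575", 77, 9, 979))
```

Every argument here is a discrete value; none of them can be taken directly from a free-text 773 $g without parsing.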

For this article in Expanded Academic Index:

Pennisi, Elizabeth Recharged field's rallying cry: gene chips for all organisms: seeds of microarray technology help comparative physiology bloom again. Science v297, n5589 (Sept 20, 2002):1985

The HighWire Press article level URL is:

http://www.sciencemag.org/cgi/content/full/297/5589/1985
where "/297/5589/1985" is /volume/issue/page.
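
As a sketch only, the derivation from the citation string above to the path-style URL might look like this in Python. The regular expression is an assumption fitted to the "v297, n5589 (date):page" shape used by this one database display, and the function name is hypothetical.

```python
import re

# Assumed shape: "v<volume>, n<issue> (<date>):<first page>"
CITATION = re.compile(
    r"v(?P<volume>\d+),\s*n(?P<issue>\d+)\s*\([^)]*\):(?P<page>\w+)"
)


def highwire_article_url(citation, host="www.sciencemag.org"):
    """Build /volume/issue/page from a citation string of the assumed shape."""
    match = CITATION.search(citation)
    if match is None:
        raise ValueError("citation shape not recognized")
    return "http://{}/cgi/content/full/{}/{}/{}".format(
        host, match["volume"], match["issue"], match["page"]
    )


print(highwire_article_url("Science v297, n5589 (Sept 20, 2002):1985"))
```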

For this article in Expanded Academic Index:

Koshland, Daniel E., Jr. Molecule of the year: the DNA repair enzyme. (research could improve environmental policy and human research models) Science v266, n5193 (Dec 23,1994):1925

The JSTOR article level URL is based on the SICI:

http://links.jstor.org/sici%3Fsici
%3D0036-8075%2819941223%29266%3A5193%3C1925%3AMOTYTD
%3E2.0.CO%3B2-3

[The second line of the URL carries, in order: issn, date, vol, issue, page.]

Removing the URL-encoding, this reads as:

  0036-8075(19941223)266:5193<1925:MOTYTD>2.0.CO;2-3

where "(19941223)266:5193<1925" is (formatted date)volume:issue<first page.
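
The decoding step can be sketched in Python as follows. The pattern covers only the leading part of the SICI (ISSN, chronology, enumeration, first page) as it appears in this example; it is an illustrative assumption, not an implementation of the SICI standard, and it does not validate the check character.

```python
import re
from urllib.parse import unquote

# Leading part of a SICI: issn(date)enumeration<first page
SICI_HEAD = re.compile(
    r"^(?P<issn>\d{4}-\d{3}[\dX])"
    r"\((?P<date>[^)]*)\)"
    r"(?P<enumeration>[^<]*)"
    r"<(?P<first_page>[^:>]*)"
)


def sici_elements(encoded_sici):
    """Remove URL-encoding, then pull out the linking data elements."""
    match = SICI_HEAD.match(unquote(encoded_sici))
    if match is None:
        raise ValueError("not a recognizable SICI")
    parts = match.groupdict()
    # Enumeration levels are colon-separated, highest level first.
    parts["enumeration"] = parts["enumeration"].split(":") if parts["enumeration"] else []
    return parts


print(sici_elements(
    "0036-8075%2819941223%29266%3A5193%3C1925%3AMOTYTD%3E2.0.CO%3B2-3"
))
```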

3. STANDARDS

There are standard formats for representation of journal articles that may interact with a MARC-formatted record. These standards carry abbreviated forms of the bibliographic data because their purpose is retrieval or linking rather than discovery or display.

Included here are CrossRef Query Format, the OpenURL, and the Serial Item and Contribution Identifier (SICI). These are derived from data encoded in MARC records, and they may be used to retrieve data from a MARC-encoded database. Compatibility with MARC data elements therefore can affect the success of the services that are using these standards.

3.1. CrossRef

  Data elements
    ISSN
    Title
    Abbrev
    First author surname
    Volume
    Issue
    Start Page
    Year
    Resource Type
    Key
    DOI
  Examples:
    1. 09666532|Ambulatory Surgery|Wetchler|9|2|57|2001|||
    2. |J. Mol. Biol.|Pesce|274||408|1997||43|
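
The two examples can be read as pipe-delimited records with ten positions. The Python sketch below assumes that "Title" and "Abbrev" share a single position, which is what the ten positions in both examples suggest; the field names and function names are hypothetical labels chosen for illustration.

```python
# Assumed order of the ten pipe-delimited positions (title and
# abbreviated title are assumed to occupy the same position).
CROSSREF_FIELDS = (
    "issn", "title", "author", "volume", "issue",
    "start_page", "year", "resource_type", "key", "doi",
)


def parse_crossref_query(line):
    """Split one query line into named data elements."""
    values = line.split("|")
    if len(values) != len(CROSSREF_FIELDS):
        raise ValueError("expected %d positions" % len(CROSSREF_FIELDS))
    return dict(zip(CROSSREF_FIELDS, values))


def build_crossref_query(**elements):
    """Join the elements in the assumed order; missing ones stay empty."""
    return "|".join(str(elements.get(name, "")) for name in CROSSREF_FIELDS)


print(parse_crossref_query("|J. Mol. Biol.|Pesce|274||408|1997||43|"))
print(build_crossref_query(issn="09666532", title="Ambulatory Surgery",
                           author="Wetchler", volume=9, issue=2,
                           start_page=57, year=2001))
```

Volume, issue and start page each occupy their own position, so a MARC record that holds them only inside 773 $g cannot populate the query without parsing.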

3.2. OpenURL (v. 0.1)

  Data elements
    aulast - author last name
    aufirst - author first name
    auinit - author initials
    issn - ISSN
    eissn - electronic ISSN
    coden - CODEN
    isbn - ISBN
    sici - SICI
    bici - BICI
    title - journal or book title
    stitle - abbrev. title
    atitle - article title
    volume - volume
    part - part
    issue - issue
    spage - start page
    epage - end page
    pages - page range
    artnum - article number
    date - formatted date
    ssn - season
    quarter - quarter
  Examples:
    http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&
    volume=12&issue=2&spage=134

    http://sfx1.exlibris-usa.com/demo?sid=ebsco:medline&
    aulast=Moll&auinit=JR&date=2000-11-03&
    stitle=J%20Biol%20Chem&volume=275&issue=44&spage=34826
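
A short Python sketch of building and reading OpenURLs of this kind, using only the standard library. The function names are hypothetical, and the resolver addresses are the ones in the examples above; no claim is made about how any particular resolver treats the elements.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit


def build_openurl(base, **elements):
    """Append the non-empty data elements to a resolver base address."""
    present = {k: v for k, v in elements.items() if v not in (None, "")}
    # quote_via=quote encodes spaces as %20, as in the examples above.
    return f"{base}?{urlencode(present, quote_via=quote)}"


def parse_openurl(url):
    """Return the data elements carried in the query string."""
    return dict(parse_qsl(urlsplit(url).query))


print(build_openurl("http://sfxserver.uni.edu/sfxmenu",
                    issn="1234-5678", date="1998",
                    volume=12, issue=2, spage=134))
print(parse_openurl(
    "http://sfx1.exlibris-usa.com/demo?sid=ebsco:medline&"
    "aulast=Moll&auinit=JR&date=2000-11-03&"
    "stitle=J%20Biol%20Chem&volume=275&issue=44&spage=34826"
))
```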

3.3. SICI

  Data elements
    ISSN
    Chronology - date in standard form
    Enumeration - all levels of enumeration associated with a single issue
    Supplements and Indexes - when issued as separate items
    Location - generally first page number
    Title Code - derived standard title representation
    Locally Assigned Number - an identifier
  Example:
    SICI v2: 0015-6914(19960101)157:1<62:KTSW>2.0.TX;2-F
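
The leading part of the example can be assembled from separately held data elements, as in the following Python sketch. The function name is hypothetical. The sketch deliberately stops after the location: the title code, the control segment and the check character are not derived here, so the result is a prefix of a SICI and not a complete one.

```python
def sici_prefix(issn, chronology, enumeration, first_page):
    """Assemble issn(chronology)enumeration<location from discrete elements.

    enumeration is a sequence of levels, highest first; the levels are
    joined with colons as in the example above.
    """
    levels = ":".join(str(level) for level in enumeration)
    return f"{issn}({chronology}){levels}<{first_page}"


prefix = sici_prefix("0015-6914", "19960101", [157, 1], 62)
print(prefix)
print("0015-6914(19960101)157:1<62:KTSW>2.0.TX;2-F".startswith(prefix))
```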

4. POSSIBLE SOLUTIONS

The first three solutions listed below add formatted data elements to the MARC Linking Fields (or at least to field 773). The $g subfield remains in the fields to carry the unformatted Relationship information that is suited to display in bibliographic systems. It is also assumed that Date1 and Date2 in the 008 can store the detailed date information. In addition there are options for 1) using the SICI and 2) defining a new variable-length field to contain the data.

4.1. Add Three Subfields for: Volume, Issue, First Page

There are two subfields that have not been used in any of the fields in the Linking Fields range: $l and $q. Subfields $j (Period of content) and $v (Source contribution) have been used only in field 786. Adding a separate subfield for each of the three needed data elements would exhaust the subfield codes available to be used across the range 76X-78X (and reuse one already defined as something different in 786).

Because there can be more levels of enumeration than volume and number, the subfield for number could contain all levels except the highest, which would be entered in the subfield for volume. This is imprecise, but the linking services allow only two levels of enumeration, and two levels appear to satisfy a significant number of linking situations. In some instances additional levels are included in the Issue data element, although these would then need to be parsed by a receiving system.

Assuming that we use $v for volume, $l for number and $q for page, these examples follow:

773 0# $7 nnas $t Going Places. $g (July/Aug. 1984), p. 24-33 $q 24
773 0# $7 nnas $t California journal. $g Vol. 24, no. 9 (Sept. 1993), p. 235-48 $v 24 $l 9 $q 235
773 0# $7 nnas $t Metro. $g Vol. 96, no. 4 (May 2000), p. 23-24, 27 $v 96 $l 4 $q 23
773 0# $t Pacific rail news. $g No. 279 (Feb. 1987) p. GM5-GM6 $l 279 $q GM5
[Note that in this example the enumeration is "no." and recorded in $l; vendors seem to prefer using the designation of the piece rather than enumeration level.]
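
With the three subfields in place, a receiving system would no longer need to interpret $g. The Python sketch below reads the display form of the examples above; the subfield assignments ($v, $l, $q) are the ones assumed in this section, the function names are hypothetical, and a production system would read the subfields from the MARC record structure and not from a display string.

```python
import re


def parse_subfields(field_text):
    """Split the display form '$x value $y value' into (code, value) pairs."""
    pairs = re.findall(r"\$(\w)\s*([^$]*)", field_text)
    return [(code, value.strip()) for code, value in pairs]


def linking_elements(field_text):
    """Pick out the proposed $v, $l and $q subfields; absent ones are None."""
    subfields = dict(parse_subfields(field_text))
    return {
        "volume": subfields.get("v"),
        "issue": subfields.get("l"),
        "first_page": subfields.get("q"),
    }


print(linking_elements(
    "773 0# $7 nnas $t California journal. "
    "$g Vol. 24, no. 9 (Sept. 1993), p. 235-48 $v 24 $l 9 $q 235"
))
print(linking_elements(
    "773 0# $7 nnas $t Going Places. $g (July/Aug. 1984), p. 24-33 $q 24"
))
```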

4.2. Add Two Subfields for: Enumeration, First Page

This solution would make use of the two remaining subfields only. The Enumeration would be formatted as it is in the SICI, allowing for any number of enumeration levels, and placed in a single subfield $l. The first page would be placed in the $q:

773 0# $7 nnas $t Going Places. $g (July/Aug. 1984), p. 24-33 $q 24
773 0# $7 nnas $t California journal. $g Vol. 24, pt. B no. 9 (Sept. 1993), p. 235-48 $l 24:B:9 $q 235
773 0# $7 nnas $t Metro. $g Vol. 96, no. 4 (May 2000), p. 23-24, 27 $l 96:4 $q 23
773 0# $t Pacific rail news. $g No. 279 (Feb. 1987) p. GM5-GM6 $l 279 $q GM5
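
A receiving system that supports only two levels would have to reduce the colon-separated enumeration. The Python sketch below shows one possible reduction, keeping the highest level as "volume" and the remaining levels together as "issue"; this rule and the function names are assumptions for illustration.

```python
def enumeration_levels(subfield_l):
    """Split a SICI-style enumeration such as '24:B:9' into its levels."""
    return subfield_l.split(":") if subfield_l else []


def two_level(subfield_l):
    """Reduce to (volume, issue): highest level, then all remaining levels."""
    levels = enumeration_levels(subfield_l)
    volume = levels[0] if levels else None
    issue = ":".join(levels[1:]) or None
    return volume, issue


print(two_level("24:B:9"))
print(two_level("96:4"))
print(two_level("279"))
```

The last call returns the single level as the volume. For the Pacific rail news example the designation on the piece is "No.", so a single level by itself does not tell the receiving system whether it is a volume or an issue number.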

4.3. Add One Subfield for Enumeration and First Page

A single subfield, $l, could carry the Enumeration and First Page data elements formatted as they are in the SICI.

773 0# $7 nnas $t Going Places. $g (July/Aug. 1984), p. 24-33 $l <24
773 0# $7 nnas $t California journal. $g Vol. 24, pt. B no. 9 (Sept. 1993), p. 235-48 $l 24:B:9<235
773 0# $7 nnas $t Metro. $g Vol. 96, no. 4 (May 2000), p. 23-24, 27 $l 96:4<23
773 0# $t Pacific rail news. $g No. 279 (Feb. 1987) p. GM5-GM6 $l 279<GM5
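
Reading the combined subfield requires one split on the "<" delimiter, as this Python sketch shows. The function name is hypothetical; the sketch assumes that "<" is always present, as it is in all four examples, including the case with no enumeration.

```python
def split_enumeration_and_page(subfield_l):
    """Separate 'enumeration<first page' into (levels, first_page)."""
    enumeration, delimiter, first_page = subfield_l.partition("<")
    if not delimiter:
        raise ValueError("expected '<' before the first page")
    levels = enumeration.split(":") if enumeration else []
    return levels, first_page


for value in ("<24", "24:B:9<235", "96:4<23", "279<GM5"):
    print(value, split_enumeration_and_page(value))
```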

4.4. Require Use of the SICI

The SICI carries all of the needed data elements and can be entered into the 024 field. Subfield $g in the linking fields would continue to be used for display where a user-display function is needed. This alternative is clearly the one that requires the least format change. However, many resources do not have SICIs when published. A SICI assigned by the publisher is reliable; one constructed later from available bibliographic information may not be. A constructed SICI may also be ambiguous, because SICIs are not always unique and there may be more than one SICI for an article.

4.5. Add a Variable Field that Parses the Data Elements

A variable field modeled after the holdings 853/863 fields (but not as complex) could be added to give levels of enumeration. Both the caption and the enumeration would be recorded in the subfields. To simplify, levels recorded could be limited to three.

Field 363 Article level designation
$a 1st level enumeration
$b 2nd level enumeration
$c 3rd level enumeration
$p Page

Optionally, a subfield could be added for date (for the article itself), but this information would also be included in 008/07-14, so this is probably unnecessary.

Example:
773 0# $7 nnas $t California journal. $g Vol. 24, no. 9 (Sept. 1993), p. 235-48
363 ## $a 24 $b 9 $p 235-248
Encoding the data without the captions is preferable so that systems do not have to parse through the subfield to get the number. (In the URLs that many vendors use in the format of issn/volume/number/pagination, the volume and number are just numerics.) Alternatively, the captions could be included:
363 ## $a v.24 $b no.9 $i199309

Note that chronology may be the only enumeration. In those cases, it would be encoded in the enumeration subfields.
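
The difference between the two encodings can be sketched in Python. The subfield codes are those proposed above for field 363; the function names and the caption-stripping rule are assumptions for illustration, and the rule would not cope with enumeration that is not numeric.

```python
import re


def read_363(field_text):
    """Return (enumeration levels, first page) from the display form of 363."""
    subfields = {
        code: value.strip()
        for code, value in re.findall(r"\$(\w)\s*([^$]*)", field_text)
    }
    levels = [subfields[code] for code in "abc" if code in subfields]
    # $p may hold a page range; the first page is what linking services use.
    first_page = subfields.get("p", "").split("-")[0] or None
    return levels, first_page


def strip_caption(value):
    """Drop any leading non-digit caption, e.g. 'v.24' -> '24' (assumed rule)."""
    return re.sub(r"^\D*", "", value)


print(read_363("363 ## $a 24 $b 9 $p 235-248"))
levels, _ = read_363("363 ## $a v.24 $b no.9 $i199309")
print([strip_caption(level) for level in levels])
```

Without captions the values can be used as they stand; with captions the receiving system needs an extra step of the kind shown in strip_caption.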



Library of Congress Help Desk ( 03/17/2003 )