PROPOSAL NO: 97-8

DATE: May 1, 1997
REVISED:

NAME: Redefinition of subfield $q (File transfer mode) in field 856 of the USMARC formats

SOURCE: OCLC Metadata Workshops; Library of Congress

SUMMARY: This paper proposes redefining subfield $q (File transfer mode) as Electronic format type. The type of file, or MIME type, would then be recorded in subfield $q, replacing the current definition, which requires recording "ASCII" or "binary" to indicate the mode of transfer needed.

KEYWORDS: Field 856 (Bibliographic/Holdings/Classification/Community Information); Electronic Location and Access; Subfield $q, in field 856 [Bibliographic/Holdings/Classification/Community Information]; File transfer mode; Electronic format type

RELATED: 96-1 (January 1996); DP99 (February 1997)

STATUS/COMMENTS:

5/1/97 - Forwarded to USMARC Advisory Group for discussion at the 1997 Annual MARBI meetings.

6/29/97 - Results of USMARC Advisory Group discussion - Approved. An error in the original proposal was corrected to state that subfield $q would remain non-repeatable.

8/21/97 - Result of final LC review - Approved.


PROPOSAL NO. 97-8: Redefinition of Subfield $q (File transfer mode) in field 856 

1.      BACKGROUND

Field 856 (Electronic Location and Access) was initially developed
and approved by the USMARC Advisory Group in January 1993.  At that
time, the Internet Engineering Task Force was finalizing the draft
standard for a locator, the Uniform Resource Locator (URL).  During
discussions of field 856, participants agreed that the field should
enable a system to create a "hot link" to allow for the transfer of
a file, the connection to another host, or the initiation of an
email message through information recorded in the field.  If the
resource described in the record was available by telnet, the
information should enable a connection; if the resource was
available by email, it should enable the initiation of an email
message; if by FTP, it should enable the transfer of a file.  One
piece of information that was deemed by participants to be required
for FTP was whether the file is transferred as ASCII or binary. 
Thus subfield $q was defined as File transfer mode.

Proposal No. 96-1 (Changes to Field 856 (Electronic Location and
Access) in the USMARC Formats) was presented to the USMARC Advisory
Group for discussion at the January 1996 MARBI meetings.  One
proposed change was to broaden the definition of subfield $q to
include information on file format type, since the subfield had not
been widely used and file format is related to file transfer mode.
OCLC confirmed that the subfield had been rarely or never used in
the INTERCAT database.  However, the proposal was not
approved for the following reasons:  1) concern was expressed that
the need for file format to be explicit may be a temporary
situation, and that in the future files may become more self-
defining; 2) it was suggested that it would be better to wait and
see if this change is still needed in the future, since no specific
need had been demonstrated.


2.      DISCUSSION

In the past few years, the availability of all types of resources
over the Internet has exploded.  Now, the World Wide Web, which was
only under development when enhancements were made to MARC to
accommodate description of Internet resources, has allowed for the
integration of multimedia resources.  Software that is necessary
for display of digitized images or playing of digital audio files
is activated depending upon the file format.  Often the file
extension indicates the type of file and determines whether it is
transferred in binary or ASCII mode (ASCII is the default; all
other types of files are transferred using binary).  The
specification of whether a file is transferred as ASCII or binary
was not included in the URL standard; such information may be
assumed from the file extension.
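
The inference described above might be sketched as follows.  This is
an illustration only: the function name, the lookup table (a few
pairs drawn from Attachment A), and the rule that unknown extensions
fall back to ASCII are assumptions made for the example, not part of
the URL standard or of USMARC.

```python
import os

# A few extension-to-media-type pairs, drawn from Attachment A.
EXTENSION_TO_TYPE = {
    "txt": "text/plain",
    "html": "text/html",
    "sgml": "text/x-sgml",
    "pdf": "application/pdf",
    "ps": "application/postscript",
    "zip": "application/zip",
    "gif": "image/gif",
    "jpg": "image/jpeg",
    "mpg": "video/mpeg",
}


def transfer_mode(filename):
    """Guess the FTP transfer mode ("ASCII" or "binary") from a file name."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    media_type = EXTENSION_TO_TYPE.get(extension)
    if media_type is None or media_type.startswith("text/"):
        # Assumption: text types and unknown extensions use the ASCII default.
        return "ASCII"
    return "binary"
```

Under these assumptions, transfer_mode("vol1no1.pdf") yields "binary"
and transfer_mode("readme.txt") yields "ASCII".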

In creating MARC records for Internet resources, catalogers have
been confused about where to include information about file format. 
Field 516 (Type of file) is a note field containing nature and
scope information about the file described.  In some cases this
information has been combined in the field with file format (e.g.
"Electronic journal in ASCII format").  In other cases, field 538
(System Details Note) has been used, since requirements for
processing the file are dependent upon the type of compression used
or file format type.  

File format is a data element included in the Dublin Core, a list
of core data elements needed for Internet resource discovery and
retrieval.  This list was developed by a wide range of participants
at four different workshops convened between March 1995 and March
1997.  It is also an element in the Government Information Locator
Service (GILS) profile as Available Linkage Type, a subelement of
Available Linkage (which maps to field 856).  In the mapping of the
Dublin Core elements to MARC, field 538 was used for Format (see
Discussion Paper No. 99: Metadata, Dublin Core and USMARC: a
review of current efforts).  However, this mapping is not entirely
adequate, since field 538 is a note that can contain information
other than file format, and since file format has also been
recorded in other MARC fields.  A revised version of the mapping,
which looked at equivalent data elements in GILS and included them
in the Dublin Core mapping, was recently made available and maps
the element to field 856$q, for lack of a better equivalent data
element (http://www.loc.gov/marc/dccross.html).

If a subfield were defined in field 856 for file format, then the
information could be given at the level of the location, rather
than for the intellectual work as a whole.  Using field 538 does
not relate the file format type to the location. In recent
discussions of whether separate records need to be created for
different file formats, the majority of respondents have endorsed
using one record for the intellectual work, with repeating 856
fields for the different file formats.  Recording such information
within field 856 would allow for the file format to be associated
with a particular file at a particular host.  However, other note
fields would still be available for recording file format if this
were desirable.  The term "electronic format" is suggested, since
the term "file" might not be appropriate in all situations.
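
As a rough illustration of associating a format with each location, a
system might hold one entry per 856 field and choose among them by
format.  The sketch below assumes the format is carried in subfield
$q as proposed; the function name and the URLs are hypothetical
placeholders.

```python
# Each dict stands for one 856 field of a single record.
# The URLs are hypothetical placeholders, not real locations.
locations = [
    {"u": "http://www.example.org/journal/issue1.html", "q": "text/html"},
    {"u": "http://www.example.org/journal/issue1.pdf", "q": "application/pdf"},
    {"u": "http://www.example.org/journal/issue1.ps", "q": "application/postscript"},
]


def find_location(fields, wanted_format):
    """Return the $u of the first 856 field whose $q matches, else None."""
    for field in fields:
        if field.get("q", "").lower() == wanted_format.lower():
            return field.get("u")
    return None
```

Because each format sits in its own 856 field, the lookup never has
to guess which URL a given format belongs to.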

Electronic format type is often referred to as "Internet Media
Type" (IMT), a new name for "MIME type".  An Internet Request for
Comments (RFC2046), entitled "Multipurpose Internet Mail Extensions
(MIME) Part Two: Media Types" and replacing RFC1521, "MIME
(Multipurpose Internet Mail Extensions)", defines the general
structure of the MIME media typing system and defines an initial
set of media types.  It includes content types and subtypes and
uses the Internet Assigned Numbers Authority (IANA) as a central
registry for specific values. Another document, RFC2048, specifies
IANA registration procedures for media types.

The National Digital Library Federation's Making of America II
Project formed a task group to look at architecture issues in its
digitization project.  The group identified a need to indicate
electronic format type in the metadata provided for complex digital
objects.  In addition, when Uniform Resource Names (URNs) rather
than URLs are in more widespread use, the format type will no
longer be implicit in the file name recorded as part of the URL.

If subfield $q were redefined as Electronic format type, it may be
desirable to record the data in a standard form, using those file
format types registered with IANA.  However, the data should not be
restricted to this form, since neither the Dublin Core element set
nor the GILS profile requires standardized values; both allow free
text.  Subfield $q should not be repeatable, since it is difficult
to envision a situation where all the other information in the
field (especially the URL) would apply to different electronic
file formats.  If more detailed information
is needed to accommodate multiple formats, this information would
be given in subfield $z (such as electronic journals available by
email subscription that make more than one format available). 
Subfield $q might be used by a system as a clue to how it deals
with the object, and it would only be confusing to repeat the
subfield.  Thus, repetition of the electronic format information
will require a repeatable 856 field.

        Example:
        130 0#$aEmerging infectious diseases (Online)
        245 00$aEmerging infectious diseases$b[computer file]
        260 ##$aAtlanta, GA$bNational Center for Infectious
        Diseases$bCenters for Disease Control and
        Prevention,$c[1995-
        516 8#$aASCII, Acrobat, and PostScript file formats
        530 ##$aOnline version of: Emerging infectious diseases
        (Print).
        776 1#$tEmerging infectious diseases (print)$x1080-
        6040$w(DLC) 96648093 $2 (OCoLC) 31848353
        856 00$umailto:lists@list.cdc.gov$isubscribe$fEID-*$zInclude
        desired file format following the hyphen in the filename:
        EID-ASCII, EID-PDF, or EID-PS
        856 10$aftp.cdc.gov$dpub/EID$lanonymous$zEach issue is in a
        separate subdirectory (e.g. vol1no1).  There are additional
        subdirectories for each file format
        856 40$uhttp://www.cdc.gov/ncidod/EID/eid.htm$qtext/html
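
A system using subfield $q as a clue might first check that the
subfield occurs at most once in a given 856 field.  The following is
a minimal sketch for the display form used in the example above; the
function names are hypothetical, and it assumes that "$" appears only
as a subfield delimiter, never inside the data.

```python
def parse_field(line):
    """Split a display-form field such as '856 40$u...$q...' into its parts."""
    tag, rest = line.split(" ", 1)
    indicators, _, data = rest.partition("$")
    # Each chunk begins with its one-character subfield code.
    subfields = [(chunk[0], chunk[1:]) for chunk in data.split("$") if chunk]
    return tag, indicators, subfields


def format_type(line):
    """Return the single $q value of an 856 field, or None if absent."""
    tag, _, subfields = parse_field(line)
    if tag != "856":
        raise ValueError("not an 856 field")
    values = [value for code, value in subfields if code == "q"]
    if len(values) > 1:
        raise ValueError("subfield $q is not repeatable")
    return values[0] if values else None
```

Applied to the last 856 field of the example, format_type returns
"text/html"; for the FTP field, which has no $q, it returns None.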


3.      PROPOSED CHANGES

The following is presented for consideration:

        *       In the USMARC Bibliographic/Holdings/Classification/
                Community Information Formats, redefine subfield $q (File
                transfer mode) as Electronic format type and keep it
                non-repeatable.

------------------------------------------------------------------
                           ATTACHMENT A
                  Examples of Internet Media Types

Following are examples of Internet Media Types (MIME) and their
subtypes that are registered with IANA (this is not a comprehensive
list).  They would be expressed as type/subtype, e.g.
application/msword; text/html.  For additional information see:
ftp://ftp.isi.edu/in-notes/iana/assignments/media-types

Types/Subtypes:                                 File extension (where available)

application/oda                                 oda
application/pdf                                 pdf
application/postscript                          ai eps ps       
application/octet-stream                        bin
application/x-powerpoint                        ppt
application/wordperfect5.1                      wp
application/zip                                 zip             

audio/basic                                     au snd          
audio/x-aiff                                    aif aiff aifc
audio/x-wav                                     wav             

image/gif                                       gif             
image/jpeg                                      jpeg jpg jpe jif
image/tiff                                      tiff tif        
image/x-portable-bitmap                         pbm             
image/x-cmu-raster                              ras

message/http
message/rfc822
message/news

model/iges
model/vrml

multipart/encrypted
multipart/mixed

text/html                                       html
text/plain                                      txt
text/x-sgml                                     sgml sgm

video/mpeg                                      mpeg mpg mpe    
video/quicktime                                 qt mov          


Library of Congress
Library of Congress Help Desk (09/02/98)