DISCUSSION PAPER NO. 97

DATE: May 6, 1996
REVISED:

NAME: Coding Digital Items in Leader/06 (Type of Record) in the USMARC Bibliographic Format

SOURCE: Library of Congress

SUMMARY: This paper explores issues concerning coding the Leader/06 when the item being cataloged is in digital form. It reviews the use of the code in Leader/06 in library systems and its impact since the completion of format integration. A revised definition for code "m" (Computer file) is proposed to allow for more flexibility in the creation and coding of records for digital items. Issues concerning digital reproductions are raised.

KEYWORDS: Leader/06 (Bibliographic); Type of record; Digital reproductions; Computer files

RELATED: DP92 (January 1996); 95-9 (June 1995)

STATUS/COMMENTS:

5/6/96 - Forwarded to USMARC Advisory Group for discussion at the July 1996 MARBI meetings.

7/6/96 - Results of USMARC Advisory Group discussion - Early in the discussion it was decided to focus on digital materials, not on other combinations of materials for which the distinction between content vs. carrier is relevant. The consensus was that it was desirable to change the definition of code m so that one does not have to code everything digital that way, although clear guidelines are needed. It was requested that a proposal be written for the Midwinter MARBI meetings with a redefinition of code "m" with the coding of Leader/06 for digital items dependent upon the content of the item, rather than how it is represented. Two options should be included: 1) a narrow definition that includes only executables; and 2) a broader definition that includes executables, data sets, and raw data that is not numeric. It was also recommended that the definition of code o (Kit) be clarified and an order of precedence established for deciding how to code.


DISCUSSION PAPER NO. 97: Coding Digital Items in 
Leader/06 (Type of Record) 

1.    BACKGROUND

      With the completion of the last phase of Format Integration in
early 1996, MARC bibliographic records may contain coding for more
than one set of characteristics in the new field 006 (Fixed-Length
Data Elements--Additional Material Characteristics).  Leader/06
(Type of record) contains a code that is used to determine what
type of 008 (Fixed-Length Data Elements) is included in the record;
the 008 character positions vary in 008/18-34 depending upon the
type of material as coded in the Leader.  Field 006 includes
applicable codes that would otherwise be coded in 008/18-34, so
that information may be given for additional aspects of the
item.  A choice must be made as to which form of
material the field 008 should be coded for.  In terms of
description, the decision as to which form is primary and which is
secondary does not have much impact, since all characteristics can
also be given in the record.  However, the Leader/06 code is used
for many purposes, particularly for retrieval of records.  Format
Integration opens up the opportunity to supply more information
about the item than in the past, but it also brings up many
questions about how to apply this new flexibility.
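
The relationship between Leader/06 and field 008 described above can
be sketched in a few lines of Python.  This is only an illustration:
the function names are hypothetical, the two leaders are invented,
and the table lists just a few codes with the 008 configuration each
is assumed to select (language material is treated as a book and
serials are ignored).

```python
# Partial, illustrative mapping from Leader/06 codes to the 008
# configuration they select. This table is an assumption made for the
# sketch; it is not a complete or authoritative list.
CONFIG_BY_TYPE = {
    "a": "Books",             # language material (serials ignored here)
    "e": "Maps",              # cartographic material
    "m": "Computer files",    # computer file
    "o": "Visual materials",  # kit
}


def type_of_record(leader: str) -> str:
    """Return the Leader/06 code from a 24-character MARC leader."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters long")
    return leader[6]


def config_for_008(leader: str) -> str:
    """Name the 008/18-34 configuration selected by Leader/06."""
    return CONFIG_BY_TYPE.get(type_of_record(leader), "unknown")


# Hypothetical leaders: only position 06 matters for this sketch.
map_leader = "00000nem a2200000 a 4500"
file_leader = "00000nmm a2200000 a 4500"

print(config_for_008(map_leader))
print(config_for_008(file_leader))
```

Because a single character selects the 008 configuration, the choice
of which aspect is primary carries through to every system function
that reads that position.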
      
      In our current environment, distinctions between types of
material have become blurred.  With the advent of the personal
computer and the growth of the Internet it becomes questionable
whether categorizing all digital material as a computer file is
useful for retrieval and manipulation of bibliographic records.  If
all digital material were coded as a computer file, the record for
a computerized version of, for example, an original photograph
would be coded differently from the record for the original (if
separate records are created).  This may cause problems for
retrieval, particularly in systems that separate records by form of
material.  Also, because of economic considerations, many users are
adding information about the digital item to the MARC record for
the original, rather than creating a separate record.

      The coding in Leader/06 was discussed on two previous
occasions at meetings of the USMARC Advisory Group.  Proposal No.
95-9 (Encoding of Digital Maps in the USMARC Bibliographic Format)
was considered by the USMARC Advisory Group in June 1995.  It
proposed renaming code "e" in Leader/06 from "Printed map" to
"Cartographic material" so that all maps, whether digital or print,
could be coded the same (there is also a code for "manuscript
map").  Because of the increasing number of digital map images
becoming available (resulting partly from digital library projects
and the Content Standards for Geospatial Metadata), this change was
considered necessary for the map community.  In many cases the
bibliographic record for the paper copy will contain information
about the location of the digital image.  This paper brought up the
issue of coding for content rather than for physical carrier. 
Although the portion of the proposal concerning Leader/06 was
approved, it was suggested that a broader discussion paper be
presented.

      Discussion Paper No. 92 was presented to the USMARC Advisory
Group in January 1996.  It explored changing the definition of code
"m" in Leader/06 so that it is used only for executable software. 
Questions were raised concerning the use of the Leader/06 after
format integration with the availability of 006.  Participants
listed the many uses made of the code in Leader/06 by library
systems.  There was general agreement that, in cases where the
content of the electronic material is clear, identifying the
primary record type in the Leader/06 by its content rather than
carrier better serves users.  These cases include electronic text,
music CDs, digital maps, digital photographs, etc.  Participants
felt that to define code "m" as only executable software was too
restrictive for various reasons: the growing existence of hybrids
that include pictures, graphics, text, and software; files that
don't fit into a category, e.g. survey data; and the fact that
defining "m" only as executable software would not allow input of
an 006 for computer file characteristics for electronic text, since
its secondary characteristic is not an executable.  The group
thought that it was
likely that each constituency would need to issue guidelines.
Another discussion paper was requested for the USMARC Advisory
Group meeting in July.


2.    LEADER/06

Systems use the Leader/06 for various purposes:
      -     to separate databases based on form of material
      -     to determine workform displays for keying the 008
      -     to sort records
      -     to validate correct use of fixed fields
      -     to support boolean searching
      -     to match records for duplicate detection
      -     to display labels that identify fields
      -     to select subsets for distributed products
      -     to generate icons that show format when searching
            multiple databases

Because of the many uses of this coded element, the decision as to
which characteristic to consider primary has great impact for
retrieval and manipulation of the record.  The 006 can give the
additional descriptive information for the secondary form of
material, but most systems do not currently use it for retrieval in
the same way as they use the 008.
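
The effect of separating databases on Leader/06 can be sketched as
follows.  The records are invented and reduced to a title and a
Leader/06 code, and the function name is hypothetical; code "k"
(two-dimensional nonprojectable graphic) is assumed here for the
original photograph.

```python
from collections import defaultdict

# Hypothetical records, reduced to a title and a Leader/06 code.
records = [
    {"title": "Portrait (original photograph)", "leader06": "k"},
    {"title": "Portrait (digitized copy)", "leader06": "m"},
    {"title": "Survey data set", "leader06": "m"},
]


def partition_by_type(recs):
    """Group record titles into separate 'databases' keyed on Leader/06."""
    databases = defaultdict(list)
    for rec in recs:
        databases[rec["leader06"]].append(rec["title"])
    return dict(databases)


databases = partition_by_type(records)

# The original and its digitized copy land in different partitions,
# so a search limited to one partition finds only one of them.
for code, titles in sorted(databases.items()):
    print(code, titles)
```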

The current definition of Leader/06 in the USMARC Format for
Bibliographic Data, code "m" for Computer file is:

      m - Computer file
      Code m indicates that the content of the record is for a body
      of information encoded in a manner which allows it to be
      processed by a computer.  The information in the computer file
      may be numeric or textual data, computer software, or a
      combination of these types.  Although a file may be stored on
      a variety of media (such as magnetic tape or disk, punched
      cards, or optical character recognition font documents), the
      file itself is independent of the medium on which it is
      stored. 

This definition implies that any digitized item needs to be
identified in the Leader/06 as a Computer file. It also implies
that a separate record would need to be made for a digitized
reproduction, since the record for the original would be coded
according to the original carrier/content.  For systems that
rigidly separate records by Leader/06 values, this could be a
problem, since the user may not be able to bring together the
record for the original with the record for the reproduction.
Opening up this definition so that the institution is not mandated
to categorize all digitized items as computer files allows for more
flexibility.



3.    TYPES OF DIGITIZED ITEMS
  
For purposes of record creation, digitized items may fall into
several different categories:

      Item exists only in digitized form.  This would apply to
      computer software, databases, Internet resources, CD-ROMs
      containing software, images, or text, a journal issued only
      electronically, etc.
   
      Item issued simultaneously in more than one form, one of which
      is digital.  This would mainly apply to textual items that
      might be published in printed form and in digital form at the
      same time.

      Digitized reproductions.  This would apply to items that were
      digitized from the original and are intended to substitute for
      it.  The content of the item is essentially the same, although
      there may be some differences because of the technology of
      digitizing.

      A portion of an item is digitized.  An example might be an
      item that is in print form, but a portion of it, such as its
      table of contents, has been made available digitally.

      A digitized collection that exists only digitally.  Specific
      items might be digitized and put together as a collection with
      some unifying elements.  In this case the items do not exist
      together as a collection in their original form, but only in
      the digital form.

With the definition of Field 856 (Electronic Location and Access),
a bibliographic record can be created with a link to the electronic
location of the item. Subfield $3 (Materials specified) allows for
electronic location information to be given for a subset of the
item in the record.  Electronic location and access information was
assigned to a field in the holdings block because it was considered
equivalent to location information in field 852 (Location) (but for
electronic location).  Thus, as with field 852, it could be
embedded in the bibliographic record to give copy specific
information or theoretically communicated in a separate holdings
record.  So far it appears that most uses of field 856 have been in
the bibliographic record to point to the electronic location.  

Technological advances have resulted in numerous projects to
digitize existing material.  Librarians want to provide descriptive
information and access to these materials through catalog records. 
In special collections, particularly in visual materials,
institutions have often chosen to include field 856 in the record
for the original to give information about the digitized item.  In
these cases, additional descriptive information about the item in
its digitized form has not been needed, but only information about
the location and access to it.  This technique has been attractive
because of the retrieval problems when the digitized item is
cataloged separately as a computer file, and the economic
considerations of creating a separate record.  It allows for the
focus of the record to be on the content of the item, rather than
its carrier.

4.    CODING FOR CONTENT

      Coding for the content of the item for digitized materials
would be consistent with the method used for handling microforms. 
The USMARC bibliographic format in Leader/06 says the following:


      "Microforms, whether original or reproductions, are not
      identified by a distinctive Type of record code.  The type of
      material characteristics described by the codes take
      precedence over the microform characteristics of the item."

There is not a separate Leader/06 code for microform, although
there is one for computer file.

      Handling digitized items in Leader/06 as language,
cartographic, music, etc., would keep records for digital
reproductions together with those for the originals and would allow
flexibility in record creation (i.e., using one record and adding a
field 856 with location information for the digital reproduction,
or creating a separate record).  The statement above about the
treatment of
microforms could be revised to include digitized materials.

      Additional information about the computer file aspects can be
given in an 006 for computer file fixed field data and in 007 for
physical description.  A general material designator (GMD) would be
given in 245$h to indicate that the physical format is computer
file; it is not necessary for the 008 to agree with the GMD, so the
008 for the content of the item would be given.
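
A record coded for content in this way might be sketched as below.
The dictionary layout and function name are hypothetical, field 006
is simplified to its first position followed by blanks, and field 007
is reduced to its first position only.

```python
def describe_digital_item(content_code: str, title: str) -> dict:
    """Build a simplified record coded for content, with the computer
    file aspects carried in 006, 007, and the GMD (245 $h)."""
    return {
        "leader06": content_code,   # primary aspect: the content
        "006": "m" + " " * 17,      # 006/00 = m (computer file); rest blank
        "007": "c",                 # 007/00 = c (computer file); rest omitted
        "245": {"a": title, "h": "[computer file]"},
    }


# Hypothetical digital map, coded "e" for its cartographic content.
record = describe_digital_item("e", "Hypothetical digital map")
print(record["leader06"], repr(record["006"]), record["245"]["h"])
```

The Leader/06 and the GMD deliberately disagree in this sketch: the
Leader reflects the content, while 006, 007, and 245 $h carry the
computer file aspects.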

5.    REVISED DEFINITION OF CODE "M"

In order to allow for records for digital items to be coded for
their content, or intent, rather than as a computer file, the
following definition might be considered:

      m - Computer file
      Code m indicates that the content of the record is for a body
      of information encoded in a manner which allows it to be
      processed by a computer.  Code m is used in Leader/06 when the
      computer file characteristics are the primary aspect of the
      item.  The information in the computer file may be numeric
      data, computer software, a combination of these types, or a
      mixture of various types of computer files, none of which
      predominates.  Although a file may be stored on a variety of
      media, the file itself is independent of the medium on which
      it is stored. In case of doubt, consider the item a computer
      file.

This definition does not mandate whether or not to use a separate
record for the digitized item, but leaves it up to the cataloger. 
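
One possible reading of the proposed definition, expressed as a
decision procedure, is sketched below.  The content labels, the
mapping to codes "a" and "e", and the function name are assumptions
made for illustration; the definition itself leaves the judgment of
what is primary to the cataloger.

```python
from typing import Optional

# Content kinds that the proposed definition assigns to code "m".
COMPUTER_FILE_KINDS = {"numeric data", "software"}

# Hypothetical mapping from other content kinds to Leader/06 codes.
CONTENT_CODES = {"text": "a", "map": "e"}


def choose_leader06(kinds, predominant: Optional[str] = None) -> str:
    """Pick a Leader/06 code for a digital item from its content kinds."""
    kinds = set(kinds)
    # Numeric data, software, or a combination of these: code m.
    if kinds and kinds <= COMPUTER_FILE_KINDS:
        return "m"
    # A single kind of content is treated as the predominant one.
    if predominant is None and len(kinds) == 1:
        predominant = next(iter(kinds))
    # A mixture with nothing predominant, or any doubt, falls back to m.
    return CONTENT_CODES.get(predominant, "m")


print(choose_leader06({"map"}))
print(choose_leader06({"text", "map", "software"}))
print(choose_leader06({"text", "software"}, predominant="text"))
```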

      After format integration, since all variable fields are
available for all types of material, the Leader/06 code no longer
determines field validity.  Once the electronic aspects are moved
from the Leader/06, then the format issues can be divorced from the
cataloging rules.  Since field 006 can give characteristics of a
second form of material, the choice of code in Leader/06 is not
dependent upon the choice of AACR2 chapter used.  The cataloger
still needs to choose which chapter of AACR2 is appropriate for
description, which will determine, among other things, which fields
are needed in the record.  The cataloger is no longer constrained
by the format, since any USMARC defined fields will be valid.


6.    QUESTIONS

Consideration needs to be given to the following questions:

1.  Of the types of digitized items in section 3, how should each
be encoded in Leader/06?

2.  For digital reproductions (three categories: digitized
reproductions, a portion is digitized, a collection of digitized
items that only exists digitally), how should each be represented
in the bibliographic record? In separate records, in the record for
the original, or should it be a local decision?  If in the
original, what fields should be added in addition to field 856
(e.g. 530, 533)?  Should 006 for computer files be required to be
added to the record if the record for the original is used?

3.  How important is it that the choice of Leader/06 be mandated,
or could users emphasize the aspect they want to depending upon
individual needs? Should there be options in determining the code
in Leader/06?

4.  Is it practical to consider the solution of using a cluster of
holdings fields for information about the digital reproduction? 
This would include field 856 (a holdings field), possibly field 007
and/or field 843 (the same as 533 in bibliographic), and any other
appropriate holdings fields (e.g. for serials 853-868 for holdings
data).

5.  Does the revised definition give users the flexibility to
decide whether or not to consider the digitized item under code
"m"?  Is it clear how to code?

6.  In terms of selecting the Leader/06 code, is further
clarification needed for nontextual materials having two or more
attributes when one is not computer file (e.g. music for "America,
the Beautiful" on a wall chart)?
  
7.  Do the GMD and the existence of a computer file 006 suffice
to determine the actual physical form of the item?  For
instance, if the Leader/06 were coded "e" for cartographic material
and there is a field 006 for computer file, could one determine
whether this record represents a paper map with accompanying
computer disk or a computer file that displays maps? Note that the
GMD is not a required data element.  Does an additional data
element need to be defined (perhaps in the Leader) for carrier?  If
an element for carrier is defined, might it be used instead of
field 006 if there is no information in 006 that needs to be
conveyed?  Note that the only element in the computer file 008 (and
hence in 006) that might be useful is 008/26 (Type of computer
file), which may be redundant with the Leader/06 if the record is
coded for content.  There has also been some discussion about no
longer using GMDs, which may make a new code for carrier necessary.

8.  Should we define a new value in the Leader/06 for entities that
contain multiple modes of expression where no particular one
predominates?  If so, how would we distinguish kits, mixed
material, and this new value?  If so, how would its 008 be defined?
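
The ambiguity raised in question 7 can be illustrated with a small
sketch.  The function name and the carrier values "paper" and
"digital" are hypothetical; no carrier element or values are proposed
in this paper.

```python
def coded_signature(leader06, field_006_types, carrier=None):
    """Reduce a record to the coded elements discussed in question 7."""
    return (leader06, tuple(sorted(field_006_types)), carrier)


# Two different hypothetical items, both coded "e" for cartographic
# content with a computer file 006 and no GMD.
paper_map_with_disk = coded_signature("e", ["m"])
digital_map_file = coded_signature("e", ["m"])
ambiguous = paper_map_with_disk == digital_map_file

# With a hypothetical carrier element the two can be told apart.
resolved = (
    coded_signature("e", ["m"], carrier="paper")
    != coded_signature("e", ["m"], carrier="digital")
)

print(ambiguous, resolved)
```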


Library of Congress
Library of Congress Help Desk (09/03/98)