PROPOSAL NO: 97-3

DATE: December 15, 1996
REVISED:

NAME: Redefinition of Code "m" (Computer file) in Leader/06 in the USMARC Bibliographic Format

SOURCE: Library of Congress

SUMMARY: This paper explores the redefinition of code "m" in Leader/06 (Type of record) to code electronic items for content rather than carrier. It reviews previous discussions of this issue and considers how the proposed redefinition might affect the use of fields 008 and 006. Two options are proposed: one to redefine code "m" to include software, numeric data or a mixture of forms, and the other to redefine as just computer software. In addition, it proposes opening up the definition of Mixed materials (code "p") to include electronic mixed materials, and to change the name of the Books 008 to "Textual (Nonserial)".

KEYWORDS: Leader/06 (Bibliographic); Type of record; Electronic reproductions; Computer files; Mixed materials

RELATED: 97-7 (February 1997); DP97 (July 1996); DP92 (January 1996); 95-9 (June 1995)

STATUS/COMMENTS:

12/15/96 - Forwarded to USMARC Advisory Group for discussion at the 1997 Midwinter MARBI meetings.

2/20/97 - Results of USMARC Advisory Group discussion - Deferred. Some participants felt that the proposal was not well enough developed to move forward. LC should come back with another proposal that addresses OCLC's concerns: 1) that the choice in Leader/06 should be clear; 2) that guidelines need to be consistently applied from one environment to the next; and 3) that there be somewhere in the record to determine whether the record describes an electronic or nonelectronic version. The latter requirement needs to be met in a mandatory field, not an optional field such as 006 or 007. LC should come back with another proposal for the summer meeting that reflects these concerns and that includes examples for all types of materials.

2/26/97 - Results of final LC review - Agreed with the MARBI decision.


PROPOSAL NO. 97-3: Redefinition of Code "m" (Computer file) in Leader/06

1.      BACKGROUND
With the completion of the last phase of Format Integration in
early 1996, MARC bibliographic records may contain coding for more
than one set of characteristics in the new field 006 (Fixed-Length
Data Elements--Additional Material Characteristics).  Leader/06
(Type of record) contains a code that is used to determine what
type of 008 (Fixed-Length Data Elements) is included in the record;
the 008 character positions vary in 008/18-34 depending upon the
type of material as coded in the Leader.  Field 006 includes
applicable codes that would otherwise be coded in 008/18-34, so
that information may be given for additional aspects of the
item.  A choice must be made as to which form of
material the field 008 should be coded for.  In terms of
description, the decision as to which form is primary and which is
secondary does not have much impact, since all characteristics can
also be given in the record.  However, the Leader/06 code is used
for many purposes, particularly for retrieval of records.  Format
Integration opens up the opportunity to supply more information
about the item than in the past, but it also brings up many
questions about how to apply this new flexibility.
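The dependency of the 008 on the Leader can be sketched as a small
lookup.  This is an illustration only, not part of the format: the
function name and the short 008 labels are hypothetical, the table
covers only the Leader/06 codes mentioned in this paper, and the
Leader/07 rule for code a is the one described in section 8.

```python
# Illustrative sketch only: covers just the Leader/06 codes discussed in this
# paper; the function name and the short 008 labels are hypothetical.
CODE_TO_008 = {
    "e": "Maps",
    "k": "Visual materials",
    "m": "Computer files",
    "o": "Visual materials",  # kits use the Visual materials 008
    "p": "Mixed materials",
}


def type_of_008(leader: str) -> str:
    """Return the 008 configuration implied by Leader/06 (and Leader/07)."""
    type_of_record = leader[6]       # Leader/06, Type of record
    bibliographic_level = leader[7]  # Leader/07, Bibliographic level
    if type_of_record == "a":        # language material
        if bibliographic_level == "m":
            return "Books"
        if bibliographic_level == "s":
            return "Serials"
        raise ValueError(
            f"Leader/07 {bibliographic_level!r} is not covered by this sketch"
        )
    if type_of_record not in CODE_TO_008:
        raise ValueError(
            f"Leader/06 {type_of_record!r} is not covered by this sketch"
        )
    return CODE_TO_008[type_of_record]
```

Under the current definition, a record for an electronic text coded
"m" in Leader/06 would select the Computer files 008; coded for
content as "a" with Leader/07 "m", it would select the Books 008.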
        
In our current environment, distinctions between types of material
have become blurred.  With the advent of the personal computer and
the growth of the Internet it becomes questionable whether
categorizing all digital material as a computer file is useful for
retrieval and manipulation of bibliographic records.  If all
digital material were coded as a computer file, the record for a
computerized version of, for example, an original photograph would
be coded differently than the record for the original (if separate
records are created).  This may cause problems for retrieval,
particularly in systems that separate records by form of material. 
Also, because of economic considerations, many users are adding
information about the digital item on the MARC record for the
original, rather than creating a separate record.
The coding in Leader/06 was discussed on three previous occasions
at meetings of the USMARC Advisory Group.  Proposal No. 95-9
(Encoding of Digital Maps in the USMARC Bibliographic Format) was
considered by the USMARC Advisory Group in June 1995.  It proposed
renaming code "e" in Leader/06 from "Printed map" to "Cartographic
material" so that all maps, whether digital or print, could be
coded the same (there is also a code for "manuscript map"). 
Because of the increasing number of digital map images becoming
available (resulting partly from digital library projects and the
Content Standards for Geospatial Metadata), this change was
considered necessary for the map community.  In many cases the
bibliographic record for the paper copy will contain information
about the location of the digital image.  This paper brought up the
issue of coding for content rather than for physical carrier. 
Although the portion of the proposal concerning Leader/06 was
approved, it was suggested that a broader discussion paper be
presented.
Discussion Paper No. 92 was presented to the USMARC Advisory Group
in January 1996.  It explored changing the definition of code "m"
in Leader/06 so that it is used only for executable software. 
There was general agreement that, in cases where the content of the
electronic material is clear, identifying the primary record
type in the Leader/06 by its content rather than carrier better
served users.  These cases include electronic text, music CDs,
digital maps, digital photographs, etc.  Participants felt that to
define code "m" as only executable software was too restrictive for
various reasons: the growing existence of hybrids which include
pictures, graphics, text, software; files that don't fit into a
category, e.g. survey data; and, defining "m" only as executable
software would not allow input of an 006 for computer file
characteristics for electronic text, since its secondary
characteristic is not an executable.  The group thought that it was
likely that each constituency would need to issue guidelines.
Another discussion paper was requested for the USMARC Advisory
Group meeting in July.
Discussion Paper No. 97 was presented in July 1996.  The consensus
of the group was that it was desirable to change the definition of
code m so that one does not have to code everything digital that
way.  It was requested that a proposal be written to include a
redefinition of code m with the coding of Leader/06 for digital
items dependent upon the content of the item, rather than how it is
represented.  It was suggested that two options be presented, one
for code m to include executables, data sets, and raw data and
another more narrow one just for executables.  The group also
recommended that code o (Kit) be considered for multimedia or that
its definition be clarified to distinguish it.

2.      LEADER/06
Earlier discussions detailed the many uses systems make of the
Leader/06.  Some of these are: separating databases based on form
of material; sorting records; matching records for duplicate
detection; selecting subsets for products distributed.  Because of
the many uses of this coded element, the decision as to which
characteristic to consider primary has great impact for retrieval
and manipulation of the record.  The 006 can give the additional
descriptive information for the secondary form of material,
although not all systems are currently using it for retrieval the
same way as 008 is used.
The current definition of Leader/06 in the USMARC Format for
Bibliographic Data, code "m" for Computer file is:
        m - Computer file
        Code m indicates that the content of the record is for a body
        of information encoded in a manner which allows it to be
        processed by a computer.  The information in the computer file
        may be numeric or textual data, computer software, or a
        combination of these types.  Although a file may be stored on
        a variety of media (such as magnetic tape or disk, punched
        cards, or optical character recognition font documents), the
        file itself is independent of the medium on which it is
        stored. 
This definition implies that any electronic item needs to be
identified in the Leader/06 as a Computer file. It also implies
that a separate record would need to be made for an electronic
reproduction, since the record for the original would be coded
according to the original carrier/content.  Opening up this
definition so that the institution is not mandated to categorize
all electronic items as computer files allows for more flexibility
and was considered generally desirable in previous discussions of
this issue.

3.      CODING FOR CONTENT             
  
Technological advances have resulted in numerous projects to
digitize existing material.  Librarians want to provide descriptive
information and access to these materials through catalog records. 
With the definition of Field 856 (Electronic Location and Access),
a bibliographic record can be created with a link to the electronic
location of the item. Subfield $3 (Materials specified) allows for
electronic location information to be given for a subset of the
item in the record.  In special collections, particularly in visual
materials, institutions have often chosen to include field 856 in
the record for the original to give information about the
electronic item.  In these cases, additional descriptive
information about the item in its electronic form has not been
needed, but only information about the location and access to it. 
This technique has been attractive because of the retrieval
problems when the electronic item is cataloged separately as a
computer file, and the economic considerations of creating a
separate record.  It allows for the focus of the record to be on
the content of the item, rather than its carrier.
Coding for the content of the item for electronic materials would
be consistent with the method used for handling microforms.  The
USMARC bibliographic format in Leader/06 says the following:
        "Microforms, whether original or reproductions, are not
        identified by a distinctive Type of record code.  The type of
        material characteristics described by the codes take
        precedence over the microform characteristics of the item."
There is not a separate Leader/06 code for microform, although
there is one for computer file.
By handling electronic items in Leader/06 as language,
cartographic, music, etc., records for digital reproductions would
not be separated from the originals, and there would be flexibility
for record creation (i.e., using one record and adding a field 856
with location information of the digital reproduction or creating
a separate record).  The statement above about the treatment of
microforms could be revised to include electronic materials.
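The single-record technique described above can be sketched as
follows.  This is a minimal illustration under stated assumptions:
the record layout (a dictionary with a leader and a list of fields),
the function name, and the example URL are all hypothetical, and
indicators are omitted.  Only the use of field 856 and its subfield
$3 comes from the text; subfield $u is assumed here as the carrier
of the locator.

```python
from typing import Optional


# Illustrative sketch; the record layout and names are hypothetical.
def add_electronic_location(record: dict, url: str,
                            materials: Optional[str] = None) -> dict:
    """Return a copy of `record` with one more 856 field appended.

    The Leader is left untouched, so the record stays coded for the
    content of the original item.
    """
    subfields = []
    if materials is not None:
        subfields.append(("3", materials))  # $3 Materials specified
    subfields.append(("u", url))            # $u assumed to hold the locator
    updated = dict(record)
    updated["fields"] = list(record.get("fields", [])) + [("856", subfields)]
    return updated
```

Because only a field is added, the record for an original photograph
coded "k" in Leader/06 remains retrievable as a graphic while also
pointing to its digital reproduction.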

4.      REVISED DEFINITION OF CODE "M"
In order to allow for records for digital items to be coded for
their content, or intent, rather than as a computer file, the
following definition might be considered:
        m - Computer file
        Code m indicates that the content of the record is for
        information that is processed by a computer and whose most
        significant aspect does not fall into any other Leader/06
        category, i.e., the computer file characteristics are the most
        significant aspect of the item.  Computer files that fall
        under this category include numeric data, computer software,
        a combination of these types, or a mixture of various
        Leader/06 categories exclusive of categories o (Kit) and p
        (Mixed materials), none of which predominates.  Although a
        file may be stored on a variety of media, the file itself is
        independent of the medium on which it is stored. In case of
        doubt, consider a computer file. 
This definition does not mandate whether or not to use a separate
record for the electronic item, but leaves it up to the cataloger. 
Alternatively, a narrower definition might be considered:
        m - Computer file
        Code m indicates that the content of the record is for one or
        more software programs.  This category includes executable
        software, source code, etc.  Any other type of computer-
        readable file is coded for the most significant form of
        material observable when processed for display, e.g., files
        that are primarily textual or numeric are treated as language
        material.  If there are two or more forms and they are judged
        significant, use code p for mixed materials.  
See below for a discussion of code p.
After format integration, since all variable fields are available
for all types of material, the Leader/06 code no longer determines
field validity.  Once the electronic aspects are moved from the
Leader/06, then the format issues can be divorced from the
cataloging rules.  Since field 006 can give characteristics of a
second form of material, the choice of code in Leader/06 is not
dependent upon the choice of AACR2 chapter used.  The cataloger
still needs to choose which chapter of AACR2 is appropriate for
description, which will determine, among other things, which fields
are needed in the record.  The cataloger is no longer constrained
by the format, since any USMARC defined fields will be valid.

5.      Fields 006 and 007
Additional information about the computer file aspects may be given
in field 006 and/or in field 007.  These may be used by systems for
limiting or sorting records.  Field 006 is used for additional
bibliographic characteristics, while field 007 is used for physical
characteristics (particularly the specific material designation
(SMD)).  A general material designator (GMD) may be given in 245$h
to indicate that the physical format is computer file (although
this information is optional); it is not necessary for the 008 to
agree with the GMD, so the 008 for the content of the item would be
given.
The 008 character positions specific to computer files (and
consequently their 006) contain three defined elements: target
audience (CF008/22, also defined in Books, Music, and Visual
Materials); type of computer file (CF008/26); and government
publication (CF008/28; also defined in Books, Maps, Serials, and
Visual materials).  If the record were coded for content in
Leader/06 and 008, a computer file 006 may be added.  However,
generally only the type of computer file would be useful, and in
many cases it would be redundant with the value in Leader/06.  For
example, if the record were for text and coded "a" in Leader/06,
adding "d" for document in 006/09 (the same as 008/26) would be
redundant.  Thus, there may be no additional information to include
in the 006 (although some systems currently need the first
character position for searching or limiting by material type). 
The physical characteristics, however, might be more useful and may
be included in the computer files 007, such as the specific
material designation (e.g., remote).
Note that if the narrower definition for code m were used (i.e.,
executable software only), an 006 could not be added for many
electronic items, since they may not include executables as a
secondary characteristic.  In this case, an 007 could provide added
information and indicate that the carrier of the item is a computer
file.
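The relationship between 008 and 006 positions, and the redundancy
noted above, can be sketched in a few lines.  This is an
illustration only; the names are hypothetical.  It assumes, as the
text states, that 006/09 carries the same element as 008/26, so that
006/01-17 parallel 008/18-34.  The redundant pairs are the two
marked as such in Attachment A.

```python
# Illustrative sketch; names are hypothetical.  Assumes 006/01-17 parallel
# 008/18-34, consistent with 006/09 being the same element as 008/26.
OFFSET = 17


def pos_006(pos_008: int) -> int:
    """Map a material-specific 008 position (18-34) to its 006 position."""
    if not 18 <= pos_008 <= 34:
        raise ValueError("only 008/18-34 have a counterpart in field 006")
    return pos_008 - OFFSET


# (Leader/06, type of computer file) pairs marked redundant in Attachment A.
REDUNDANT = {("a", "d"), ("k", "c")}


def cf_006_is_redundant(leader06: str, type_of_file: str) -> bool:
    """True when the type of computer file adds nothing beyond Leader/06."""
    return (leader06, type_of_file) in REDUNDANT
```

For a text coded "a" in Leader/06, cf_006_is_redundant("a", "d")
reports that a computer file 006 with "d" in 006/09 repeats what the
Leader already says.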
Attachment A shows how each type of computer file for which there
is a definition in 008/26 would be coded under both options if code
m were redefined to emphasize content rather than carrier.

6.      Multimedia items
The question arose in the previous discussions of this issue as to
how to code multimedia items, which are becoming increasingly
prevalent.  In these cases several kinds of material interact and
it is impossible to determine predominance.  If the narrower
definition of code m is used, a code will need to be found for
these items.  If the broader definition is used, they could be
included in code m, although other options might be considered.  
When Discussion Paper No. 97 was discussed, it was suggested that
code o (Kit) be considered for multimedia items and perhaps the
definition of it could be broadened.  However, since items that use
code o use the 008 for Visual materials, this solution would
require changing the 008 for kit, which may not be desirable.
Broadening the definition of code p (Mixed material) might be
considered as an option.  At a meeting of U.S. and Canadian
archivists in Toronto in October, the archival community expressed
a willingness to broaden the definition of Mixed material to
accommodate mixtures other than those that are archival in nature,
since they preferred using the Leader/08 to bring out archival
material.  The definition of code p is as follows:
        p - Mixed materials
        Code p indicates that the content of the record is for two or
        more types of material that are usually related by virtue of
        their having been accumulated by or about a person or body. 
        No one type of material in the group is emphasized or
        predominates.  The intended primary purpose is other than for
        instructional purposes (i.e., other than the purpose of those
        materials coded as o (Kit)).  This category includes archival
        and manuscript collections of mixed types of materials, such
        as textual materials, photographs, and ephemera.
Note that Proposal No. 97-7 (Coding Leader/06 and Leader/08 for
archival material) considers changing this definition to clarify
its use further for archival material.  It might also be revised so
that it is not limited to "material that are usually related by
virtue of their having been accumulated by or about a person or
body".  The definition could also include multimedia works intended
to be processed by a computer.  
The Mixed Materials 008/18-34 contains only one defined element,
Form of item.  It may not provide particularly useful information, but
could be enhanced if there were additional characteristics of
importance for these types of materials.

7.      Impact of changes on systems
Since some systems physically separate records by the type of 008,
changing the definition of code m will have an impact on systems. 
For digital materials now, the 008 would be computer
files for all but maps, unless a separate record has not been
created and field 856 has been added to the record for the
original.  Even if
separation of record types is not an issue, integrated systems
would have many records with computer files 008s that should have
other 008s. 
In RLIN, textual computer file serials are in a computer files file
with a computer file 008 and a serial 006.  It may be necessary to
move these.  Non-textual serials that would no longer qualify as
computer files may also need to be moved.  
If the narrower definition of code m were chosen, then a computer
file 006 would no longer be appropriate for much electronic
material which would be coded for content, since code m would be
restricted to executable software.  In this case, the computer file
007 would have to be used to code computer file characteristics and
used by the system for searching by material type.  RLIN already
uses the CF 007 as one of the characteristics examined when
qualifying a search by material type, but OCLC and WLN do not
currently use the CF 007 for sorting and limiting searches.  If
this proposal is approved, the latter systems would need to make
system changes if it is deemed desirable to identify this
characteristic in some way for its retrieval potential.
OCLC uses the Leader/06 for duplicate detection (as may other
systems); in previous discussions of this issue the difficulty in
doing duplicate detection on an optional field (006) was noted.  If
this proposal is approved, OCLC would probably need to make system
changes to no longer use Leader/06 for this purpose.
Other system impacts might be illustrated in Attachment A, which
shows how each type of computer file might be coded in Leader/06,
field 008 and 006 under each option.
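The duplicate-detection concern can be illustrated with a
deliberately simplified match key.  This is not any system's actual
algorithm: the record layout, the function name, and the sample
title are hypothetical, and the use of "c" in the first position of
field 007 to mark a computer file is an assumption made for the
sketch.

```python
# Illustrative sketch of a match key; not any system's actual algorithm.
# The record layout (dict with "title", "leader", "f007") is hypothetical.
def match_key(record: dict, use_007: bool = False) -> tuple:
    """Build a simplified key from the title and Leader/06."""
    key = (record["title"].casefold(), record["leader"][6])
    if use_007:
        # Assumption: an 007 beginning with "c" describes a computer file.
        electronic = any(f.startswith("c") for f in record.get("f007", []))
        key += (electronic,)
    return key


# An original photograph, and its digital version coded two ways.
ORIGINAL = {"title": "View of the harbor",
            "leader": "00000nkm  2200000 a 4500", "f007": []}
DIGITAL_AS_FILE = {"title": "View of the harbor",
                   "leader": "00000nmm  2200000 a 4500", "f007": ["cr"]}
DIGITAL_AS_CONTENT = {"title": "View of the harbor",
                      "leader": "00000nkm  2200000 a 4500", "f007": ["cr"]}
```

With Leader/06 alone, the original and the version coded as a
computer file produce different keys, while the version coded for
content produces the same key as the original.  Adding the 007 test
separates them again, which is the kind of system change discussed
above.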

8.      Label for Books 008/006
If this proposal is approved, electronic textual material will fall
under Leader/06 value a, now called "language material".  The
USMARC bibliographic format specifies that, for these materials,
the 008 used is determined by the value in Leader/07 (Bibliographic
level); if a monograph it uses code "m" and the Books 008 and if a
serial it uses code "s" and the Serials 008.  Thus, monographic
electronic textual material will require a Books 008.  It may be
desirable to change the label of the 008 from Books to "Textual
(Nonserial)".  Systems often use the term specified in the USMARC
documentation to show the contents of the field, and a label
"Books" could be very confusing to the user if the item is
electronic text.  Note that this issue also has arisen from
concerns by the archival community; see Proposal No. 97-7 (Coding
Leader/06 and Leader/08 for Archival Material).  It might also be
considered whether the name of Leader/06 code a might be changed
from "Language material" to "Textual material", since
videorecordings and nonmusical sound recordings might also be
considered language material.

9.      Questions for further consideration:
1.  How will code p be distinguished from code o in Leader/06 if
Option 2 or 3 is chosen? Multimedia items previously considered
computer files may be intended for instructional purposes.
2.  Should additional elements be defined for the Mixed Materials
008 to include characteristics of multimedia items?  

10.     PROPOSED CHANGES
The following is presented for consideration:
        *       In the USMARC Bibliographic Format, change the definition
                of code "m" as follows:
                Option 1:
                m - Computer file
                Code m indicates that the content of the record is for
                information that is processed by a computer and whose
                most significant aspect does not fall into any other
                Leader/06 category, i.e., the computer file
                characteristics are the most significant aspect of the
                item.  Computer files that fall under this category
                include numeric data, computer software, a combination of
                these types, or a mixture of various Leader/06 categories
                exclusive of categories o (Kit) and p (Mixed materials),
                none of which predominates.  Although a file may be
                stored on a variety of media, the file itself is
                independent of the medium on which it is stored. In case
                of doubt, consider a computer file. 
                Option 2:
                Change the definition of code m in Leader/06 to restrict
                it to executable software and change the definition of
                code p as follows (note that this definition includes
                changes also proposed in Proposal No. 97-7):
                m - Computer file
                Code m indicates that the content of the record is for
                one or more software programs.  This category includes
                executable software, source code, etc.  Any other type of
                computer-readable file is coded for the most significant
                form of material observable when processed for display,
                e.g., files that are primarily textual or numeric are
                treated as language material.  If there are two or more
                forms and they are judged significant, use code p for
                mixed materials.  
                p - Mixed materials
                Code p indicates that the content of the record is for a
                mixture of components from two or more of the other Type
                of Record categories defined for Leader/06 exclusive of
                category o (Kit), each of which is judged to be
                significant.  This category includes archival fonds and
                manuscript collections of mixed forms of material, such
                as text, photographs, and sound recordings.  It also
                includes computer-readable material such as multimedia
                works that include electronic text, images, sound, etc.
                Option 3:
                Option 2 but include numeric data in the definition of
                code m.
        *       In the USMARC Bibliographic Format, rename the 008 for
                Books to "Textual (Nonserial)".

-------------------------------------------------------------------
                                             ATTACHMENT A
                                       Types of computer files 
This chart shows each type of computer file for which there is a
value in Computer files 008/26 (Type of computer file) and how each
would be coded in Leader/06, 008 and 006 under the two options
detailed in this proposal.  In some cases, coding for 006 will give
no further information than is in the 008, since the only character
position in the CF008 that is different from the other material's
008 is 008/26.  If this proposal is approved, the type would
already be coded in Leader/06 if coding the record for its content
rather than its carrier.  These items for which 006/09 (same as
008/26) information is redundant are identified as such in the 006
column.  In the 006 columns "+ specifics" means that additional
006's may be added for other forms of material.
                       Option 1        Option 2        Option 1       Option 2
008/26 type            Ldr06/008       Ldr06/008       006            006             
a Numeric              m/CF            a               n.a.           m
b Computer             m/CF            m               n.a.           n.a.
  program
c Representational     k/VM            k               m (Redundant)  n.a.
d Document             a/Bk or Ser     a               m (Redundant)  n.a.
e Bibliographic        a               a               m              n.a. or m*
  data
f Font                 m               m               n.a.           n.a.
g Game                 m               p               m + specifics  m
h Sound                i               i               m              n.a. or m*
i Interactive          m               p               specifics      m
  multimedia
j Online system        m               p               n.a.           m
  or service
m Combination          m               p               n.a.           m*
For Option 3, everything is the same as Option 2 except Numeric
data would be coded as m.
*For Option 2, if an executable is part of the resource, then m
could be used in 006. This would have to be determined.
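
The Leader/06 columns of the chart can also be expressed as a lookup
table, which may help when comparing the options.  The values below
are transcribed from the chart; the function and variable names are
hypothetical, and Option 3 is derived from Option 2 as described in
the note above.

```python
# Leader/06 values transcribed from the chart above, keyed by 008/26 type.
# Function and variable names are hypothetical.
LEADER06 = {
    # type: (Option 1, Option 2)
    "a": ("m", "a"),  # Numeric
    "b": ("m", "m"),  # Computer program
    "c": ("k", "k"),  # Representational
    "d": ("a", "a"),  # Document
    "e": ("a", "a"),  # Bibliographic data
    "f": ("m", "m"),  # Font
    "g": ("m", "p"),  # Game
    "h": ("i", "i"),  # Sound
    "i": ("m", "p"),  # Interactive multimedia
    "j": ("m", "p"),  # Online system or service
    "m": ("m", "p"),  # Combination
}


def leader06_for(type_of_file: str, option: int) -> str:
    """Return the Leader/06 code for an 008/26 type under Option 1, 2 or 3."""
    if option not in (1, 2, 3):
        raise ValueError("option must be 1, 2 or 3")
    option1, option2 = LEADER06[type_of_file]
    if option == 1:
        return option1
    if option == 3 and type_of_file == "a":
        return "m"  # Option 3: same as Option 2, but numeric data stays m
    return option2


def types_that_differ(first: int, second: int) -> list:
    """List the 008/26 types whose Leader/06 differs between two options."""
    return [t for t in LEADER06
            if leader06_for(t, first) != leader06_for(t, second)]
```

A call such as types_that_differ(1, 2) lists the types of computer
file whose Leader/06 coding depends on which option is chosen.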


Library of Congress
Library of Congress Help Desk (09/02/98)