MARBI Members:

James Crooks                   RUSA      Univ of California, Irvine
Josephine Crawford (recorder)  ALCTS     Univ of Wisconsin, Madison
Elaine Henjum                  LITA      Florida Ctr for Lib Automation
Diane Hillmann                 LITA      Cornell University
Carol Penka                    RUSA      University of Illinois
Jacqueline Riley (chair)       RUSA      University of Cincinnati
Frank Sadowski                 ALCTS     University of Rochester
Paul Weiss                     ALCTS     University of New Mexico
Robin Wendler                  LITA      Harvard University

MARBI Interns:

Anne-Marie Erickson            RUSA      Chicago Library System
Anne Flint                     ALCTS     Ohio Lib and Info Network

Representatives and Liaisons:

Joe Altimus                    RLG       Research Libraries Group
Karen Anspach                  AVIAC     EOS International, Inc.
John Attig                     OLAC      Pennsylvania State Univ
Sherman Clarke                 VRA       New York University
Betsy Cowart                   WLN       WLN, Inc.
Donna Cranmer                  AVC       Siouxland Libraries
Bonnie Dede                    SAC       University of Michigan
Catherine Gerhart              CC:DA     University of Washington
David Goldberg                 NAL       National Agricultural Libr
Rich Greene                    OCLC      OCLC, Inc.
Rebecca Guenther               LC        Library of Congress
Michael Johnson                MicroLIF  Follett Software Company
Maureen Killeen                A-G       A-G Canada Ltd.
Rhonda Lawrence                AALL      UCLA
Sally McCallum                 LC        Library of Congress
Karen Little                   MLA       University of Louisville
Susan Moore                    MAGERT    University of Northern Iowa
Elizabeth O’Keefe              ARLIS     Pierpont Morgan Library
Marti Scheel                   NLM       National Library of Medicine
Louise Sevold                  CIS       Cuyahoga County Public Library
Margaret Stewart               NLC       National Library of Canada
Rutherford Witthus             SAA       University of Connecticut

Other attendees:

Jim Agenbroad                  Library of Congress
Joan Aliprand                  RLG
Diane Baden                    NELINET
Mathew Beacon                  Yale University
Candy Bogar                    DRA
Jo Calk                        Blackwells
Winnie Chan                    University of Illinois, Urbana-Champaign
Tamar N. Clarke                National Library of Medicine
Karen Coyle                    University of California
Larayne Dallas                 University of Texas at Austin
Carol B. Dundle                Trinity University
Joanna Dyla                    Stanford University
Stuart Ede                     The British Library
John Espley                    VTLS
Emily Fayen                    EOS International
Michael Fox                    Minnesota Historical Society
G. Fuzon                       EOS International
Jeff Goodwin                   EOS International
Anke Gray                      University of Washington
Greta de Groat                 Stanford University
Kay Guiles                     Library of Congress
Stephen Hearn                  University of Minnesota
Elise Hermann                  National Library Authority, Denmark
Kimball Hewett                 Ameritech Library Service
Charles Husbands               Harvard University
William Jones                  New York University
Rhonda Kesselman               Princeton University
Kris Kiesling                  University of Texas at Austin
Curtis Lavery                  RLG
Ming Lu                        OCLC
Rita Lunnon                    Stanford University
Liz McKeen                     National Library of Canada
Michael Mealling               Network Solutions (and member of IETF)
David Miller                   Curry College
Paula Moehle                   University of Georgia
Elizabeth Morgan               Library of Congress
Chris Mueller                  University of New Mexico
Cecilia Preston                Preston & Lynch
Pat Riva                       McGill University
Rebecca Routh                  Northwestern University
Jacque-Lynne Schulman          National Library of Medicine
Ann Sitkin                     Harvard University
Gary Smith                     OCLC, Inc.
Gary Strawn                    Northwestern University
Julie Tammian                  Best-Seller
Gary Thompson                  UCLA
Frank Williams                 Ingram Library Services
Mathew Wise                    New York University
Larry Woods                    University of Iowa
Ruth Wuest                     Endeavor Information Systems
Art Zemon                      DRA
Joe Zeeman                     CGI

******************** Saturday, June 28, 1997 ********************

MARBI chairperson, Jacqueline Riley, opened the first MARBI meeting of the San Francisco conference with a round of introductions. She reviewed adjustments to the agenda, and then launched directly into the work of the conference.

PROPOSAL 97-10: "Use of the Universal Coded Character Set in the MARC records"

Larry Woods, chair of the MARBI Character Set Subcommittee, introduced 97-10.
It is another part of an earlier proposal (96-10) regarding the Universal Character Set (ISO 10646); after approval of the bulk of the recommendations, it was felt that the issue of “ASCII clones” deserved more discussion and analysis. Larry reviewed the original charge and the four working principles:

Principle #1: Provide support for round-trip mapping;
Principle #2: Maintain the same transliteration schemes as far as possible;
Principle #3: Allow for handling of modified letters similar to current practice;
Principle #4: Avoid private use space if possible.

In the course of its work, the Subcommittee found that some issues required placing more importance on one principle than on another.

ASCII clones are characters such as numbers, punctuation marks, and special symbols that are currently shared by more than one USMARC character set. These characters are defined in the Latin, Arabic, Hebrew, and/or Cyrillic character sets and they appear in USMARC records. Round-trip mapping (from USMARC to the Universal Character Set (UCS) and then from UCS back to USMARC) is problematic because of the many-to-one relationship. The Subcommittee looked carefully at three options, and discarded one as not feasible. The remaining two are:

Option #1: Map USMARC ASCII clones to a unified repertoire in the universal set.
Option #2: Precede each ASCII clone by a script flag character defined in private use space.

Larry reported that, due to the programming costs associated with utilizing the UCS private use space option, the Subcommittee is not in favor of Option #2. The fourth working principle of the Subcommittee supports this recommendation. However, Option #1 has one important disadvantage: pure round trip mapping may not be possible.

Joan Aliprand explained exactly what would be lost in round trip mapping. For instance, a Hebrew comma (i.e., the comma of the USMARC Hebrew character set) originating in a USMARC record would translate to the comma character in a UCS record.
If the same record had to be translated back into USMARC, the UCS comma character would be converted to the Latin (ASCII) comma, which is the agreed-upon mapping for the UCS comma. After conversion to UCS, it would be impossible to know that the original USMARC character had been from the Hebrew character set. Joan noted that the 8-bit USMARC hex value would not change, but the Escape sequence originally associated with it would be lost.

Joan has been thinking about RLG's display algorithm for right-to-left scripts in relation to this issue. Because there is no guarantee that what comes to RLG will be exactly correct for the current algorithm, RLG will need to consider changes to it, to position the ASCII equivalents correctly. Joan emphasized that, in theory, this seems feasible but it will have to be tested in a real implementation.

Jacqueline Riley asked for a motion about ASCII clones. Paul Weiss moved in favor of Option 1. There were eight votes in favor and no votes against.

Larry Woods moved on to the other recommendations in 97-10, involving specific rules and/or techniques for implementing the UCS with a view to simplifying a mixed environment lasting some years. The Subcommittee favored not mixing USMARC 8-bit and UCS character values in the same USMARC records. UCS records should utilize 16-bit characters throughout the record. The binary zeros in the first eight bits of a UCS record could be a flag that the record contains UCS instead of 8-bit characters. Defining $d and $e in field 066 to show which UCS character set and repertoire (or subset) is in use would be helpful. The proposal also discussed the UCS method for handling diacritical characters.

John Attig asked for clarification. Why is it necessary for the order of the values to change? He also asked how double diacritics would be handled. Joan Aliprand explained. In USMARC, the diacritic (often a non-spacing character) comes before the base letter.
This approach was based on the printing model, where the print position is not incremented for non-spacing characters but only when the following base letter is printed. The UCS/Unicode model, where the base character comes first, reflects today's computer processing. There was also a question about how non-spacing marks are handled in USMARC in right-to-left scripts (Hebrew and Arabic). Right-to-left scripts are stored in logical order, first to last, and the non-spacing mark precedes the base letter, i.e., the same as for left-to-right scripts.

Others raised the issue of Vietnamese, which has multiple diacritics. How are these diacritics handled now in USMARC? Apparently, OCLC and RLG do it one way, and LC does it another way. Sally McCallum reported that the USMARC principle is: Left to Right and Top to Bottom for left-to-right scripts; Right to Left and Top to Bottom for right-to-left scripts. In terms of how UCS handles double diacritics, a member of the audience suggested that the accent mark closest to the base letter would come first. Joan Aliprand offered to look this up. [will ask her if she found the answer]

Joe Altimus suggested that the recommendations made in this part of the proposal are highly technical and require more investigation. Joan agreed, and pointed out that additional issues are raised in DP #100 on authority records. She feels that people well versed in machine processing should be asked to work on these issues so that the programming and economic ramifications are understood before a decision is made. She suggested recruiting for the Task Group from LITA and AVIAC. Charles Husbands agreed with Joan that a technical committee should look at these design/implementation issues. John Attig pointed out that there has been a longstanding request to define a few additional Greek characters for the Latin USMARC character set. Diane Hillmann reminded all of the need for the section symbol.
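
As an illustration of the ordering difference Joan Aliprand described, the following minimal Python sketch moves non-spacing marks from before their base letter (USMARC order) to after it (UCS/Unicode order). It assumes the characters have already been mapped to UCS code points. The function name is hypothetical, and keeping multiple marks in their original relative order is only an assumption, since the correct order for double diacritics was left open in the discussion.

```python
import unicodedata


def marc_order_to_ucs_order(text: str) -> str:
    """Move each run of non-spacing marks to follow the next base character.

    Assumption: `text` holds UCS code points but still uses the USMARC
    convention of placing non-spacing marks BEFORE their base letter.
    """
    out = []
    pending = []  # marks waiting for the base letter that follows them
    for ch in text:
        if unicodedata.combining(ch):
            pending.append(ch)
        else:
            out.append(ch)
            # Assumption: marks keep their original relative order.
            out.extend(pending)
            pending.clear()
    out.extend(pending)  # marks with no following base are kept as-is
    return "".join(out)


# Combining acute accent (U+0301) stored before "e", as in USMARC order.
reordered = marc_order_to_ucs_order("caf\u0301e")
```

After reordering, standard Unicode normalization (for example `unicodedata.normalize("NFC", reordered)`) can compose the base letter and its mark into a single precomposed character where one exists.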
Joe Zeeman suggested that this technical task force be convened and produce recommendations fairly quickly. Paul Weiss agreed. He moved that we take the other three recommendations of Proposal 97-10, and turn them over to a technical working group with recommendations due in 1998. Membership should include representation from LITA and AVIAC. Robin Wendler provided a second to Paul's motion. There were eight votes in favor and no votes in opposition. Jacquie asked for volunteers to see her during the conference or to send her an email message afterwards. Karen Anspach will take the issue to the Monday AVIAC meeting and try to round up some volunteers.

Larry Woods closed the discussion by reporting that he had been talking with OCLC and RLG representatives. It may be some time before databases are translated to UCS, although vendors think some clients may be translating back and forth before the utilities switch.

ACTIONS TAKEN:
-- Option 1 passed to deal with the ASCII clone issue.
-- Technical working group will be established to deal with the other issues.

PROPOSAL 97-14: "Addition of new characters to existing USMARC sets"

Sally McCallum introduced 97-14 as a fairly straightforward proposal. The Character Set Subcommittee, while working on the UCS mappings, uncovered some discrepancies between the published USMARC Arabic set and some implementations of that set -- notably RLG's implementation, which has the largest database of Arabic vernacular records. RLIN supports three Arabic characters that are not present in the USMARC published document. They are:

1) Arabic Thousands Separator
2) Right-Pointing Double Angle Quotation Mark
3) Left-Pointing Double Angle Quotation Mark

Adding these characters to the published USMARC documentation should not cause problems in existing databases. Mappings to UCS have already been done. Paul Weiss moved that MARBI approve the proposal as written. Another member provided a second.
There were eight votes in favor and no votes against.

The discussion moved on to some related issues. Larry Woods reported that there are several other Arabic characters present in USMARC that are not present in UCS. Therefore, RLG submitted a proposal to the Unicode Technical Committee (UTC) to add them. Joan Aliprand reported that the proposal had been accepted by the UTC for the Unicode Standard, and forwarded to ISO/IEC JTC1/SC2/WG2 as a proposed addition to ISO 10646 (UCS). There are several review levels in the ISO process.

Joan also discussed the Cyrillic underscore. It is an ISO character and is also found in the British Library character set. It was not included in the proposal as it was another ASCII clone. Now that the clone issue is settled, it could be added to the Cyrillic set.

A question was asked about other characters for which there is no UCS equivalent. Two examples are the Short E (Urdu) and the Short U (Uighur), which are both vowels. Bonnie Dede thought that more investigation is needed. It is difficult to find examples since languages which use Arabic script are usually written without vowels. Even if vowel marks appear on the item being cataloged, current cataloging practice is not to transcribe the vowels. If this is true, this may be a non-issue. Joan thinks that the Short U may be available as a pre-composed letter in UCS. She will check into it and report back.

Larry Woods concluded this discussion by saying that the Character Set Subcommittee will be dissolved after a joint meeting with the East Asian Character Subcommittee, now forming under the chairmanship of John Espley.

ACTION: Proposal 97-14 passed with the Cyrillic underscore also added at position 5F in the extended Cyrillic set.
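
The round-trip loss accepted with Option 1 under Proposal 97-10, which is also why the Cyrillic underscore had been held back as an ASCII clone, can be illustrated with a minimal Python sketch. The script labels and table contents are illustrative assumptions, not the published mapping tables. The comma value 2C is the ASCII value, in line with Joan Aliprand's remark that the 8-bit hex value does not change across sets.

```python
from typing import NamedTuple


class MarcChar(NamedTuple):
    script: str  # USMARC set selected by the Escape sequence (label is illustrative)
    byte: int    # 8-bit value within that set


# Option 1: every clone of the comma maps to the one UCS comma, U+002C.
# Assumption: these three entries stand in for the full clone tables.
TO_UCS = {
    MarcChar("latin", 0x2C): "\u002C",
    MarcChar("hebrew", 0x2C): "\u002C",
    MarcChar("cyrillic", 0x2C): "\u002C",
}

# The reverse table can hold only one target per UCS character; the
# agreed-upon target is the Latin (ASCII) character.
FROM_UCS = {"\u002C": MarcChar("latin", 0x2C)}


def round_trip(ch: MarcChar) -> MarcChar:
    """Convert USMARC -> UCS -> USMARC and return what comes back."""
    return FROM_UCS[TO_UCS[ch]]


original = MarcChar("hebrew", 0x2C)
returned = round_trip(original)
# The byte value survives, but the script (the Escape sequence) does not.
```

In this sketch the Hebrew comma comes back with the same byte value but labelled as Latin, which is exactly the information a receiving system would no longer have.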
DISCUSSION PAPER #100: "Recording Additional Characteristics in USMARC Authority Records" Sally McCallum introduced the discussion paper by saying that it is hoped to make the authority format more international, and that she expected this to be the first of several discussion papers. Sally reported that Barbara Tillett is very supportive of this direction, and has been working with an IFLA group on the international exchange of authority records. Barbara will be talking about this at the ALCTS/LITA authority group program on Monday. Sally wrote the discussion paper to look at the fundamental issues that would have to be solved to support the exchange of authority records in much larger volume than now. The paper first gives a few definitions and then it describes two logical models as follows: Model A: One record contains the heading and all related and variant reference tracings. Sally believes this model is more common in the US. Model B: Separate records are made for parallel headings in the different language catalogs with the records linked via the 7xx fields. This model is more common in Canada. Language of Catalog: There was some discussion of the need to support the Language of the Catalog (e.g. a public library in California might want to let a user select either the English or Spanish version of its online catalog) at the Midwinter meeting as part of the CanMARC alignment proposal. Option 2 was passed allowing 008/08 in the authority record to contain the five CanMARC codes, and the 038 field to be used for languages where the CanMARC codes are not appropriate. However, later discussion at LC suggested that field 040 be used, so examples in DP#100 reflect that. Language of Heading: It is easy to confuse a code representing the language of an individual heading (field specific) with the language of the catalog. The first is field specific and the second is record specific. 
It is not now possible to code the language of the heading, because of past analysis by LC staff showing the difficulties that would arise with ambiguous cases. Often, these ambiguous cases are headings of mixed languages. An example is:

Siege d'Orleans (Mystery Play)

DP#100 suggests a *repeating* $7 to get around this problem. John Attig asked what people would do with this coding. Paul Weiss responded that public libraries could ignore headings in languages that were of no interest in a local OPAC. Rich Greene said that it was useful to discuss name/title headings, where he believes mixed language headings are very frequent. John Espley reported that VTLS has several European libraries desiring a language of heading code; he named the Swiss National Library and the European Parliamentary Library. Robin Wendler reported that the Harvard library at the Villa I Tatti in Florence would like this as well. In answer to Sally's question as to what is done now, Robin reported that parallel records are used (Model B) but the library is prepared to change to Model A. Donna Cranmer pointed out that many public libraries, particularly small ones, have immigrants or minorities for whom a non-English catalog would be useful.

Someone pointed out that an indexing program could use the language of heading code to recognize which words should be treated as stopwords in one language but not in another. However, how would such a system handle headings in more than one language? Paul suggested that the codes could show when the heading is mixed, versus the straightforward single language heading code. Josephine Crawford suggested that the indexing program could follow a default rule in the mixed cases. This might lead the way for improvement over what happens now to mixed headings in many systems (e.g. looking for the French journal "The" is difficult when "the" is treated as a stopword). Some research may be needed to prove that this is a common problem and that a solution is feasible.
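
The stopword idea raised above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the stopword lists are tiny samples, the function name is hypothetical, and the default rule for mixed or uncoded headings (remove no stopwords) is just one possible reading of Josephine Crawford's suggestion.

```python
from typing import List, Optional

# Illustrative stopword lists keyed by three-letter language code.
# Assumption: a real system would use much fuller lists.
STOPWORDS = {
    "eng": {"a", "an", "the"},
    "fre": {"le", "la", "les", "un", "une"},
}


def index_terms(heading: str, language: Optional[str] = None) -> List[str]:
    """Return the words to index, dropping stopwords for the coded language.

    Assumed default rule: when the heading is mixed or uncoded
    (language is None or unknown), no stopwords are removed.
    """
    stop = STOPWORDS.get(language, set())
    return [word for word in heading.lower().split() if word not in stop]


french_journal = index_terms("The", "fre")   # heading coded as French
english_title = index_terms("The", "eng")    # heading coded as English
```

With the heading coded as French, the word "the" is indexed and the journal can be found; coded as English, it is dropped as a stopword.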
Joan Aliprand is inclined to agree with the LC analysis of the past. Even though the heading might contain a mixture of languages, it fits within the catalog where it is used; each catalog has its own particular language, which is the language of the community that uses the catalog.

Someone in the audience distinguished between USMARC as a communications format, versus the applications that use USMARC as a record structure. It is easier or harder to solve application-specific issues depending upon USMARC. It may not always be appropriate for USMARC to try to solve each application issue. We all display records with multiple languages. Does it really make sense to worry about it? Diane Hillmann disagreed. She stated that there are libraries wanting to design catalogs with the needs of specific user groups in mind. If codes representing the language of the heading help to do this, this is a good thing.

Rhonda Lawrence said that the purpose of this paper is to improve the international exchange of cataloging records. Therefore, is this coding useful to a cataloger? Yes, she believes this to be the case because cognates are confusing. She gave an example of a legal jurisdiction in another country. She does, however, think that it will be difficult to create rules for coding the language of the heading. John Attig brought the group back to thinking of these codes appearing not just at the field level, but also at the subfield level, due to the presence of mixed headings.

Cataloging rules are not now based upon the language of the user. Paul Weiss said that he will use Spanish headings in his catalog no matter what the cataloging rules say, if a user prefers a Spanish language catalog. Joan Aliprand sympathized with Paul's dilemma. But she thought there could be some problems displaying Spanish cross references if you do not know under what rules the references were constructed. She asked if there was a Spanish equivalent to AACR2.
Paul has a real use, and has to do something within limited resources.

Joe Altimus suggested taking a look at the UniMARC Authorities format. Sally reported that it has the language of catalog at the field level. There are some who do not understand the difference, however, and interpret the subfield as language of the heading. The language of catalog subfield might be changed to a language of heading definition in the future.

Robin spoke in favor of the idea of a universal record where the system or application plucked out what was needed for the given situation. Sally agreed, calling this a classic mudball record, and stating that this is what people seem to want for exchange purposes, although there will be a large number of codes at the field and record level. Can the computer processing handle all this? Rich reported that OCLC staff are very interested in international authority control because they can't really handle the needs of the French Canadians now. Sally summed up by describing a resource authority record with appropriate computer processing at the application level.

Stuart Ede from the British Library reported on the European "AUTHOR" project that had looked at the possibility of a merged authority file but decided it was impossible. They have instead created the ability to download so that catalogers can exchange records and build on them for their own needs. Paul said that it might be interesting to know how that is working. Diane agreed, stating that the current situation is a horror because there is no good control. She would like to see a catalog able to establish some machine rules for processing a resource authority record, and then presenting just the right view to users. John Attig spoke in favor of coding both the language of heading and the cataloging rules used to construct the heading at the field level.
Rhonda suggested looking at the Getty approach for handling various forms of artist names in many languages; the Getty file is considered to be a resource file as opposed to an authority file. She suggested thinking of the problem in a new way entirely. Others in the audience thought that this was a different concept, so Rhonda explained that the cataloger does not select the "right" heading but instead lists the possibilities.

Using the mudball record approach, Paul wondered how bib records would be exchanged. If there is a mudball resource record, it would be easier to swap in an English heading in place of another language in the 1xx in the bib record. The bib 1xx should store the "concept" rather than a heading string. Paul said that NLM tried to push the advantages of this approach several years ago.

Script: Sally gave some background on the use of the Escape (ESC) sequence in the 880 field. It is not user-friendly for those who use the records. In addition, if only one script for an individual 880 field can be indicated, it is not user-friendly when the field contains multiple non-Roman scripts. The script in the 1xx field should change depending upon the language of the country. That is, the U.S. puts the romanized version of the heading in the 1xx and the non-roman version in the 880. This is currently reversed in those countries where the language is the script of our 880. Sally asked if recording the script consistently at the field level in the authority record would be helpful, presumably as a replacement for current practice. John asked whether Sally is suggesting that the 880 no longer be used in the bibliographic record. Sally replied that she is opening up the discussion in terms of both the authority and bibliographic records.

Sally asked what catalogers and systems do now. Joan reported that the RLIN system supplies the 066 and $6 automatically based upon the scripts in the record. A cataloger has to type in the value "$6" only for an unpaired vernacular field.
Sally asked what does the receiving system do? Joan said that the 066 tells the receiving system the scripts present in the record, for a computer processing decision as to whether or not the non-Roman data can be displayed. She also thinks that the RLIN system might use the $6 for indexing. The ESC sequence in the $6 (used to identify the first script in the field) may be a historical leftover from the early days of RLIN CJK. Gary Smith reported that catalogers do not see the script coding on OCLC; others said that the same is true with the VTLS, EOS, and RLIN systems. Maureen Killeen reported that the Canadian CATSS system doesn't use the 880. Joe Altimus said that more time was needed to analyze and give feedback on this issue. Rich Greene agreed. Joan suggested that the Technical Working Group be asked to deal with this issue. Sally would like analysis from both a systems and a cataloger point of view. She requested that discussion continue on the USMARC list. Transliteration: Paul found the Moscow example on page 12 confusing. Is it the cataloger or the author/publisher transliteration? Joan thinks it is useful to record the transliteration scheme at the field level, when it is possible. But, often the cataloger doesn't know which text the transliteration scheme applies to. She also gave an example of a mixed Hebrew/Yiddish heading. Rich said that the issue raised concerns about how to synchronize bibliographic and authority records, as they are not now created at the same time in many cases. At least, Wade-Giles transliteration is now consistently used. Country or Nationality: Paul reported that the University of New Mexico Fine Arts Library would find it useful to identify the nationality of an artist. He gave an example of a name heading of a Latin American composer; handled manually now by adding a form heading in the 655 field, e.g. Brazilian composer. Robin agreed that nationality coding could be useful, but it should be optional. 
Sherman Clarke reported that the Visual Resources Association has a core set of fields which includes a nationality field in the bibliographic fields; not yet in the authority set. Others in the audience affirmed that nationality coding would be useful. Paul also said that gender coding could be helpful. Concluding Remarks: Does it seem reasonable to change the 880 approach? If a local system does not use, but retains for import/export reasons, what would be the impact of a change? Sherman Clarke wondered if this coding could move from authority to bibliographic records and vice versa based upon cataloger macros. Sally asked who loads the vernacular now? Robin said that Harvard plans to do this, and John Espley reported that some VTLS libraries do it now. What should happen next? Sally asked for responses, particularly from OCLC and RLG, on the USMARC list. Sally would then like to create a second discussion paper for the Midwinter meeting. Robin suggested that the paper discuss the different purposes for each coding change. Diane Hillmann urged that subject examples be added (she was then asked to find some). PROPOSAL NO. 97-12: "Definition of separate subfields in field 536 (Funding Information Note) for program element, project, task, and work unit numbers" Rebecca Guenther introduced this proposal by saying that the joint cataloging committee from the Departments of Commerce, Energy, NASA, Defense, and Interior (called CENDI) has mapped the database structure now in use to the USMARC bibliographic format. In most cases, the mapping has been one on one. This proposal covers a group of important CENDI data elements, related to funding and contract monitoring, for which no USMARC equivalent exists. At the current time, these data elements are mapped to the 536 field, but coding specificity is lost. 
The proposal suggests the following:

-- Add 536 $e (Program Element)
-- Add 536 $f (Project Number)
-- Add 536 $g (Task Number)
-- Add 536 $h (Work Unit Numbers)
-- Change the name of 536 $d to "Undifferentiated Number" or make 536 Subfield $d obsolete.

Paul and others wondered why CENDI needs this specificity. Rebecca said that it has important meaning in this user community. Are the numbers hierarchical? It depends. If there is a hierarchy involved, Diane likes the idea of this being represented in the format. Perhaps it could then be generalized to another situation. Rebecca said that there is first a project, then a task, etc. She thinks that practice is more like a timeline than a hierarchy because not all elements are always used. John Attig was nervous about second-guessing this user community. But, Robin said, it is good to know as much as we can, so that the format coding is not just meaningful to people in a particular area of expertise. Paul was uncomfortable passing the proposal until more is known about the use of these numbers.

Rebecca and Sally said that the field was first established back in 1979. Rich Greene said that there are about 25,000 records now in OCLC with the 536 field. To grandfather in these records, OCLC prefers renaming the 536 $d rather than making it obsolete. In this way, it is simply a matter of changing the documentation, not the system.

Robin Wendler made a motion to pass the proposal, adding the new subfields and changing the name of 536 $d. Frank Sadowski provided a second. There were six votes in favor, one vote against, and one abstention.

ACTION: Proposal passed. Field 536 $d will be renamed to "Undifferentiated Number" and subfields $e, $f, $g, and $h will be added.

PROPOSAL NO. 97-13: "Changes to field 355 (Security Classification Control) for downgrading and declassification date"

The CENDI group also requests changes to field 355 to improve security control over documents that are classified or declassified.
As currently defined, 355 does not account for changes in security classification in a sophisticated way. Rebecca briefly reviewed the proposal:

-- Rename $d to "Downgrading or declassification event" and limit it to a description of an event that must take place prior to downgrading or declassification.
-- Define a new $g for "Downgrading Date"
-- Define a new $h for "Declassification Date"
-- Define a new $j for "Authorization" to show by whose authority a document can be or has been downgraded.

The dates would be in yyyymmdd form. Subfields $g, $h, and $j would not be repeatable. Paul asked why this is the case; can something be downgraded more than once? Yes, said Rebecca, but then the field is repeated. Paul asked if the $j is only input when there are changes in how a document is classified. Rebecca thought that it was generally not needed when the field is first input. Someone else asked if a document ever gets upgraded, in which case perhaps other subfields are needed to track these events. Rebecca said that this is unknown at this time.

Paul Weiss moved that the proposal be passed as written. Robin seconded the motion. There were eight votes in favor and no votes against.

ACTION: Proposal 97-13 passed.

BUSINESS MEETING:

Sally reported on documentation efforts by her staff:

-- Bib and Authority format updates were drafted last March and are expected from the printer in August or September. They will include the changes passed as part of the work to harmonize USMARC and CanMARC. [After the meeting the Bib was pulled back and changes from the June meeting added.]
-- New edition of the Relator code list expected in July.
-- Not able to do the Holdings update last spring, so are targeting September or may hold it until next year.
-- The Concise is expected to be sent this fall for printing. It will be available online first, however.
-- LC staff are in discussion with the National Library of Canada about the documentation issues relating to the harmonization.
The next edition may be a single edition, with LC doing the English edition and NLC responsible for the French edition.
-- Sally noted that some mappings are beginning to appear on the USMARC Web page.
-- Classification records are now available as a MARC distribution service from CDS.

Book vendor information from Harrassowitz and Casalini is now being loaded at LC in USMARC format. OCLC and RLG are also doing more of this.

South Africa has decided to adopt USMARC, switching from UNIMARC. Brazil has been a multi-format country in the past. Now that Brazilian libraries are joining OCLC, there is a movement towards USMARC.

John Espley has been appointed the chair of the new EACC Task Force. The charge is to establish a mapping between the East Asian Character Sets and the Universal Character Set (UCS). Randall Barry, Beatrice Ohta, Bonnie Dede, and Candy Bogar have volunteered to serve as well. The Committee will follow the same principles as the Woods committee. John Attig asked if the Committee has a timeline. Jacquie said that the Committee can determine this.

Jacquie reviewed a couple of conference programs of interest to the MARBI group. The joint meeting with CC:DA to discuss metadata will take place on Monday morning. Jacquie asked if everyone had received the list of questions. She will bring extra copies to the meeting.

Jacquie reported that MARBI has been asked to co-sponsor (in name only) a program with the Committee on Cataloging Asian materials. Claire Dunkle was present to answer questions. The program will discuss how to handle the vernacular in authority records, fitting in with the ALA theme of international connections. Claire said that they are seeking speakers to present the topic clearly to generalists. It was requested that the meeting not be held when MARBI is meeting, so that members could attend. Claire promised to try her very best to satisfy this request, but she could not guarantee it.
Paul made a motion that MARBI co-sponsor this program, no matter when held. Elaine Henjum provided the second. There were eight votes in favor of the motion and no votes against it.

Jacquie has been working with the three divisions so that MARBI membership and intern appointments are regularized and selecting the chair is handled more easily. There has been one new appointment. Christine Mueller will be a LITA intern in 97/98.

John Attig asked how the UKMARC harmonization is going. Unfortunately, Stuart Ede had left at this point. Sally responded that the British Library is expanding UKMARC each year, and thereby moving to harmonization gradually. She thinks about 30 fields were added last year. One issue involves the core fields (like 245 and 300), which have many more subfields because subfield codes are used to replace punctuation, in contrast to the USMARC practice of using them to delineate access points.

Marti Scheel reported that the National Library of Medicine is going to implement the 6xx $v when the next MeSH update comes out in the fall. More information will appear on the USMARC list.

********************* Sunday, June 29, 1997 **********************

The meeting opened with some news about the joint meeting planned for Monday, June 30. John Attig reported that the ALCTS Task Force on Metadata recommends that MARBI and CC:DA work together on the issues involving metadata, cataloging, and the USMARC format. Jacquie asked for people to think about this recommendation before the meeting on Monday. She also passed out the CC:DA questions and a "crosswalk" document.

PROPOSAL NO. 97-9: "Renaming of Subfield 856 $u to accommodate URNs"

Many people attending this meeting were able to go to the morning's program "URIs, Metadata, and the Dublin Core" at the Sheraton Palace, providing a useful foundation for this proposal.
Another source of information is the Preston/Lynch/Daniel paper on using existing bibliographic identifiers (like ISBN and ISSN) in the URN framework and syntax (available by FTP at ds.internic.net/internet-drafts/draft-ietf-urn-biblio-00.txt). In introducing this proposal, Rebecca Guenther said that there is an immediate need at LC to code a "handle" in an 856 field that can then be resolved by a handle server to a URL. Some participants of the National Digital Library Project would also like to record handles. Referring to the program at the Sheraton Palace, Rebecca said that some of what Cliff Lynch discussed makes her think now that the recommendations in this proposal are on the right track. It is necessary to accept that different naming authorities will name objects in different ways; USMARC format planning must proceed from this point of view. One question to discuss is how to record a handle. The proposal recommends changing the name of the 856 $u from "Uniform Resource Locator" to "Uniform Resource Identifier" so that either a URL or a URN (including a handle) can be recorded. The $u can be repeated in the case where multiple identifiers are desired. Given the rapidly changing situation, where experimentation is useful and important, multiple identifiers are expected in some records. Alternatively, a new subfield can be considered for URIs, since some concerns were raised on the list about mixing address types in the $u. John Attig asked if the functionality of URLs and URNs is the same, even though the operational concept is clearly different, one being a server address and the other being a logical address. John asked if the rest of the 856 field would be handled in the same way, if a URN is input; what other data elements would be input at the same time? Joe Altimus replied that he considers the URN and URL to be similar in function.
Although RLG staff has no consensus as of yet on the issue, it does seem an annoyance to have different address types in the same subfield. Robin Wendler considers the URN to be like a call number. It is not an exact shelf location, but a surrogate for the location of the item. Even though the 856 seems appropriate at this time, she introduced the thought that a call number field might be better in the future. Rebecca responded that there is no guarantee of a one-to-one correspondence between a bibliographic record and a URN; therefore, the 856 seems better. Rich Greene said it was OK with OCLC staff to place the URN in the 856 field, but there are strong feelings not to use the $u. The current software assumes a URL, and a user can click and be directed to the electronic resource. This is not yet possible with URNs given OCLC's current software. Michael Mealling stated that Web browsers generally deal with unknown protocols, and asked if the OCLC system can do this. Rich thought not. Michael said that URNs and URLs act the same way programmatically, so it is possible to put both in the same subfield since future browsers will handle the two. If a decision is made today to separate the two into two subfields, then it means future computer processing will have to look in both places. He recommended against this course. Diane Hillmann registered an objection to the $u name change on conceptual grounds. The 856 is a holdings tag but is kept in the bib format for convenience, since most systems don't support the holdings record yet. There is a verification role as well as a retrieval role. Rebecca returned to Robin's call number analogy, reminding everyone that the 852 field contains the call number in the holdings format. But it is necessary to keep in mind that different naming authorities will decide upon the rules (i.e. syntax, definitions, resolution methods, etc.) and not the library community.
Therefore, it won't be possible to fit this into our current call number structure in a reliable way. Karen Coyle said that she hadn't yet worked with handles and wanted to know more about them. She is familiar with PURLs, which look like URLs. Josephine Crawford wondered if the difference between a handle and a PURL has to do with the syntax and the resolution server software. Cecilia Preston said that there could be 7-8 URNs describing the same object. The situation is very fluid right now, and it is not possible to find a clean and easy solution. She suggested thinking of these different URN types as similar to having both an ISBN and a call number for the same book. Both refer to the same entity, but have different syntax and different purposes. John Attig wondered if this then means that we record the ISBN twice, once in the 022 and once in the 856. Cecilia suggested taking a look at her paper. Mike Mealling discussed the URN structural goal: to be context-free. Unless you have the authority to understand the string, don't derive a URN from an ISBN. It should be an opaque string that acts by itself. Karen Coyle wondered if different syntax in the $u (i.e. URL: versus URN:) would be enough to distinguish the two. Rich said not with the current OCLC software. Diane pointed out that the 856 field is now repeatable. The current approach works well for URLs. Why monkey around by using the same subfield for something different? Don't we have years of experience teaching us to code separately for different data elements? Mike said that we are simply naming and then resolving; he believes that there will be a standard resolving procedure to program in our systems. Cecilia said that the situation has to shake out, and the 856 $u is OK until the dust settles. Diane replied that, when we put something in one place, it stays there historically. Rebecca said that a new naming authority can create new numbers at will; do we want to create a new holdings record then?
People said no! Karen Anspach also recommended a separate subfield code for the URN. She felt that wise OPAC management required caution; there is a need to avoid presenting a usable URL versus a non-usable URN to catalog users at this time. Rebecca said that the indicator could be used to control this. Diane and others asked what the disadvantage of defining a separate subfield code is. Rebecca said that there aren't many subfields left, but perhaps $g could be redefined. Jacquie wondered when the software would be developed to process both the URNs and URLs with a single click by the user. Gary Smith said that it is no harder to check for two subfields than to work through differences in the same subfield. If the URN is coded in a separate subfield, will it be repeatable? Yes.

Frank Sadowski moved to pass the proposal as written. There was a second by Carol Penka. There were four votes in favor, and five against. The motion did not pass.

Robin Wendler moved to pass the proposal with a different subfield code. LC staff will need to determine the best code, after an investigation into what is now used at LC. Paul provided a second. The motion included defining the # indicator code for the 856 field meaning "No information provided". (John Attig asked for some direction in the documentation on how to code the # indicator if there are both URL and URN recorded.) There were nine votes in favor of Robin's motion, and no votes against.

ACTION: Proposal 97-9 passed. A new 856 subfield will be defined to hold the URN.

PROPOSAL NO. 97-8: "Redefinition of subfield $q (File transfer mode) in field 856 of the USMARC formats"

Rebecca introduced the proposal by saying that, if the subfield $q is redefined, it would take care of a current problem with mapping from the Dublin Core, the GILS format, and others to the USMARC format. These other formats carry a position for electronic format type (also known as MIME type or Internet Media Type).
The Internet is designed to support the integration of multimedia resources, often by triggering software based upon the file format. Often a file extension is used to show a video or audio file, but it is still useful to include the file format in the description of the resource. The USMARC format does not have a clear-cut place for the file format. Currently both the 516 (Type of File) and 538 (System Detail Notes) fields are used, and the 538 carries other information as well. Rebecca has surveyed other systems and determined that the 856 $q has been used only to show binary/ASCII format. It seems reasonable to expand the 856 $q to satisfy this need. Rebecca advised against creating a new 856 subfield, as so few are left. Paul Weiss stated that it is most useful if standard codes are used. One possibility is the IANA (Internet Assigned Numbers Authority), which is a central registry for specific values. Rebecca agreed, but said we can only encourage the use of standard codes at this time, as the Dublin Core does not require codes. Therefore, free-text format data will also be mapped to the $q.

Paul moved to accept the proposal as written, taking into account the error on page 5 which states that the $q should be repeatable. As explained in the body of the proposal, the $q should not be repeatable. If there is more than one format, there should be different 856 fields. Diane Hillmann made the second. There were eight votes in favor, and no votes against.

ACTION: Proposal 97-8 passed as written, with correction to make $q not repeatable.

PROPOSAL NO. 97-3R: "Redefinition of code "m" (Computer file) in Leader/06 in the USMARC Bibliographic Format"

Rebecca introduced the proposal to the audience by explaining that this is the fourth time that this issue has come before MARBI. It is a complex issue, and everyone's thinking has grown with each discussion.
Rebecca looked carefully at the concerns expressed by OCLC, and OCLC provided many examples which were very helpful. Therefore, the revised proposal has a more explicit definition for Leader/06 "m" code and instructions have been added: "In case of doubt or if the most significant aspect cannot be determined, the cataloger should consider the item a computer file." Rebecca also explored using the 008 field for identifying computer files where the significant aspect requires something other than Leader/06 "m" coding. Her thought was to use 008/23 (Form of item) in Books/Serials/Music/Mixed Materials, and to create a new 008 position in Maps and Visual Materials. She then saw how scattered this information would be, adding programming overhead. So, she settled on using the 007 field across the board. This also means that a bibliographic record describing a work can have an 856 field (and matching 007) for the electronic version, and other holdings fields describing the paper version. John Attig wondered about the phrase "alphanumeric" in the "m" definition. He gave the example in the book world where a table is created frequently containing alphanumeric data. Paul Weiss agreed, stating that numeric data is the same whether paper or electronic. Betsy Mangan reported that cartographic data is text and therefore treated like language material. Robin Wendler brought up the example of statistical data files: the way they are used changes the nature of the data. She would like the primary aspect to be the numeric data, but could accept falling back on secondary aspect coding. Rebecca said that she didn't want to remove "numeric data" from the definition because it was part of the original definition. Margaret Stewart asked if computer-oriented multimedia would be the same as or similar to interactive multimedia. Yes, several answered. Paul suggested trying to keep the definition simpler to make life easier for catalogers.
He said that if the definition requires the cataloger to make a decision about the significant aspect of the item, there will be no consistency as different catalogers will make different decisions on the same item. Betsy agreed that this problem will exist, but said that there is no way to avoid it. Knowing the significant aspect does have its uses. Diane Hillmann agreed that consistency is not possible. Some libraries will establish conventions and will prefer one over the other. She did not see this issue as a major impediment. Robin discussed Harvard's current methods for automatic duplicate detection. She wondered how to detect programmatically that the video Hamlet does not equal the paper Hamlet. John Attig said that AACR2 is not explicit enough. Paul disagreed, stating that the problem is more the format. Robin spoke up for cataloging guidelines, to help catalogers decide on the significant aspect. Frank Sadowski said that he thought all this was clear when he first read the proposal. But, given this discussion, he wondered how the following situations would be handled. Let's say that the cataloger has three diskettes:

1) An annual report in a WordPerfect file (consider this to be text)
2) A statistical file in Excel (consider this to be a computer file)
3) A JPEG graphic file (consider this to be a graphic)

Only the second one would probably be coded Leader/06 "m".

Jacquie Riley asked if there was any more discussion about the definition. She asked for a straw poll, including everyone in the room, not just voting members. She asked how many people think that the phrase "alphanumeric data" should remain in the definition. Only 7-8 people voted in favor of this. There was discussion saying that this is confusing and that the real intent is to limit to statistical data, meaning numerical data that may have some alphas included but is non-narrative. Jacquie asked for a straw vote in favor of changing "alphanumeric data" to "numeric data".
There were 18 people in favor, and 15 people against. If numeric were excluded from the definition, how would catalogers handle it, and handle it consistently? A decision about significant aspect still has to be made. Sally suggested that numeric would fall into the "etc" part of the definition. Paul spoke up against this, saying that he wants to ensure consistency and thinks that "etc" should be removed. Jacquie asked for a straw vote to remove "etc" from the definition. There were about 25 in favor, and only 3 against this proposal.

The discussion moved on to the 007 field. Joe Altimus reported that the examples in 97-3R are very helpful, but he would also have appreciated a multimedia example. John Attig asked why prefer the 007 over the 006? Robin said that, even though there are drawbacks to the 007, she likes it better than the 006 because she can legally carry information down to the holdings record when dealing with a multiple-versions record. John responded that the 006 has a clear relationship to the Leader. Rebecca said that the 007 seems appropriate since we are talking about the physical characteristic of an item. If a cataloger is describing the original, and the agency doesn't create holdings records, the cataloger can put the 856 in the bib record; but then, the 007 cannot be mandatory. Paul said that if requiring the secondary aspect to be coded is a good idea, then he prefers the 006 over the 007. Rich Greene said that OCLC staff are struggling with the existing ambiguity, and would like the physical carrier to be very explicit. Rebecca suggested reviewing the 008 idea again. Rich said that OCLC staff liked the 008 idea, even though there was a scattering between two character positions. Marti Scheel reported that NLM staff are concerned about the complexity of the 008. Rich explained why he thinks the 008 is a cleaner approach, using the analogy of microforms.
Paul wondered about two 007 fields in the same record, one for a remote file and one for a CD-ROM. How would your system know which 007 goes with which 856? Not a problem if the fields are contained in a holdings record, but this is a problem if contained in a bib record. Rich said that there is a need to maintain four aspects: content, carrier, type of control, and publisher control. The problem is stuffing four aspects into three data elements. Paul said that it comes down to what should be mandatory in the record. Diane came back to the problem about the holdings format not yet implemented in many systems. OPAC users will want to limit searches, and pairing up an 007 with an 856 is important when more than one variation of a work is recorded in a single bib record. Limiting by location might be possible, but this is not as clean as having a holdings record. Jean Hirons reported that CONSER has to send records to ISSN Center now. How should she show computer format? Now a mixed bag. Jean believes that the 007 would work the best, supporting the need for the holdings format. The current CONSER practice is to not use an 007 when adding an e-journal 856 field to a print serial bib record. John Attig said that there is so much record exchange at this time, that he considers it very important to put mandatory coding in the format. Paul disagreed, saying why should everyone have to assume this workload. Robin predicted that the next-generation system will do more with the 007 and the holdings format. She therefore felt it appropriate for the USMARC format to use the 007 in this way. There are no cost/benefit studies that deal with the cost of coding and the later benefit that can be derived from the processing of that coding. A member in the audience supported always knowing that something is in the electronic format, as the secondary aspect. 
He went on to say that it is unfortunate that there is no nice, neat way to identify this in USMARC, but that the combination of Leader/06, 007, and 856 is workable. Paul said that the computer would only need to look at the 007 and 856, since the Leader/06 would describe the primary aspect. Jacquie asked if there is more discussion. Paul asked about accompanying materials. If an 007 for computer files is to be mandatory in all cases, what about accompanying materials? The problem comes back to not having a holdings record in many systems. Paul suggested making this the one acceptable exception. John Attig asked to consider the 008 alternative again. Jacquie asked for a straw poll in favor of the 007 over the 008. There were 22 votes in favor, and 8 votes against.

Diane Hillmann moved to accept the proposal as follows:

-- In the definition of "m" code, change "alphanumeric data" to "numeric data".
-- In the definition of "m" code, remove "etc." from the end of the first sentence.
-- Make 007 field mandatory for electronic resources except for accompanying materials.

Jacquie provided a second to the motion. There were seven votes in favor, and one vote against.

Paul asked a question about Attachment A. If the Leader/06 is not "m" then some of the 008/26 values could be made obsolete because they're redundant. He thought that this would be a good idea for something like "k" (representational). Rebecca promised to think about this for the next meeting.

ACTION: Proposal 97-3R passed with three changes as stated above.

BRITISH LIBRARY UPDATE:

At yesterday's Business Meeting, a question was asked about progress with the UKMARC/USMARC harmonization project. Stuart Ede from the British Library was not present during the Business Meeting, so Sally asked him now to say a few words. Stuart said that the project is being phased over several years. The BL put out a consulting paper earlier in the year describing what is unique in USMARC and not present in UKMARC.
Comments received in April showed general acceptance of adding these unique data elements to UKMARC, since this would be helpful. One comment had a suggestion that might be considered for USMARC. So this might signal a trend of general yet critical acceptance of USMARC. The two major areas of difference between UKMARC and USMARC are ISBD punctuation and multi-volume works. Currently comments are being gathered on these outstanding areas of concern. Stuart closed by saying that UK librarians see great benefit in the harmonization project. He said that the general approach is not to diverge from USMARC from this point forward, and to try to harmonize to the extent possible.

PROPOSAL NO. 97-11: "Definition of Subfields in Field 043 (Geographic Area Code) and 044 (Country of Publishing/Producing Entity Code) to accommodate indication of Subentities in the USMARC Bibliographic, Community Information (043 only) and Authority (043 only) Formats"

Sally McCallum referred back to DP#98, discussed at the Washington, D.C. meeting, which explored the issue of subentity geographic codes. The conclusion at that time was to add a $b and $c to the 044 field, and to add a $b and $2 to the 043 field. Paul wondered why the proposal doesn't also include adding a $2 to the 044 field, since it seems the normal practice to show the source of a code. Would this represent too much added work for LC staff? Sally said that this would not be a problem, and she is comfortable with adding a $2. Field 044 $b, proposed for the Bibliographic format, would contain a local country's code, whereas 044 $c would contain the code from ISO 3166-2. David Goldberg explained that the ISO coding scheme includes both the entity and the subentity. He supplied the example of BR-MT (Brazil-Mato Grosso). Sally confirmed that this was so. Sherman Clarke asked if it was necessary to indicate in 008/15-17 that there is a subentity code in 044. Diane questioned if this was necessary.
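
As an illustration of the entity/subentity structure David described for ISO 3166-2 codes, the following is a minimal sketch, not part of the proposal. It assumes the usual ISO 3166-2 shape (a two-letter country code, a hyphen, then one to three letters or digits); the function name split_subentity_code is hypothetical.

```python
import re
from typing import Tuple

# Assumed shape of an ISO 3166-2 code: two-letter entity (country) code,
# a hyphen, then one to three letters or digits for the subentity.
_ISO_3166_2 = re.compile(r"^([A-Z]{2})-([A-Z0-9]{1,3})$")


def split_subentity_code(code: str) -> Tuple[str, str]:
    """Split a code such as 'BR-MT' into (entity, subentity).

    Hypothetical helper for illustration only; it checks the shape of the
    code, not whether the code is actually assigned in ISO 3166-2.
    """
    match = _ISO_3166_2.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not an ISO 3166-2 style code: {code!r}")
    return match.group(1), match.group(2)


entity, subentity = split_subentity_code("BR-MT")
print(entity, subentity)
```

Because the entity is recoverable from the first two characters, a system receiving such a code in 044 $c would not need a separate country code to interpret it, which bears on Sherman's question about 008/15-17.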
Field 043 $b and $2 are proposed for the Bibliographic, Community Information, and Authority formats. Subfield $b would contain the local Geographic Area Code (GAC). Sally reminded the group that the GAC code can be no longer than seven characters.

Paul Weiss moved to pass the proposal as follows:

-- Define 044 $b calling it simply "Local Code".
-- Define a 044 $2 calling it "Source of Local Code".
-- Define 043 $b as proposed in 97-11.
-- Define 043 $2 as proposed in 97-11.

Elaine Henjum provided a second. There were eight votes in favor, and no votes against the motion.

ACTION: Proposal 97-11 passed with some wording changes and an additional $2 in the 044 field.

DISCUSSION PAPER #101: Notes in the USMARC Holdings Format

Rebecca introduced the paper by reviewing the questions at the end of the paper. John Attig opened the discussion by saying that he feared the subject is very complicated. The 853/863 pairs are linked to one another, and copy-specific notes should be linked to the 853/863. He believes that fields mentioned in the paper (533, 540, and 583) are copy-specific; the only reason they appear in bibliographic records is because holdings records did not exist in systems when they were defined. He suggested that it is time to revisit the issue of embedding holdings data in bib records. Paul Weiss supported the idea of moving these fields to the holdings record. Robin, who works with this type of data a lot at Harvard, gave some wonderful examples. Paul expressed some concern about cataloger training, if there was this change. Diane reported that the Cornell Rare Books department prefers for local copy-specific notes to appear on the bib screen, not on the holdings screen. Perhaps they should appear in both places, but be ordered in some logical way on the bib record. Rebecca suggested that there needs to be a $8 in the 852 as a way to order multiple holdings notes. None of this resolves the issue that there is not enough vendor support for the holdings format.
Adding these copy-specific notes to the holdings record will increase the pressure on vendors. Robin reported that Harvard is talking with vendors about this issue. She sees the lack of indexing of the holdings record as one problem with storing these fields solely in the holdings. Paul suggested that the implementation be done in stages; make the fields valid in holdings now, but don't make them obsolete in the bib record until the support for the holdings record improves. Mike Johnson supported the idea of item-specific information in the 852 in the bibliographic format. He said that the 5xx is not used a lot, but the 852 is used! Joe Altimus reported that the RLG experts on the Z39.50 standard said that the standard doesn't handle separate holdings records very well. Then, is this really a good idea? Robin gave an example. You have a multiple version record with the microform described in an 007 and 533 field. Harvard is able to export records in two ways: (1) separate bib record with one holdings embedded; (2) separate bib record and separate holdings record. When exporting the record under the current USMARC holdings format, the 541, 561, and 562 fields are lost. Robin doesn't believe that a linking subfield $8 would be of help in this situation. She described how important it is to track the different provenances for the different copies; right now this is far too messy in the bibliographic record. Donna Cranmer emphasized that her library shares the same problem. John Attig asked if the archivists have considered this overall issue yet? Apparently not, but it is an important issue for them. John suggested a proposal at the next meeting that would propose that fields 541, 561, and 562 be added to the holdings format but not (yet) made obsolete in the bib format. Another paper should also discuss the overall issues involved with the bib/holdings formats. There was a general consensus in the room to do this. 
Jacquie also summarized the pros/cons relating to using the same field in both the bib and holdings formats. The 533 and 540 in the bib format map to the 843 and 845 fields in the holdings format, whereas the 583 is the same in both formats. Given the current mixed situation, what is best from a cataloger and system point of view?

ACTION: A proposal will be brought to the 1998 Midwinter meeting.

****************** Monday, June 30, 1997 ***********************

DISCUSSION PAPER NO. 102: "Non-filing characters"

This paper presents the problems and possible solutions for dealing with non-filing characters associated with variable field data. Sally McCallum reported that the problems have been around for quite a while. The paper was drafted by Randall Barry, with pros/cons presented for the various techniques he identified. Unfortunately, the solutions could be very expensive to implement in systems. Paul Weiss said that he felt Randall did a good job on the issue. It would be excellent to come to some closure on this longstanding problem, even though it will involve quite a bit of work in the short term. Paul suggested that the future is longer than the present, and that the pain would be worth the gain. The control character or subfield solutions appeal to Paul at this point. Marti wondered if $0 [zero] could be used instead of $1 [one] as the subfield delimiter that would indicate non-filing characters. John Attig reported that his system handles articles through automatic recognition, so catalogers don't have to bother with this. Robin asked how ambiguous articles were handled (i.e. the German article "die" versus the English word). John said that the system would assume the English language so it would not treat "die" like the German article. It was pointed out that this system would not handle the English title of the play "A My Name is Alice". Even with these problems, Michael Johnson tended to prefer the machine approach.
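
The behavior described for automatic article recognition can be sketched as follows. This is a simplified illustration, not a description of John's system: the article lists, the language codes, and the function name filing_key are all assumptions, and "Die Zauberflöte" is an illustrative title not mentioned at the meeting.

```python
# Hypothetical, deliberately small article tables keyed by language code.
# A production system would need a much fuller, curated list.
ARTICLES = {
    "eng": {"a", "an", "the"},
    "ger": {"der", "die", "das"},
}


def filing_key(title: str, language: str = "eng") -> str:
    """Return a lowercase sort key with one leading article dropped.

    The first word is dropped only if it appears in the article table for
    the assumed language and is not the only word in the title.
    """
    words = title.split()
    if len(words) > 1 and words[0].lower() in ARTICLES.get(language, set()):
        words = words[1:]
    return " ".join(words).lower()


# With English assumed, "die" is kept as an ordinary word.
print(filing_key("Die Zauberflöte"))
# With German declared, the same word is dropped as an article.
print(filing_key("Die Zauberflöte", "ger"))
# The failure case raised at the meeting: the leading "A" is part of the
# title proper, yet a table lookup drops it anyway.
print(filing_key("A My Name is Alice"))
```

The last call shows why a purely mechanical approach cannot be fully reliable: only the cataloger knows that the initial "A" should file, which is the kind of information an explicit marker in the record could carry.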
Robin wondered if the subfield solution was only supposed to deal with non-filing characters at the beginning of a field, or if the same subfield could be embedded in the field. The discussion paper assumes that the subfield approach could be used for embedding, but points out that this would be awkward for computer processing. David Goldberg said that he feared spacing problems might occur on display of records. Karen Coyle preferred the graphical-character-approach solution. In this way, the non-filing data is stored in the same subfield as the data itself. However, what character(s) could be used? Perhaps the right/left arrows? Gary Smith asked if this would be input, and how the USMARC format would store these characters. Paul reported that many systems show the EOF character via a graphical character, and that this is helpful. The input question should not drive the issue, nor should processing or display questions. Michael Johnson suggested the bracketed approach; the linear, on/off approach would ease the computer processing issue. Karen Coyle agreed that a beginning/ending character is best. Sally asked if all this applies to initial articles only. No. John Attig asked if the method could be applied to punctuation as well. Right now punctuation is usually normalized out, but the rules of normalization differ from system to system and the catalogers have no control over this. The cataloging rules should give guidance on what to ignore, if an improved method was implemented. There was a consensus that the USMARC format should provide a tool or method, but that the instructions on when to ignore something should come from the cataloging rules. Karen Anspach said EOS has European customers who would very much like to ignore articles embedded in fields. Marti Scheel asked about the ANSI or NISO standards that deal with sorting/filing and with indexing. Does this standard have any relationship to this issue?
Sally has some memory of this standard, but will have to check on it. Even though filing and indexing are related, these activities would be handled differently. Jacquie asked if there is a growing consensus to do something, but the question is what is the best method? A pair of unused characters (graphical or control?) to come before and after the data to be ignored. Graphic characters are on the keyboard and can be seen. Control characters are not usually visible. Perhaps a combination to make the trigger really unique? How should something be selected? Perhaps look at the control/graphical characters available in UCS, except for the fact that people want to move this forward more quickly. Karen spoke up in favor of a combination of characters that would not normally be combined, and that are in opposite direction of one another, as this would help the user visually recognize the non-filing characters. It was agreed that a proposal should be moved along and that some serious research is needed to find the preferred control characters. Big databases should be scanned to see if certain combinations uncover any records. Rich asked if existing records would have to be modified. OCLC would not want to do this. He suggested that field indicators should continue to be used, but that the new approach should be used for *embedded* non-filing characters. Michael Johnson pointed out that there are two different situations: non-filing characters that lead a field, and non-filing (or usually non-indexing) characters that are embedded. Diane Hillmann said that it was realistic to expect that the present methods (using indicators to ignore an article, or not inputting the article) will exist side-by-side with the new method of surrounding the non-filing characters with some unique combination of characters. Robin wondered about this issue and the UCS approach to diacritics where the diacritic is expressed after the letter rather than before. Isn't mapping from USMARC a problem?
Won't this undercut the linear solution being suggested here? Gary Smith said that the UCS implementors will have to deal with the moving diacritic issue, and make sure that reverse mapping is maintained. Karen Coyle said that she believes this issue supports the bracketing solution.

ACTION: A proposal will be developed by LC staff. It would be very helpful to send Sally many examples.

DISCUSSION PAPER NO. 103: "Current uses of the 028 (Publisher Number) and the 037 (Source of Acquisition) in the Bibliographic Format"

Karen Little, representing the Music Library Association, introduced the paper. She gave an example of the publisher of a libretto that used the same number for both the stock number and the publisher number. Where should this number be recorded? In both 028 and 037 fields? Another situation is video numbers; they are now recorded in the 028, but might more accurately belong in the 037, except that indexing is desired. Music librarians began to look at the overall use of these two fields. John Attig asked if it is generally true that the 028 is indexed, but that the 037 is not. Karen replied that this is true generally, but there are important exceptions like VTLS, Melvyl, and RLIN. Melvyl and RLIN index in separate indexes. There was general agreement that the distinction between the two fields should be based upon the inherent nature of the data, and not the indexing capability of the home system, and that the general difference should be the bibliographic significance of the number. Paul asked if the first indicator is used. Yes, for display purposes. Karen reviewed the two options in the discussion paper and said that MLA prefers Option 1. Option 1 has two suboptions, and MLA has no consensus about them as yet. There are concerns depending upon the usage of the current indicators. John reported that OLAC prefers Option 1 also. Paul asked why Option 2 is not favored.
Karen said that there are still some true stock numbers around that are useful to acquisitions staff but have no bibliographic significance. John said that, if field 037 were made obsolete, it would be necessary to move a lot of numbers to the 028. Frank Williams asked if the question is just about the indexing problem in the 037. John said no, the problem is larger than that. He suggested expanding the 028 definition but leaving the 037 field in place. Paul was concerned that John's suggestion would make life more complicated for catalogers. Diane Hillmann said that the instructions can tell catalogers what to do when in doubt. Karen Coyle said that the 028 is a music field, and yet video numbers are now appearing in it because music publishers are branching out. Jacquie said that we are not discussing a general move to put any publisher number in the 028. Robin wasn't so sure, saying that the publisher numbers are heavily used in the music world, and indexing is important. Melvyl indexes both 028 $a and $b together. She spoke against combining the 037 and 028 fields since it will add other material to the index and make it less useful to the music community. She and Diane agreed that there are some limited exceptions where it is useful to put a book number in the 028 field; the example given was a bibliography of a composer by a music publisher. Rebecca suggested a name change to recognize music/score/video, but if in doubt a cataloger should not use the 028 but should use a 500 note instead. David pointed out that there is a special field for a certain subject area. Diane didn't see this as a problem. The consensus was to narrowly extend field 028 to allow for music and videorecording related material.

ACTION: There will be a proposal at Midwinter.

Minutes prepared by Josephine Crawford
December 1997