NACO: Program for Cooperative Cataloging

Authority File Comparison Rules (NACO Normalization)

(9/16/98, revised Feb. 09 2001)

When a new authority record is added to an authority file, each heading is compared against the headings in the file to determine whether the new headings are unique, i.e., adequately differentiated from existing headings. The headings already in the file and the one to be added are normalized before comparison so that only certain characters will be allowed to differentiate between headings. The normalization rules for this uniqueness edit are specified in Appendix A. The rules for comparison are given below.


Rules for Comparison

The following rules apply to all LC/NACO name, LC/NACO series, and LCSH subject (MARC 21 008/11=a) authority records for authorized heading and subdivision records (MARC 21 008/09=a, d, or f), but not to reference records (MARC 21 008/09=b or c). (LC subject headings for children's literature are also excluded, 008/11=b.)

The files under discussion are built under the rules specified in AACR and in LCSH. The files may be segmented (e.g., name and subject) or combined at a file location. The rules state how to treat the relationships among headings in name authority records and those in subject authority records.

No comparisons are made across name and subject authority records.

The heading strings undergoing comparisons according to these rules are assumed to be normalized according to the attached character normalization table in Appendix A. Comparisons are between tag groups, with the following conventions:

1. Name headings. (Tags 100/110/111/130/151)

1.1 A field 100, 110, 111, 130, 151 may not normalize to any field 100, 110, 111, 130, 151 in another record.

1.2 A field 500, 510, 511, 530, 551 must normalize to the same string as a 100, 110, 111, 130, 151 in the same tag group in another record.

1.3 A field 400, 410, 411, 430, 451 may not normalize to the same string as any 100, 110, 111, 130, 151 in the same or another record. (Note a)

1.4 A field 4XX may normalize to the same string as another 4XX in the same or another record.

1.5 A field 7XX is not compared.

Note a: If on input, a 4XX normalizes to the same string as a 1XX in the same or another record, then one of the two is qualified.

Exceptions: (1) if it is not possible to qualify a conflicting 100 or 400, then the 400 is changed to a 500; (2) when a 4XX is a pre-AACR2 reference, the reference is deleted, since such 4XXs cannot be qualified or changed into 5XXs.
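
Rules 1.1 through 1.5 can be sketched as a lookup against an index of normalized 1XX strings. The function name, the record identifiers, and the index layout below are hypothetical and not part of the LC specification. The sketch also reads "the same tag group" in rule 1.2 as the whole set of name tags, which is an interpretation rather than something the rules state outright.

```python
NAME_1XX = {"100", "110", "111", "130", "151"}
NAME_4XX = {"400", "410", "411", "430", "451"}
NAME_5XX = {"500", "510", "511", "530", "551"}


def check_name_record(record_id, fields, index_1xx):
    """Return a list of (rule, tag, normalized_string) violations.

    fields    -- list of (tag, normalized_string) pairs for one record
    index_1xx -- dict mapping a normalized 1XX string to the set of record
                 ids that carry it (hypothetical index over the whole file)
    """
    problems = []
    for tag, norm in fields:
        owners = index_1xx.get(norm, set())
        others = owners - {record_id}
        if tag in NAME_1XX and others:
            problems.append(("1.1", tag, norm))   # duplicates a 1XX elsewhere
        elif tag in NAME_5XX and not others:
            problems.append(("1.2", tag, norm))   # no matching 1XX elsewhere
        elif tag in NAME_4XX and owners:
            problems.append(("1.3", tag, norm))   # matches a 1XX in any record
        # 4XX against 4XX (rule 1.4) and 7XX (rule 1.5) are never checked
    return problems
```

A violation of rule 1.3 is where Note a applies: the function only reports the conflict, and qualifying one of the two headings remains a cataloger's decision.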

2. Subject headings. (Tags 100/110/111/130/151/150/155/180/181/182/185) (Note b)

2.1 A field 100, 110, 111, 130, 151 may not normalize to any field 1XX in another record. (Note c)

2.2 A field 150, 155, 180, 181, 182, 185 may not normalize to the same string in that same tag group in another record.

2.3 A field 500, 510, 511, 530, 550, 551, 555, 580, 581, 582, 585 must normalize to the same string as a 100, 110, 111, 130, 150, 151, 155, 180, 181, 182, 185 in the same tag group in another record.

2.4 A field 400, 410, 411, 430, 451 may not normalize to the same string as any 1XX in the same or another record; a field 450, 455, 480, 481, 482, 485 may not normalize to the same string as a 150, 155, 180, 181, 182, 185 in the same tag group, in the same or another record.

2.5 A field 4XX may normalize to the same string as another 4XX in the same or another record.

2.6 A field 7XX is not compared.

Note b: These rules also apply to subject headings for children (008/11=b) when comparing within that thesaurus. Headings for children are not compared with LCSH headings.

Note c: Fields 111 and 130 are strings duplicated from name headings. Records with these headings may contain a reference structure appropriate to subjects. LC does not plan to create 181 headings that conflict with 151 headings (geographic subdivision forms will be recorded in field 781 if appropriate).
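
The scoping statements above (no comparison across name and subject records, children's headings compared only among themselves, reference records skipped) amount to assigning each record to a comparison population before any strings are compared. A minimal sketch follows, assuming the caller already knows whether a record belongs to the name or the subject file. The `file_kind` label and the function name are hypothetical; only the 008/09 and 008/11 values come from the rules.

```python
HEADING_KINDS = "adf"   # 008/09 values for authorized heading and subdivision records


def comparison_pool(file_kind, field_008):
    """Return a key naming the population a record is compared within,
    or None when the record is outside the scope of the rules.

    file_kind -- "name" or "subject" (hypothetical label supplied by the caller)
    field_008 -- the record's 008 field as a string
    """
    if field_008[9] not in HEADING_KINDS:
        return None                          # reference records (b or c) are skipped
    if file_kind == "subject":
        # 008/11: "a" is LCSH, "b" is subject headings for children
        return ("subject", field_008[11])
    return ("name",)
```

Two records are candidates for comparison only when their keys are equal, so an LCSH heading and a children's heading with identical normalized strings never meet.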


Appendix A

(A list of new MARC 21 characters to be added to these tables is available on the CPSO web site)

Normalization Table

(9/16/98, revised Feb. 09 2001)
Character: Converted Value: Comments:
(Blank = space)
LEADING BLANKS Delete  
TRAILING BLANKS Delete  
MULTIPLE BLANKS Blank Compress to a single blank.
LETTERS   Convert all letters to the same case (upper or lower).
DIACRITICS Delete Miagkiy znak, alif, ayn, tverdyi znak, pseudo question mark, grave, acute, circumflex, tilde, macron, breve, superior dot, umlaut or dieresis, hacek, angstrom, ligature, high comma off center, double acute, candrabindu, cedilla, right hook, dot below character, double dot below character, circle below character, double underscore, underscore, left hook, right cedilla, double tilde, high comma centered, middle dot, upadhmaniya
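
The rows above can be approximated for Unicode text as follows. This is a sketch, not the LC implementation: it assumes the headings have already been converted from MARC-8 to Unicode, that most listed diacritics appear as combining marks after NFD decomposition, and that the modifier characters map to the code points shown (an assumption about the MARC-to-Unicode mapping). Upper case is chosen arbitrarily, since the table allows either case.

```python
import unicodedata

# Spacing characters the table lists among the deleted diacritics.
# The code point assignments are assumptions about the Unicode mapping.
DELETED_MARKS = {
    "\u02b9",  # miagkiy znak (soft sign)
    "\u02ba",  # tverdyi znak (hard sign)
    "\u02bc",  # alif
    "\u02bb",  # ayn
    "\u00b7",  # middle dot
}


def basic_normalize(text):
    """Delete diacritics, fold case, and trim and compress blanks."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch for ch in decomposed
        if not unicodedata.combining(ch) and ch not in DELETED_MARKS
    ]
    # str.split() with no argument drops leading and trailing blanks and
    # treats any run of blanks as one separator.
    return " ".join("".join(kept).upper().split())
```

For example, `basic_normalize("  Gödel,   Kurt ")` yields `"GODEL, KURT"`.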

Translated Characters

Character: Converted Value: Comments:
SUPERSCRIPT NUMBERS Numbers Convert to non-superscript equivalent.
SUBSCRIPT NUMBERS Numbers Convert to non-subscript equivalent.
AE DIGRAPH "AE"  
OE DIGRAPH "OE"  
CROSSED D "D"  
ETH "D"  
TURKISH I "I"  
POLISH L "L"  
SCRIPT L "L"  
HOOKED O "O"  
HOOKED U "U"  
SLASHED O "O"  
ICELANDIC THORN "TH"  
ALPHA "A"  
BETA "B"  
GAMMA "G"  
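
The translated characters lend themselves to a simple lookup table. The sketch below assumes Unicode input; the code points chosen for each MARC character name are assumptions, and both upper- and lower-case forms are mapped where Unicode has both.

```python
LETTERS = {
    "\u00c6": "AE", "\u00e6": "AE",   # AE digraph
    "\u0152": "OE", "\u0153": "OE",   # OE digraph
    "\u0110": "D", "\u0111": "D",     # crossed D
    "\u00d0": "D", "\u00f0": "D",     # eth
    "\u0131": "I",                    # Turkish dotless i
    "\u0141": "L", "\u0142": "L",     # Polish L
    "\u2113": "L",                    # script l
    "\u01a0": "O", "\u01a1": "O",     # hooked O
    "\u01af": "U", "\u01b0": "U",     # hooked U
    "\u00d8": "O", "\u00f8": "O",     # slashed O
    "\u00de": "TH", "\u00fe": "TH",   # Icelandic thorn
    "\u03b1": "A", "\u03b2": "B", "\u03b3": "G",  # alpha, beta, gamma
}
SUPERSCRIPTS = "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
SUBSCRIPTS = "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089"

MAPPING = dict(LETTERS)
for digits in (SUPERSCRIPTS, SUBSCRIPTS):
    # Position in the string is the digit value: index 0 maps to "0", etc.
    MAPPING.update({ch: str(value) for value, ch in enumerate(digits)})

TRANSLATION_TABLE = str.maketrans(MAPPING)


def translate_characters(text):
    """Replace each translated character with its converted value."""
    return text.translate(TRANSLATION_TABLE)
```

Characters outside the table pass through unchanged, so this step can run before or after diacritic deletion and case folding.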

Punctuation

Character: Converted Value: Comments:
(Blank = space)
EXCLAMATION MARK ! Blank
QUOTATION MARK " Blank
APOSTROPHE ' Delete  
LEFT PAREN. ( Blank  
RIGHT PAREN. ) Blank  
HYPHEN - Blank  
LEFT BRACKET [ Delete  
RGT. BRACKET ] Delete  
LEFT BRACE { (opening curly bracket) Blank
RIGHT BRACE } (closing curly bracket) Blank
LEFT ANGLE BRACKET < Blank
RIGHT ANGLE BRACKET > Blank
SEMICOLON ; Blank  
COLON : Blank  
PERIOD . Blank  
QUESTION MARK ? Blank  
INVERTED QUESTION MARK ¿ Blank  
INVERTED EXCLAMATION MARK ¡ Blank  
COMMA , "," or Blank The first comma in the $a subfield (unless it is a terminal comma) is taken into account. In all other cases commas are converted to blank.
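
The comma rule is the one context-sensitive entry in the punctuation table, so a sketch may help. The function below is illustrative only: its name and the `is_subfield_a` flag are hypothetical, and it treats a comma as terminal when nothing but blanks follows it in the subfield, which is one reading of "terminal comma". It also compresses the resulting blanks, as the normalization table requires.

```python
TO_BLANK = '!"()-{}<>;:.?\u00bf\u00a1'   # converted to a blank
TO_DELETE = "'[]"                        # removed outright
PUNCTUATION_TABLE = str.maketrans(
    {**{c: " " for c in TO_BLANK}, **{c: None for c in TO_DELETE}}
)


def normalize_punctuation(value, is_subfield_a=False):
    """Apply the punctuation table to the text of one subfield."""
    keep_at = -1
    if is_subfield_a:
        first = value.find(",")
        # Keep the first comma only when it is not terminal, i.e. when
        # something other than blanks follows it.
        if first != -1 and value[first + 1:].strip():
            keep_at = first
    chars = []
    for position, ch in enumerate(value):
        if ch == ",":
            chars.append("," if position == keep_at else " ")
        else:
            chars.append(ch)
    cleaned = "".join(chars).translate(PUNCTUATION_TABLE)
    return " ".join(cleaned.split())     # trim and compress blanks
```

With this reading, `normalize_punctuation("Smith, John,", is_subfield_a=True)` keeps the first comma and blanks the second, giving `"Smith, John"`, while the same text in any other subfield loses both commas.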

Other Numbers and Special Characters

Character: Converted Value: Comments:
DATES Numbers (Retain unchanged.)
ROMAN NUMERALS Letters (Retain unchanged.)
NUMBERS Numbers (Retain unchanged.)
FLAT SIGN flat sign (Retain unchanged.)
SHARP SIGN/HATCH # # (Retain unchanged.)
SLASH / Blank  
BACKWARD SLASH \ Blank  
AT SIGN @ Blank  
AMPERSAND & & Normalization changed from blank to "&" in 12/1990.
ASTERISK * Blank  
VERTICAL BAR | Delete  
PERCENT % Blank  
EQUALS = Blank  
PLUS + + Normalization changed from blank to "+" in 9/1998.
PLUS AND MINUS ± Blank
LOGICAL "OR" Blank
LOGICAL "NOT" ¬ Blank
SUPERSCRIPT + - ( ) Blank  
SUBSCRIPT + - ( ) Blank  
PATENT SIGN ® Blank
PHONORECORD SYMBOL Blank
COPYRIGHT SYMBOL © Blank
DOLLAR SIGN $ Blank
BRITISH POUND SIGN £ Blank
DEGREE SIGN ° Blank
SPACING CIRCUMFLEX ^ Blank
SPACING UNDERSCORE _ Blank
SPACING GRAVE Blank
SPACING TILDE ~ Blank
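
What distinguishes this table from the punctuation table is the small set of symbols that survive normalization: the ampersand, plus sign, sharp sign, and flat sign. A brief sketch under the same Unicode assumptions as before; the code points for the patent, phonorecord, and superscript and subscript signs are assumptions, and the logical "or" character is left out because the table does not show which character it is.

```python
RETAINED = "&+#\u266d"     # ampersand, plus, sharp/hatch, flat sign
TO_BLANK = (
    "/\\@*%=$^_`~"
    "\u00b1\u00ac\u00ae\u2117\u00a9\u00a3\u00b0"   # plus-minus, not, patent, phonorecord, copyright, pound, degree
    "\u207a\u207b\u207d\u207e"                     # superscript + - ( )
    "\u208a\u208b\u208d\u208e"                     # subscript + - ( )
)
TO_DELETE = "|"            # vertical bar is removed, not blanked

SPECIALS_TABLE = str.maketrans(
    {**{c: " " for c in TO_BLANK}, **{c: None for c in TO_DELETE}}
)


def normalize_specials(text):
    """Blank or delete special characters; digits and RETAINED pass through."""
    return " ".join(text.translate(SPECIALS_TABLE).split())
```

Because "&" and "+" are retained, headings such as "AT&T" and "C++" stay distinct from "AT T" and "C".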

Content Designation

Character: Converted Value: Comments:
INDICATORS Delete  
SUBFIELD DELIMITERS Hex 1F Subfield delimiters are retained, except the first delimiter (the one that precedes the data string), which is dropped.
SUBFIELD CODES Delete  
TAG   The tag is used to point to the population against which a normalized string is to be compared for a uniqueness edit (see Authority File Comparison Rules).
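
As an illustration of the content designation rows, the sketch below strips a raw variable field down to its comparison string. It assumes the field is supplied as two indicator characters followed by subfields, each introduced by the hex 1F delimiter and a one-character subfield code. The function name is hypothetical, and character-level normalization of the subfield data is left to a separate step.

```python
DELIMITER = "\x1f"   # MARC subfield delimiter (hex 1F)


def strip_content_designation(raw_field):
    """Drop indicators, subfield codes, and the first subfield delimiter."""
    body = raw_field[2:]                   # the two indicators are deleted
    pieces = body.split(DELIMITER)[1:]     # nothing precedes the first delimiter
    # The first character of each piece is the subfield code; drop it and
    # rejoin the data with the delimiters that are retained.
    return DELIMITER.join(piece[1:] for piece in pieces)
```

For the field `"1 \x1faSmith, John,\x1fd1900-"` this returns `"Smith, John,\x1f1900-"`. The tag itself is not part of the string; it only selects the population the string is compared against.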

(prepared by LC Network Development and MARC Standards Office)

  January 3, 2008