Authority File Comparison Rules (NACO Normalization)

(9/16/98, revised Feb. 09 2001)

When a new authority record is added to an authority file, each heading is compared against the headings in the file to determine whether the new headings are unique, i.e., adequately differentiated from existing headings. The headings already in the file and the one to be added are normalized before comparison so that only certain characters will be allowed to differentiate between headings. The normalization rules for this uniqueness edit are specified in Appendix A. The rules for comparison are given below.

Rules for Comparison

The following rules apply to all LC/NACO name, LC/NACO series, and LCSH subject (MARC 21 008/11=a) authority records for authorized heading and subdivision records (MARC 21 008/09=a, d, or f), but not to reference records (MARC 21 008/09=b or c). (LC subject headings for children's literature, 008/11=b, are also excluded.)

The files under discussion are built under the rules specified in AACR and in LCSH. The files may be segmented (e.g., name and subject) or combined at a file location. The rules state how to treat the relationships among headings in name authority records and those in subject authority records.

No comparisons are made across name and subject authority records.

The heading strings undergoing comparisons according to these rules are assumed to be normalized according to the attached character normalization table in Appendix A. Comparisons are between tag groups, with the following conventions:

1. Name headings. (Tags 100/110/111/130/151)

1.1 A field 100, 110, 111, 130, 151 may not normalize to the same string as any field 100, 110, 111, 130, 151 in another record.

1.2 A field 500, 510, 511, 530, 551 must normalize to the same string as a 100, 110, 111, 130, 151 in the same tag group in another record.

1.3 A field 400, 410, 411, 430, 451 may not normalize to the same string as any 100, 110, 111, 130, 151 in the same or another record. (Note a)

1.4 A field 4XX may normalize to the same string as another 4XX in the same or another record.

1.5 A field 7XX is not compared.

Note a: If on input, a 4XX normalizes to the same string as a 1XX in the same or another record, then one of the two is qualified.

Exceptions: (1) if it is not possible to qualify a conflicting 100 or 400, then the 400 is changed to a 500; (2) when a 4XX is a pre-AACR2 reference, the reference is deleted, since such 4XXs cannot be qualified or changed into 5XXs.
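
Rules 1.1 through 1.5 can be sketched as a small check, shown here in Python. This is an illustration only, not LC's implementation: the function name `check_name_record` and the representation of a record as a list of (tag, normalized string) pairs are hypothetical. It also assumes that "the same tag group" in rule 1.2 means the whole name group 100/110/111/130/151, and that every string has already been normalized per Appendix A.

```python
NAME_1XX = {"100", "110", "111", "130", "151"}
NAME_4XX = {"400", "410", "411", "430", "451"}
NAME_5XX = {"500", "510", "511", "530", "551"}


def check_name_record(new_fields, other_records):
    """Return (rule, tag, string) tuples for each rule the new record breaks.

    new_fields: list of (tag, normalized_string) for the incoming record.
    other_records: list of such lists, one per record already in the file.
    Both shapes are assumptions made for this sketch.
    """
    other_1xx = {s for rec in other_records for tag, s in rec if tag in NAME_1XX}
    other_4xx = {s for rec in other_records for tag, s in rec if tag in NAME_4XX}
    own_1xx = {s for tag, s in new_fields if tag in NAME_1XX}

    problems = []
    for tag, s in new_fields:
        if tag in NAME_1XX:
            if s in other_1xx:          # rule 1.1: 1XX duplicates another 1XX
                problems.append(("1.1", tag, s))
            if s in other_4xx:          # rule 1.3, seen from the 1XX side
                problems.append(("1.3", tag, s))
        elif tag in NAME_4XX:
            if s in other_1xx or s in own_1xx:   # rule 1.3
                problems.append(("1.3", tag, s))
            # rule 1.4: a 4XX matching another 4XX is allowed, so no check
        elif tag in NAME_5XX:
            if s not in other_1xx:      # rule 1.2: 5XX must match a 1XX elsewhere
                problems.append(("1.2", tag, s))
        # rule 1.5: 7XX (and any other tag) is not compared
    return problems
```

A caller would still have to apply Note a and its exceptions by hand or by further logic; the sketch only reports which rule a field violates.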

2. Subject headings. (Tags 100/110/111/130/151/150/155/180/181/182/185) (Note b)

2.1 A field 100, 110, 111, 130, 151 may not normalize to the same string as any field 1XX in another record. (Note c)

2.2 A field 150, 155, 180, 181, 182, 185 may not normalize to the same string in that same tag group in another record.

2.3 A field 500, 510, 511, 530, 550, 551, 555, 580, 581, 582, 585 must normalize to the same string as a 100, 110, 111, 130, 150, 151, 155, 180, 181, 182, 185 in the same tag group in another record.

2.4 A field 400, 410, 411, 430, 451 may not normalize to the same string as any 1XX in the same or another record; a field 450, 455, 480, 481, 482, 485 may not normalize to the same string as a 150, 155, 180, 181, 182, 185 in the same tag group, in the same or another record.

2.5 A field 4XX may normalize to the same string as another 4XX in the same or another record.

2.6 A field 7XX is not compared.

Note b: These rules also apply to subject headings for children (008/11=b) when comparing within that thesaurus. Headings for children are not compared with LCSH headings.

Note c: Fields 111 and 130 are strings duplicated from name headings. Records with these headings may contain a reference structure appropriate to subjects. LC does not plan to create 181 headings that conflict with 151 headings (geographic subdivision forms will be recorded in field 781 if appropriate).
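
One way to read rules 2.1 through 2.6 is as a lookup from a field's tag to the set of 1XX tags it is compared against. The Python sketch below takes that reading. It is an interpretation, not a definitive statement of the rules: the name `subject_population` is hypothetical, and the sketch assumes that "the same tag group" refers to one of two groups, 100/110/111/130/151 on the one hand and 150/155/180/181/182/185 on the other.

```python
NAME_LIKE = frozenset({"100", "110", "111", "130", "151"})
TOPICAL = frozenset({"150", "155", "180", "181", "182", "185"})
ALL_1XX = NAME_LIKE | TOPICAL


def subject_population(tag):
    """Return the 1XX tags that a subject-file field is compared against.

    For 1XX and 4XX fields the result is the population the field must
    NOT match; for 5XX fields it is the population the field MUST match.
    """
    kind, base = tag[0], "1" + tag[1:]
    if kind == "7":
        return frozenset()            # rule 2.6: 7XX is not compared
    if kind not in "145" or base not in ALL_1XX:
        raise ValueError(f"tag {tag} is not covered by the subject rules")
    if kind == "5":
        # rule 2.3: must match a 1XX in the same tag group (assumed grouping)
        return NAME_LIKE if base in NAME_LIKE else TOPICAL
    # rules 2.1 and 2.4 (name-like tags): compared against every 1XX
    # rules 2.2 and 2.4 (topical tags): compared within the topical group
    return ALL_1XX if base in NAME_LIKE else TOPICAL
```

Under this reading a 151 conflicts with a matching 181, which is consistent with Note c, while a 150 is only checked against the 150/155/18X group.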

Appendix A

(A list of new MARC 21 characters to be added to these tables is available on the CPSO web site)

Normalization Table (9/16/98, revised Feb. 09 2001)
Character:	Converted Value:	Comments: (Blank = space)
LEADING BLANKS	Delete
TRAILING BLANKS	Delete
MULTIPLE BLANKS	Blank	Compress to a single blank.
LETTERS		Convert all letters to the same case (upper or lower).
DIACRITICS	Delete	Miagkiy znak, alif, ayn, tverdyi znak, pseudo question mark, grave, acute, circumflex, tilde, macron, breve, superior dot, umlaut or dieresis, hacek, angstrom, ligature, high comma off center, double acute, candrabindu, cedilla, right hook, dot below character, double dot below character, circle below character, double underscore, underscore, left hook, right cedilla, double tilde, high comma centered, middle dot, upadhmaniya
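
As a rough illustration of the table above, the following Python sketch deletes leading and trailing blanks, compresses multiple blanks, folds case, and deletes diacritics. It is a sketch under stated assumptions rather than a complete implementation: the function name `basic_normalize` is hypothetical, the input is assumed to be Unicode rather than MARC-8, combining diacritics are identified by Unicode category Mn after NFD decomposition, and the code points chosen for the spacing marks (miagkiy znak, tverdyi znak, ayn, alif, middle dot) are an assumption about how those MARC characters are represented.

```python
import re
import unicodedata

# Spacing characters that the table lists under DIACRITICS.
# Assumed code points: U+02B9 miagkiy znak, U+02BA tverdyi znak,
# U+02BB ayn, U+02BC alif, U+00B7 middle dot.
SPACING_MARKS = "\u02b9\u02ba\u02bb\u02bc\u00b7"


def basic_normalize(text):
    """Apply the blank, case, and diacritic rows of the normalization table."""
    # Split each accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Delete combining marks (category Mn) and the listed spacing marks.
    kept = "".join(
        ch for ch in decomposed
        if unicodedata.category(ch) != "Mn" and ch not in SPACING_MARKS
    )
    # Compress runs of blanks, then delete leading and trailing blanks.
    collapsed = re.sub(r" +", " ", kept).strip(" ")
    # The table allows either case; upper case is chosen here.
    return collapsed.upper()
```

For example, `basic_normalize("  Dvořák,   Antonín ")` returns `"DVORAK, ANTONIN"`.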

Translated Characters
Character:	Converted Value:	Comments:
SUPERSCRIPT NUMBERS	Numbers	Convert to non-superscript equivalent.
SUBSCRIPT NUMBERS	Numbers	Convert to non-subscript equivalent.
AE DIGRAPH	"AE"
OE DIGRAPH	"OE"
CROSSED D	"D"
ETH	"D"
TURKISH I	"I"
POLISH L	"L"
SCRIPT L	"L"
HOOKED O	"O"
HOOKED U	"U"
SLASHED O	"O"
ICELANDIC THORN	"TH"
ALPHA	"A"
BETA	"B"
GAMMA	"G"
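
The translations above lend themselves to a simple lookup table. The Python sketch below is one possible rendering, assuming Unicode input; the names `TRANSLATED`, `DIGITS`, and `translate_characters` are hypothetical, and the choice of Unicode characters standing in for each MARC character (for example U+2113 for script L and U+0131 for Turkish i) is an assumption.

```python
# Letter-like characters and their converted values, per the table.
TRANSLATED = {
    "\u00c6": "AE", "\u00e6": "AE",   # AE digraph
    "\u0152": "OE", "\u0153": "OE",   # OE digraph
    "\u0110": "D", "\u0111": "D",     # crossed D
    "\u00d0": "D", "\u00f0": "D",     # eth
    "\u0131": "I",                    # Turkish (dotless) i
    "\u0141": "L", "\u0142": "L",     # Polish L
    "\u2113": "L",                    # script l
    "\u01a0": "O", "\u01a1": "O",     # hooked O
    "\u01af": "U", "\u01b0": "U",     # hooked U
    "\u00d8": "O", "\u00f8": "O",     # slashed O
    "\u00de": "TH", "\u00fe": "TH",   # Icelandic thorn
    "\u03b1": "A", "\u03b2": "B", "\u03b3": "G",  # alpha, beta, gamma
}

# Superscript and subscript digits 0-9 mapped to plain digits.
DIGITS = str.maketrans(
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
    "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089",
    "01234567890123456789",
)


def translate_characters(text):
    """Replace each translated character with its converted value."""
    replaced = "".join(TRANSLATED.get(ch, ch) for ch in text)
    return replaced.translate(DIGITS)
```

Case folding of the remaining letters is a separate step, so `translate_characters("H₂O")` returns `"H2O"` and leaves the ordinary letters untouched.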

Punctuation
Character:	Converted Value:	Comments: (Blank = space)
!	Blank
"	Blank
APOSTROPHE '	Delete
LEFT PAREN. (	Blank
RIGHT PAREN. )	Blank
HYPHEN -	Blank
LEFT BRACKET [	Delete
RGT. BRACKET ]	Delete
LEFT BRACE { (Opening curly bracket)	Blank
RIGHT BRACE } (Closing curly bracket)	Blank
LEFT ANGLE BRACKET <	Blank
RIGHT ANGLE BRACKET >	Blank
SEMICOLON ;	Blank
COLON :	Blank
PERIOD .	Blank
QUESTION MARK ?	Blank
INVERTED QUESTION MARK ¿	Blank
INVERTED EXCLAMATION MARK ¡	Blank
COMMA ,	"," or Blank	The first comma in the $a subfield (unless it is a terminal comma) is taken into account. In all other cases commas are converted to blank.
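
The comma rule is the only context-dependent entry in the punctuation table, so a short sketch may help. The Python function below handles one subfield at a time. The name `normalize_punctuation` and its `is_subfield_a` flag are hypothetical, and the sketch assumes that a "terminal comma" is one followed by nothing but blanks in the subfield.

```python
BLANKED = set('!"()-{}<>;:.?\u00bf\u00a1')   # converted to a blank
DELETED = set("'[]")                          # removed outright


def normalize_punctuation(text, is_subfield_a=False):
    """Apply the punctuation table to the text of a single subfield."""
    keep_at = -1  # index of the one comma that is retained, if any
    if is_subfield_a:
        first = text.find(",")
        # Retain the first comma in $a unless it is terminal
        # (assumed: nothing but blanks follows it).
        if first != -1 and text[first + 1:].strip():
            keep_at = first

    out = []
    for i, ch in enumerate(text):
        if ch in DELETED:
            continue
        if ch == ",":
            out.append("," if i == keep_at else " ")
        elif ch in BLANKED:
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)
```

The result can contain runs of blanks and a trailing blank; those are left for the blank-compression rules of the normalization table, which this sketch deliberately does not repeat.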

Other Numbers and Special Characters
Character:	Converted Value:	Comments:
DATES	Numbers	(Retain unchanged.)
ROMAN NUMERALS	Letters	(Retain unchanged.)
NUMBERS	Numbers	(Retain unchanged.)
FLAT SIGN	flat sign	(Retain unchanged.)
SHARP SIGN/HATCH #	#	(Retain unchanged.)
SLASH /	Blank
BACKWARD SLASH \	Blank
AT SIGN @	Blank
AMPERSAND &	&	Normalization changed from blank to "&" in 12/1990.
ASTERISK *	Blank
VERTICAL BAR |	Delete
PERCENT %	Blank
EQUALS =	Blank
PLUS +	+	Normalization changed from blank to "+" in 9/1998.
PLUS AND MINUS	Blank
LOGICAL "OR"	Blank
LOGICAL "NOT"	Blank
SUPERSCRIPT + - ( )	Blank
SUBSCRIPT + - ( )	Blank
PATENT SIGN ®	Blank
PHONORECORD SYMBOL	Blank
COPYRIGHT SYMBOL ©	Blank
DOLLAR SIGN $	Blank
BRITISH POUND SIGN £	Blank
DEGREE SIGN °	Blank
SPACING CIRCUMFLEX ^	Blank
SPACING UNDERSCORE _	Blank
SPACING GRAVE	Blank
SPACING TILDE ~	Blank
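
Because every entry in this table is a fixed one-character conversion, it can be expressed as a single translation table. The Python sketch below uses `str.maketrans` for that purpose. The names are hypothetical, the Unicode code points chosen for the MARC characters are assumptions, and the logical "OR" sign is omitted because its Unicode equivalent is not certain. Characters the table retains (digits, letters, flat sign, #, &, +) are simply absent from the map.

```python
# Characters converted to a blank.
BLANKED = (
    "/\\@*%=$^_`~"
    "\u00b1"                              # plus and minus
    "\u00ac"                              # logical "NOT"
    "\u207a\u207b\u207d\u207e"            # superscript + - ( )
    "\u208a\u208b\u208d\u208e"            # subscript + - ( )
    "\u00ae\u2117\u00a9"                  # patent, phonorecord, copyright
    "\u00a3\u00b0"                        # pound sign, degree sign
)
DELETED = "|"                             # vertical bar is removed outright

SPECIALS = str.maketrans(BLANKED, " " * len(BLANKED), DELETED)


def normalize_specials(text):
    """Apply the special-character table; unlisted characters pass through."""
    return text.translate(SPECIALS)
```

For instance, `normalize_specials("a/b")` returns `"a b"`, while `"R&D"` and `"C++"` come back unchanged.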

Content Designation
Character:	Converted Value:	Comments:
INDICATORS	Delete
SUBFIELD DELIMITERS	Hex 1F	Subfield delimiters are retained, except for the first delimiter, which precedes the data string and is dropped.
SUBFIELD CODES	Delete
TAG		The tag is used to point to the population against which a normalized string is to be compared for a uniqueness edit (see Authority File Comparison Rules).
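
The content designation rules describe how a field becomes a comparison string. A minimal Python sketch follows, assuming a field is given as a tag, an indicator string, and a list of (subfield code, text) pairs whose text has already been normalized; the function name `comparison_key` and this input shape are hypothetical.

```python
DELIMITER = "\x1f"  # hex 1F, the MARC subfield delimiter


def comparison_key(tag, indicators, subfields):
    """Build (tag, data string) for the uniqueness comparison.

    Indicators and subfield codes are deleted. A delimiter is kept
    between subfields, but none precedes the first subfield.
    """
    # `indicators` is accepted only to show that it is discarded.
    data = DELIMITER.join(text for _code, text in subfields)
    # The tag is returned alongside the string because it selects the
    # population the string is compared against; it is not part of the
    # string itself.
    return tag, data
```

Under these assumptions, `comparison_key("100", "1 ", [("a", "SMITH, JOHN"), ("d", "1900 1980")])` returns `("100", "SMITH, JOHN\x1f1900 1980")`.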

(prepared by LC Network Development and MARC Standards Office)

January 3, 2008
