Authority File Comparison Rules (NACO Normalization)
(9/16/98, revised Feb. 09 2001)
When a new authority record is added to an authority file, each
heading is compared against the headings in the file to determine
whether the new headings are unique, i.e., adequately differentiated
from existing headings. The headings already in the file and the
one to be added are normalized before comparison so that only certain
characters will be allowed to differentiate between headings. The
normalization rules for this uniqueness edit are specified in Appendix
A. The rules for comparison are given below.
Rules for Comparison
The following rules apply to all LC/NACO name, LC/NACO series,
and LCSH subject (MARC 21 008/11=a) authority records for authorized
heading and subdivision records (MARC 21 008/09=a, d, or f), but
not to reference records (MARC 21 008/09=b or c). (LC subject headings
for children's literature, 008/11=b, are also excluded.)
The files under discussion are built under the rules specified
in AACR and in LCSH. The files may be segmented (e.g., name and
subject) or combined at a file location. The rules state how to
treat the relationships among headings in name authority records
and those in subject authority records.
No comparisons are made across name and subject authority
records.
The heading strings undergoing comparisons according to these
rules are assumed to be normalized according to the attached character
normalization table in Appendix A. Comparisons
are between tag groups, with the following conventions:
1. Name headings. (Tags 100/110/111/130/151)
1.1 A field 100, 110, 111, 130, 151 may not
normalize to the same string as any field 100, 110, 111, 130, 151
in another record.
1.2 A field 500, 510, 511, 530, 551 must
normalize to the same string as a 100, 110, 111, 130, 151 in the
same tag group in another record.
1.3 A field 400, 410, 411, 430, 451 may not
normalize to the same string as any 100, 110, 111, 130, 151
in the same or another record. (Note a)
1.4 A field 4XX may normalize to the
same string as another 4XX in the same or another record.
1.5 A field 7XX is not compared.
Note a: If on input, a 4XX normalizes to the
same string as a 1XX in the same or another record, then one of
the two is qualified.
Exceptions: (1) if it is not possible to qualify
a conflicting 100 or 400, then the 400 is changed to a 500; (2)
when a 4XX is a pre-AACR2 reference, the reference is deleted,
since such 4XXs cannot be qualified or changed into 5XXs.
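The name heading rules above can be sketched as a small check. This is a minimal illustration, not LC's implementation: the record shape and the function name check_name_record are hypothetical, the strings are assumed to be normalized already, and "same tag group" in rule 1.2 is read here as "same last two digits of the tag" (a 510 must match a 110), which is an interpretation rather than something the rules state outright.

```python
NAME_1XX = {"100", "110", "111", "130", "151"}
NAME_4XX = {"400", "410", "411", "430", "451"}
NAME_5XX = {"500", "510", "511", "530", "551"}


def check_name_record(new_record, other_records):
    """Return (rule, tag, string) tuples for each violation in new_record.

    Hypothetical record shape: {"fields": [(tag, normalized_string), ...]}.
    other_records must not include new_record itself.
    """
    # Map each normalized 1XX string in the file to the tags that carry it.
    file_1xx = {}
    for record in other_records:
        for tag, text in record["fields"]:
            if tag in NAME_1XX:
                file_1xx.setdefault(text, set()).add(tag)

    own_1xx = {text for tag, text in new_record["fields"] if tag in NAME_1XX}
    problems = []
    for tag, text in new_record["fields"]:
        if tag in NAME_1XX and text in file_1xx:
            problems.append(("1.1", tag, text))
        elif tag in NAME_5XX and "1" + tag[1:] not in file_1xx.get(text, set()):
            # Assumed reading: a 5XX must match the 1XX with the same last two digits.
            problems.append(("1.2", tag, text))
        elif tag in NAME_4XX and (text in file_1xx or text in own_1xx):
            problems.append(("1.3", tag, text))
        # Rule 1.4: 4XX against 4XX is allowed. Rule 1.5: 7XX is not compared.
    return problems
```

A reported 1.3 violation would then be resolved by a cataloger as described in Note a (qualifying one of the two headings); the sketch only detects the conflict.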
2. Subject headings. (Tags 100/110/111/130/151/150/155/180/181/182/185)
(Note b)
2.1 A field 100, 110, 111, 130, 151 may
not normalize to the same string as any field 1XX in another
record. (Note c)
2.2 A field 150, 155, 180, 181, 182, 185 may
not normalize to the same string in that same tag group
in another record.
2.3 A field 500, 510, 511, 530, 550, 551, 555,
580, 581, 582, 585 must normalize to the same string as
a 100, 110, 111, 130, 150, 151, 155, 180, 181, 182, 185 in the
same tag group in another record.
2.4 A field 400, 410, 411, 430, 451 may not
normalize to the same string as any 1XX in the same or another
record; a field 450, 455, 480, 481, 482, 485 may not normalize
to the same string as a 150, 155, 180, 181, 182, 185 in the same
tag group, in the same or another record.
2.5 A field 4XX may normalize to the
same string as another 4XX in the same or another record.
2.6 A field 7XX is not compared.
Note b: These rules also apply to subject headings
for children (008/11=b) when comparing within that thesaurus.
Headings for children are not compared with LCSH headings.
Note c: Fields 111 and 130 are strings duplicated
from name headings. Records with these headings may contain a
reference structure appropriate to subjects. LC does not plan
to create 181 headings that conflict with 151 headings (geographic
subdivision forms will be recorded in field 781 if appropriate).
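One way to picture the subject rules is as a lookup from a field's tag to the population of 1XX tags it is compared against. The sketch below is illustrative only: subject_rule is a hypothetical name, and it assumes that "same tag group" in rules 2.2 through 2.4 means the 1XX tag sharing the field's last two digits (so a 550 is compared with 150 headings, not with 180 headings). That reading is an assumption, not a statement of LC practice.

```python
NAME_TAGS = {"100", "110", "111", "130", "151"}
TOPIC_TAGS = {"150", "155", "180", "181", "182", "185"}
ALL_1XX = NAME_TAGS | TOPIC_TAGS


def subject_rule(tag):
    """Return (rule number, requirement, 1XX tags to compare against)."""
    counterpart = "1" + tag[1:]  # assumed: group = same last two digits
    if tag[0] == "7":
        return ("2.6", "not compared", set())
    if tag in NAME_TAGS:
        return ("2.1", "must differ", ALL_1XX)
    if tag in TOPIC_TAGS:
        return ("2.2", "must differ", {tag})
    if tag[0] == "5" and counterpart in ALL_1XX:
        return ("2.3", "must match", {counterpart})
    if tag[0] == "4" and counterpart in NAME_TAGS:
        return ("2.4", "must differ", ALL_1XX)
    if tag[0] == "4" and counterpart in TOPIC_TAGS:
        return ("2.4", "must differ", {counterpart})
    raise ValueError(f"tag {tag} is not covered by the subject rules")
```

Rule 2.5 needs no entry here because a 4XX is never checked against other 4XX fields. Under this reading, subject_rule("180") returns only {"180"} as its population, so a 150 and a 180 carrying the same string would not conflict.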
Normalization Table
(9/16/98, revised Feb. 09 2001)
(Blank = space)
Character: | Converted Value: | Comments:
LEADING BLANKS | Delete |
TRAILING BLANKS | Delete |
MULTIPLE BLANKS | Blank | Compress to a single blank.
LETTERS | | Convert all letters to the same case (upper or lower).
DIACRITICS | Delete | Miagkiy znak, alif, ayn, tverdyi znak, pseudo question mark, grave, acute, circumflex, tilde, macron, breve, superior dot, umlaut or dieresis, hacek, angstrom, ligature, high comma off center, double acute, candrabindu, cedilla, right hook, dot below character, double dot below character, circle below character, double underscore, underscore, left hook, right cedilla, double tilde, high comma centered, middle dot, upadhmaniya
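The blank, case, and diacritic conversions above can be approximated for Unicode input as follows. This is a rough sketch under stated assumptions: the table is written in terms of the MARC character set, whereas the hypothetical fold_basic function below assumes the heading has already been converted to Unicode and treats "diacritic" as "Unicode nonspacing mark". Spacing modifier characters in the list (for example miagkiy znak, alif, and ayn) are not nonspacing marks and would need an explicit deletion list that is not shown here.

```python
import unicodedata


def fold_basic(text):
    """Delete combining diacritics, fold case, and compress blanks."""
    # Decompose so that each diacritic becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the combining diacritics.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # split() drops leading/trailing blanks and collapses runs of blanks.
    return " ".join(stripped.upper().split())
```

Upper case is chosen arbitrarily; the table allows either case as long as it is applied consistently to both sides of the comparison.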
Translated Characters
Character: | Converted Value: | Comments:
SUPERSCRIPT NUMBERS | Numbers | Convert to non-superscript equivalent.
SUBSCRIPT NUMBERS | Numbers | Convert to non-subscript equivalent.
AE DIGRAPH | "AE" |
OE DIGRAPH | "OE" |
CROSSED D | "D" |
ETH | "D" |
TURKISH I | "I" |
POLISH L | "L" |
SCRIPT L | "L" |
HOOKED O | "O" |
HOOKED U | "U" |
SLASHED O | "O" |
ICELANDIC THORN | "TH" |
ALPHA | "A" |
BETA | "B" |
GAMMA | "G" |
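The translated characters lend themselves to a simple lookup table. The sketch below assumes Unicode input and maps each table entry to the Unicode code points that seem to correspond to it; those code point choices, and the name translate_characters, are assumptions made for illustration. Only the lowercase Greek letters are mapped, on the assumption that the table refers to the Greek symbol characters.

```python
# Replacement strings keyed by (assumed) Unicode equivalents of the table entries.
_TRANSLATED = {
    "\u00c6": "AE", "\u00e6": "AE",  # AE digraph
    "\u0152": "OE", "\u0153": "OE",  # OE digraph
    "\u0110": "D", "\u0111": "D",    # crossed D
    "\u00d0": "D", "\u00f0": "D",    # eth
    "\u0131": "I",                   # Turkish (dotless) i
    "\u0141": "L", "\u0142": "L",    # Polish L
    "\u2113": "L",                   # script l
    "\u01a0": "O", "\u01a1": "O",    # hooked O
    "\u01af": "U", "\u01b0": "U",    # hooked U
    "\u00d8": "O", "\u00f8": "O",    # slashed O
    "\u00de": "TH", "\u00fe": "TH",  # Icelandic thorn
    "\u03b1": "A", "\u03b2": "B", "\u03b3": "G",  # alpha, beta, gamma
}
_SUPERSCRIPTS = "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
for digit, superscript in enumerate(_SUPERSCRIPTS):
    _TRANSLATED[superscript] = str(digit)
    _TRANSLATED[chr(0x2080 + digit)] = str(digit)  # subscript digits are contiguous

_TABLE = str.maketrans(_TRANSLATED)


def translate_characters(text):
    """Apply the character translations, then fold to upper case."""
    return text.translate(_TABLE).upper()
```

The translation runs before case folding so that characters without an uppercase form, such as the script l, are still caught.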
Punctuation
(Blank = space)
Character: | Converted Value: | Comments:
! | Blank |
" | Blank |
APOSTROPHE ' | Delete |
LEFT PAREN. ( | Blank |
RIGHT PAREN. ) | Blank |
HYPHEN - | Blank |
LEFT BRACKET [ | Delete |
RGT. BRACKET ] | Delete |
LEFT BRACE { (Opening curly bracket) | Blank |
RIGHT BRACE } (Closing curly bracket) | Blank |
LEFT ANGLE BRACKET < | Blank |
RIGHT ANGLE BRACKET > | Blank |
SEMICOLON ; | Blank |
COLON : | Blank |
PERIOD . | Blank |
QUESTION MARK ? | Blank |
INVERTED QUESTION MARK ¿ | Blank |
INVERTED EXCLAMATION MARK ¡ | Blank |
COMMA , | "," or Blank | The first comma in the $a subfield (unless it is a terminal comma) is taken into account. In all other cases commas are converted to blank.
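The comma rule is the one entry in the punctuation table that depends on position, so a short sketch may help. It is a simplified illustration: normalize_punctuation and its is_subfield_a parameter are hypothetical names, "taken into account" is read as "retained as a comma", and a terminal comma is taken to mean a comma that is the last non-blank character of the subfield.

```python
DELETE = set("'[]")
TO_BLANK = set('!"(){}<>;:.?-\u00bf\u00a1')  # includes inverted ? and !


def normalize_punctuation(subfield, is_subfield_a=True):
    """Apply the punctuation table to the text of one subfield."""
    text = subfield.rstrip()
    first = text.find(",")
    # Keep the first comma only in $a, and only if it is not terminal.
    keep = first if is_subfield_a and first not in (-1, len(text) - 1) else -1
    out = []
    for index, ch in enumerate(text):
        if ch in DELETE:
            continue
        if ch == ",":
            out.append("," if index == keep else " ")
        elif ch in TO_BLANK:
            out.append(" ")
        else:
            out.append(ch)
    # Blanks introduced above are compressed, and leading/trailing ones dropped.
    return " ".join("".join(out).split())
```

Retaining the first comma keeps an inverted personal name such as "Smith, John" distinct from a direct-order string such as "Smith John" after normalization.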
Other Numbers and Special Characters
Character: | Converted Value: | Comments:
DATES | Numbers | (Retain unchanged.)
ROMAN NUMERALS | Letters | (Retain unchanged.)
NUMBERS | Numbers | (Retain unchanged.)
FLAT SIGN | flat sign | (Retain unchanged.)
SHARP SIGN/HATCH # | # | (Retain unchanged.)
SLASH / | Blank |
BACKWARD SLASH \ | Blank |
AT SIGN @ | Blank |
AMPERSAND & | & | Normalization changed from blank to "&" in 12/1990.
ASTERISK * | Blank |
VERTICAL BAR (|) | Delete |
PERCENT % | Blank |
EQUALS = | Blank |
PLUS + | + | Normalization changed from blank to "+" in 9/1998.
PLUS AND MINUS | Blank |
LOGICAL "OR" | Blank |
LOGICAL "NOT" | Blank |
SUPERSCRIPT + - ( ) | Blank |
SUBSCRIPT + - ( ) | Blank |
PATENT SIGN ® | Blank |
PHONORECORD SYMBOL | Blank |
COPYRIGHT SYMBOL © | Blank |
DOLLAR SIGN $ | Blank |
BRITISH POUND SIGN £ | Blank |
DEGREE SIGN ° | Blank |
SPACING CIRCUMFLEX ^ | Blank |
SPACING UNDERSCORE _ | Blank |
SPACING GRAVE | Blank |
SPACING TILDE ~ | Blank |
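The special-character entries fall into three outcomes: retained, deleted, or converted to blank. A minimal sketch of that split follows. The function name normalize_symbols and the Unicode code points chosen for the non-ASCII entries are assumptions; the logical "or" sign is left out because its Unicode equivalent is not obvious from the table.

```python
# Retained unchanged: ampersand, plus, sharp/hatch, and the flat sign (U+266D).
RETAINED = set("&+#\u266d")
DELETED = {"|"}
BLANKED = (
    set("/\\@*%=$^_`~")
    # plus-minus, logical not, patent sign, copyright, pound, degree, phonorecord
    | set("\u00b1\u00ac\u00ae\u00a9\u00a3\u00b0\u2117")
    # superscript and subscript + - ( )
    | set("\u207a\u207b\u207d\u207e\u208a\u208b\u208d\u208e")
)


def normalize_symbols(text):
    """Delete, blank, or retain special characters, then compress blanks."""
    out = []
    for ch in text:
        if ch in DELETED:
            continue
        out.append(" " if ch in BLANKED else ch)
    return " ".join("".join(out).split())
```

RETAINED is listed only for documentation; any character that is neither deleted nor blanked, including digits and roman numerals, passes through unchanged.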
Content Designation
Character: | Converted Value: | Comments:
INDICATORS | Delete |
SUBFIELD DELIMITERS | Hex 1F | Subfield delimiters are retained, except for the first delimiter, which precedes the data string and is dropped.
SUBFIELD CODES | Delete |
TAG | | The tag is used to point to the population against which a normalized string is to be compared for a uniqueness edit (see Authority File Comparison Rules).
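The content designation rows can be illustrated on the raw data of a single variable field. The sketch assumes the usual MARC 21 layout of two indicator characters followed by subfields, each introduced by hex 1F and a one-character subfield code; strip_content_designation is a hypothetical name.

```python
DELIMITER = "\x1f"  # MARC subfield delimiter, hex 1F


def strip_content_designation(raw_field):
    """Drop indicators and subfield codes; keep delimiters between subfields."""
    chunks = raw_field.split(DELIMITER)
    # chunks[0] holds the indicators; every later chunk is code + data.
    data = [chunk[1:] for chunk in chunks[1:]]
    # Joining re-inserts a delimiter between subfields but not before the first.
    return DELIMITER.join(data)
```

Keeping the interior delimiters means that two headings with the same words but different subfield boundaries still normalize to different strings.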
(prepared by LC Network Development and MARC Standards Office)