Date:         Thu, 8 Dec 2005 17:09:31 -0500
Reply-To:     UNICODE-MARC Discussion List <[log in to unmask]>
Sender:       UNICODE-MARC Discussion List <[log in to unmask]>
From:         Daniel Lovins <[log in to unmask]>
Subject:      Re: MARC Filing and Unicode Exclusion
Comments: To: [log in to unmask]
Comments: cc: [log in to unmask], charles Riley <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="us-ascii"; format=flowed

Dear group,

I have a question about whether certain Unicode characters--in my case, certain Hebrew ones--are represented by double bytes in UTF-8, and if so, whether this would explain the following situation:

While testing the Unicode release of Endeavor Voyager, a member of my team, Jerry Anne Dickel, found that Hebrew script titles beginning with definite (and for Yiddish, also indefinite) articles no longer indexed properly. The titles were failing to show up in browse displays.

Jerry Anne was able to implicate the second indicator of the 245 field (= number of non-filing characters) in this. Ordinarily, with the Hebrew article "ha" [a one-character prefix], the second indicator of the 245 would be 1, but it was only when Jerry Anne changed it to a 2 that the title once again indexed correctly. The same thing happened with the Yiddish definite article "der" (3 letters plus a space, as in the Latin script), where the numeral 4 (representing the three letters plus space) would normally be used in the second indicator; in Voyager Unicode, however, the title would only index if the 4 were replaced by a 7 (i.e., doubling the Hebrew characters (3x2) but not the space).

We replicated the problem in LC's Unicode-compliant Voyager and in OCLC WorldCat. Interestingly, there did not seem to be a problem in RLIN21. Did RLG anticipate (what I'm assuming is) the doubled bytes and apply a fix? Alternatively, do you think it might be something other than byte number that's causing the problem?

Thank you very much for your help.

Daniel

------------------------------------
Daniel Lovins
Hebraica Team Leader
Catalog Department
Sterling Memorial Library
Yale University
PO Box 208240
New Haven, CT 06520
tel: 203/432-1707
fax: 203/432-7231
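
To illustrate the arithmetic in the message above, here is a minimal Python sketch that compares the character count of each article with its UTF-8 byte count. The function name `nonfiling_counts` and the two constants are hypothetical, chosen only for this illustration. The sketch shows what the counts would be if a system skipped bytes instead of characters; it does not establish that this is what Voyager or WorldCat actually do.

```python
def nonfiling_counts(prefix: str) -> tuple[int, int]:
    """Return (character count, UTF-8 byte count) for a non-filing prefix."""
    return len(prefix), len(prefix.encode("utf-8"))


# Hebrew article "ha": the single letter he (U+05D4).
HA = "\u05d4"

# Yiddish article "der": dalet (U+05D3), ayin (U+05E2), resh (U+05E8),
# followed by an ASCII space.
DER = "\u05d3\u05e2\u05e8 "

for name, prefix in (("ha", HA), ("der", DER)):
    chars, utf8_bytes = nonfiling_counts(prefix)
    print(f"{name}: {chars} character(s), {utf8_bytes} UTF-8 byte(s)")
```

Each Hebrew letter in the U+0590 to U+05FF block takes two bytes in UTF-8, while the ASCII space takes one. The byte counts therefore come out as 2 and 7, which match the indicator values that made the titles index again.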

