Library of Congress Online Catalog: Help Pages

Displaying and Searching Non-Roman Characters in the Online Catalog (Unicode)

Bibliographic records for items in most languages will display correctly without changing any settings in your preferred Web browser. The instructions below may be helpful if you want to view records containing non-Roman characters (i.e., Unicode characters) in any of the JACKPHY languages (Japanese, Arabic, Chinese, Korean, Persian, Hebrew, or Yiddish) and Russian.*

View Special Help Screens for the following languages:
Japanese - Chinese - Korean - Arabic - Persian - Hebrew/Yiddish

* Note that there are no Cyrillic characters in the LC Online Catalog. All records in languages that use Cyrillic are romanized and can only be retrieved using Roman (Latin) characters in searches. The Library uses the Russian keyboard to search other electronic resources.
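Because records in languages that use Cyrillic can only be retrieved with Roman (Latin) characters, it can help to check a search term for Cyrillic letters before submitting it. The following is a minimal sketch in Python, not part of the Online Catalog; the function name `contains_cyrillic` is hypothetical and the check relies only on the standard Unicode character names.

```python
import unicodedata

def contains_cyrillic(query):
    """Return True if any character in the query is a Cyrillic letter.

    Hypothetical helper: it looks at each character's official Unicode
    name, which begins with "CYRILLIC" for letters in that script.
    """
    return any(unicodedata.name(ch, "").startswith("CYRILLIC") for ch in query)

# A romanized term passes; the same name typed in Cyrillic is flagged.
print(contains_cyrillic("Tolstoi"))
print(contains_cyrillic("\u0422\u043e\u043b\u0441\u0442\u043e\u0439"))
```

A term flagged by this check would need to be retyped in its romanized form before searching the catalog.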

Special Note on Printing, Saving and Emailing: It is possible to successfully Print or Save non-Roman characters using the "Save, Print or Email Records" function of the Online Catalog (found at the bottom of search results and single record displays). Best results occur if you have Unicode fonts and use Unicode character encoding with automatic character encoding activated.

When printing or saving records, please make sure your browser's character encoding (on the View pulldown menu) is set to:

  • Text (Brief Information): Unicode (UTF-8)
  • Text (Full Information): Unicode (UTF-8)
  • MARC (non-Unicode/MARC-8): Western or Western European
  • MARC (Unicode/UTF-8): Unicode (UTF-8)
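As a rough illustration of why the encoding setting matters, the Python sketch below shows what happens when UTF-8 text is read as if it were a Western encoding. It assumes that "Western" corresponds to Windows-1252, which is typical for browsers but is an assumption here, and the function name `misread_as_western` is hypothetical.

```python
def misread_as_western(text):
    """Encode text as UTF-8, then decode those bytes as Windows-1252.

    This mimics viewing a UTF-8 record with the encoding set to "Western".
    Bytes that Windows-1252 does not define are shown as a replacement mark.
    """
    return text.encode("utf-8").decode("cp1252", errors="replace")

# Plain ASCII survives unchanged; an accented letter turns into two
# unrelated characters because UTF-8 stores it as two bytes.
print(misread_as_western("Library"))
print(misread_as_western("caf\u00e9"))
```

Unaccented Roman text looks the same under either setting, which is why a wrong encoding often goes unnoticed until a record contains diacritics or non-Roman characters.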

At this time, it is not possible to Email records containing non-Roman characters or words including any diacritic marks.

CJK Compatibility Database

Use the CJK Compatibility Database to quickly and conveniently replace non-MARC21 characters with MARC21 equivalents, or a missing character symbol.
- Link to the CJK Compatibility database
- More about the CJK Compatibility database

Instructions for Windows

Installing the Unicode Font in Windows XP

If you are using Microsoft Windows XP, the "universal font" for Unicode should be automatically installed.

Arial Unicode MS font is a "full Unicode font" -- it contains all of the characters, ideographs, and symbols defined in the Unicode 2.1 standard.*

* Unicode is a character encoding standard developed by the Unicode Consortium. By using more than one byte to represent each character, Unicode enables almost all of the written languages in the world to be represented by using a single character set.
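To make the byte counts concrete, here is a small Python sketch that reports how many bytes each character occupies in UTF-8, the Unicode encoding the catalog uses for display. The function name `utf8_byte_lengths` and the sample characters are illustrative choices, not taken from the catalog. Note that in UTF-8 the basic Roman letters still take a single byte, while other characters take two or more.

```python
def utf8_byte_lengths(text):
    """Return a list of (character, number of UTF-8 bytes) pairs."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]

# One sample character each: Latin A, e with acute accent, Hebrew alef,
# Arabic ain, a CJK ideograph (U+65E5), and a Korean Hangul syllable (U+D55C).
samples = "A\u00e9\u05d0\u0639\u65e5\ud55c"

for ch, size in utf8_byte_lengths(samples):
    print(f"U+{ord(ch):04X} uses {size} byte(s) in UTF-8")
```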

If the "universal font" is not visible (i.e., you cannot see non-Roman characters), you will need to set the Latin-based font to Arial Unicode MS. If the font is not available on your system, you may need to install it; it is called "Universal Font" on the installation disks.

Installing the Unicode Font in Windows 2000

To display non-Roman characters:

  1. Display the Windows Start menu
  2. Select Settings > Control Panel > Regional Options
  3. Within the General tab, check all of the languages you may want to display;
    * the more you set, the more you will be able to process multilingual data through all your applications, including your browser. This adds fonts as well as system support for these languages.
  4. Select OK to accept your selections
  5. You will have to reboot your system for the changes to take effect.

Screen: Regional Options

To install the "universal" font:

If you have Microsoft Office 2000 or a newer version, you can install the Arial Unicode MS font, which supports display of most non-Roman characters. If you don't already have this font:

  1. Insert the Office CD, and select "custom install."
  2. Choose Add or Remove Features.
  3. Click the (+) next to Office Tools, then International Support, then the Universal Font icon, and choose the installation option you want.

More Information for Windows 2000 Users:

http://office.microsoft.com/en-us/assistance/HP052558401033.aspx

Displaying Non-Roman Characters in Web Browsers

Setting up the browser to display non-Roman characters is a 2-step process. Begin by setting the default font to Arial Unicode MS:

STEP 1:

To set the font in Internet Explorer, from the Tools pulldown menu:

  1. Select Internet Options --> Fonts
  2. Select "Latin based" from the Language script: menu
  3. Select "Arial Unicode MS" from the Web page font: menu
  4. Select "OK" to save the change.

Screen: Set Arial Unicode font

To set the font in Firefox, from the Tools pulldown menu:

  1. Select Options --> General
  2. Select the button for Fonts & Colors
  3. Select "Western" in the Fonts for: drop-down menu
  4. Select "Arial Unicode MS" in the Sans-serif: drop-down menu.
  5. Select "OK" to save the change.

Screen: Set font to Unicode

STEP 2:

When viewing catalog records, you will also need to make sure that the character encoding for the page you are looking at is set to Unicode (UTF-8). Often, the browser sets the encoding automatically. Sometimes you will have to choose the setting yourself, and the setting doesn't always "stick" (so you may have to reset it).

To set UTF-8 encoding in Internet Explorer... from the View pulldown menu
   select Encoding > Unicode (UTF-8)

Screen: Set Encoding to UTF-8 in IE

Also, make sure that Auto-Select is marked. Please note, if "Unicode (UTF-8)" is not currently displaying in your Encoding menu, select "More" to find it.

To set UTF-8 encoding in Firefox... from the View pulldown menu
   select Character Encoding -->Unicode (UTF-8).

Screen: Set Encoding to UTF-8 in Firefox

In this case, you can set the Auto-Detect option to "Universal." Please note, if "Unicode (UTF-8)" is not currently displaying in your Character Encoding menu, select "More Encodings" to find it.
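When a browser sets the encoding automatically, one of the things it commonly consults is the charset named in the page's Content-Type information. The Python sketch below reads that value from a Content-Type string using only the standard library. The function name `declared_charset` and the sample values are hypothetical, and this does not describe how any particular browser or the catalog itself is implemented.

```python
from email.message import Message

def declared_charset(content_type):
    """Return the charset named in a Content-Type value, or None if absent.

    The standard library parser lowercases the charset name it finds.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

# A page that names its encoding, and one that leaves it unstated.
print(declared_charset("text/html; charset=UTF-8"))
print(declared_charset("text/html"))
```

When no charset is declared, the browser has to guess or fall back on its default, which is one situation where choosing Unicode (UTF-8) by hand becomes necessary.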

Installing IMEs for Entering Non-Roman Characters

If you want to search using non-Roman characters, you will need to install the appropriate input keyboard layouts for the languages you wish to search in. These keyboard layouts are called Input Method Editors (IMEs).

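Adding them uses the same Regional Options settings described above for displaying non-Roman characters, so you may have to reboot your system for the changes to take effect. In order to install them, go to: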

Start Menu > Settings > Control Panel > Regional Options (General)

Then put a check mark next to all of the languages you want to use.

The Library of Congress currently uses the following IMEs:

Arabic (Egypt) -- Arabic (101)
Chinese (PRC) -- Chinese (Simplified) - US Keyboard
Chinese (Taiwan) -- Chinese (Traditional) - US Keyboard
Hebrew -- Hebrew
Japanese -- Japanese Input System (MS-IME2000)
Korean -- Korean (Hangul) (MS-IME-98)
Russian*

* Note that there are no Cyrillic characters in the LC Online Catalog. All records in languages that use Cyrillic are romanized and can only be retrieved using Roman (Latin) characters in searches. The Library uses the Russian keyboard to search other electronic resources.

For more information on IMEs, please see:


Instructions for Macintosh

Your operating system should be OS 10.3.x or higher. Additionally, use the most current version of your browser; check the Web to be sure there are no updates available. As of May 2005 this includes Firefox 1.0, Safari 1.3, and Netscape 7.0. None will work perfectly for all purposes, but all three are adequate for most purposes.

Be sure the Lucida Grande font is installed (a default font within OS X).

Screen: Font Book in Mac OS X

When viewing catalog records, make sure that the character encoding for the page you are looking at is set to Unicode (UTF-8). Often, the browser sets the encoding automatically. You may occasionally have to choose the setting yourself.

In Firefox - From the View pulldown menu
   Select Character Encoding -->Unicode (UTF-8)

In Safari - From the View pulldown menu
   Select Text Encoding-->Unicode (UTF-8)

In Netscape - From the View pulldown menu
    Select Character Coding -->Unicode (UTF-8)

Support for bidirectional scripts in Mac OS X browsers depends upon the functionality provided by the version of the browser you use.

Copying and pasting catalog records with special diacritics, characters, or scripts into other applications depends on the ability of those applications to handle Unicode and bidirectional scripts.
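To check whether a piece of copied text actually contains right-to-left characters (as Hebrew and Arabic script do), the following Python sketch inspects each character's Unicode bidirectional class. The function name `has_right_to_left` is hypothetical, and the sketch only detects such characters; it does not reorder or display them.

```python
import unicodedata

# Unicode bidirectional classes for strongly right-to-left characters:
# "R" covers Hebrew letters, "AL" covers Arabic-script letters.
RTL_CLASSES = {"R", "AL"}

def has_right_to_left(text):
    """Return True if the text contains any strongly right-to-left character."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)

print(has_right_to_left("Library of Congress"))
print(has_right_to_left("\u05e9\u05dc\u05d5\u05dd"))  # a Hebrew word
print(has_right_to_left("\u0643\u062a\u0627\u0628"))  # an Arabic word
```

Text flagged by this check is the kind most likely to display in the wrong order in an application without bidirectional support.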

Input Methods for Non-Roman Characters

Mac OS X has built-in support for inputting non-Roman characters. To enable these features, go to System Preferences and select "International":

Screen: System Preferences for Mac OS X

The following screen displays. Select the "Input Menu" tab. Select input methods for languages you need. Make sure to check the box to "Show input menu in menu bar."

Screen: International options for Mac OS X

In order to use a language-specific input method, look for the "flag" icon in the menu bar for the application you are using:

Screen: Flag Icon for Mac OS X

More on Non-Roman Characters on Macintosh Computers

This information comes from the Unicode Web site:
http://www.unicode.org/help/display_problems.html

On Mac OS X, the Safari Web browser includes Unicode support, as does OmniWeb. OmniWeb does not currently provide support for all of Unicode, although it can take advantage of Unicode fonts for Windows if they are properly installed.

Earlier Versions

There are currently no Web browsers which provide direct Unicode drawing (font support) on the Mac OS 9.x or earlier. All the browsers use Apple Language Kits and WorldScript to varying degrees to support Unicode and international text.

Language Kits are installed using your Mac OS 9.x installation CD. Launch the Mac OS Install application. Proceed through the initial screens, selecting the appropriate boot disk. When you reach the "Install Software" screen, click on the "Customize" button. This opens up the Custom Installation and Removal dialog box.

Scroll down to "Language Kits." Click on the check box, and then select "Customized Installation" from the installation popup to the right. (It will say "None selected" at first.)

This brings up a dialog box with a list of all the available language kits. Select the ones you want, or use the menu at the top of the dialog box to select all of them. Proceed with the installation.

If you already have Mac OS 9.0 installed, you will be asked if you want to add or remove software after you select the installation disk. Click on the "Add/Remove" button. This will bring you to the Custom Installation and Removal dialog box.

The installation procedure is the same for Mac OS 8.6, except that you will be installing "Multilingual Internet Access" instead of Language Kits. For Mac OS 8.5.1 and earlier, it will be necessary to purchase the individual language kits.

Library of Congress Online Catalog - catalog.loc.gov
Library of Congress Home Page
May 20, 2008