Displaying and Searching Non-Roman
Characters in the Online Catalog (Unicode)
Bibliographic records for items in most languages will display
correctly without changing any settings in your preferred browser.
The instructions below may be helpful if you want to view records
containing non-Roman characters (i.e., Unicode characters) in any
of the JACKPHY languages (Japanese, Arabic, Chinese, Korean,
Persian, Hebrew, or Yiddish) and Russian.*
View Special Help Screens for the following languages:
Chinese - Korean - Arabic - Persian - Hebrew/Yiddish
* Note that there are only a small number of records with Cyrillic characters in the LC Online Catalog. All records in languages that use Cyrillic are romanized and can only be retrieved using Roman (Latin) characters in searches. The Library uses the Russian keyboard to search other electronic resources.
Special Note on Printing, Saving and Emailing: It is possible to
print or save non-Roman characters using the "Save, Print or Email Records" function of the Online Catalog (found at the bottom of search results and single
record displays). Best results occur if you have Unicode fonts
and use Unicode character encoding with automatic character encoding
When printing or saving records, please make sure your browser's character encoding
(on the View pulldown menu) is set as follows for each save format:
- Text (Brief Information): Unicode (UTF-8)
- Text (Full Information): Unicode (UTF-8)
- MARC (non-Unicode/MARC-8): Western
- MARC (Unicode/UTF-8): Unicode (UTF-8)
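As a rough illustration of why the encoding must match the save format, the sketch below decodes the bytes of a saved record with the encoding paired with its format in the list above. The function name and the lookup table are hypothetical and are not part of the Online Catalog. The MARC-8 format is left out because Python's standard library has no MARC-8 codec.

```python
# Hypothetical lookup: save format (as named in the list above) -> encoding
# to use when reading the saved file back. MARC-8 is omitted on purpose:
# the Python standard library does not ship a MARC-8 codec.
SAVE_FORMAT_ENCODING = {
    "Text (Brief Information)": "utf-8",
    "Text (Full Information)": "utf-8",
    "MARC (Unicode/UTF-8)": "utf-8",
}


def decode_saved_record(raw: bytes, save_format: str) -> str:
    """Decode the raw bytes of a saved record using the matching encoding.

    Raises KeyError for an unknown format and UnicodeDecodeError if the
    bytes are not valid in the expected encoding.
    """
    encoding = SAVE_FORMAT_ENCODING[save_format]
    return raw.decode(encoding)


if __name__ == "__main__":
    # Stand-in for the bytes of a saved record containing Chinese characters.
    raw = "\u5317\u4eac".encode("utf-8")
    print(decode_saved_record(raw, "Text (Full Information)"))
```

A file saved with one encoding and opened with another will either fail to decode or show the wrong characters, which is the situation the settings above are meant to avoid.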
At this time, it is not possible to email records
containing non-Roman characters or words including any diacritic marks.
CJK Compatibility Database
Use the CJK Compatibility Database to quickly
and conveniently replace non-MARC21 characters with MARC21 equivalents,
or a missing character symbol.
Instructions for Windows
Installing the Unicode Font in Windows XP
If you are using Microsoft Windows XP, the "universal
font" for Unicode should be automatically installed.
The Arial Unicode MS font is a "full Unicode font"
-- it contains all of the characters, ideographs, and symbols defined
in the Unicode 2.1 standard.*
* Unicode is a character
encoding standard developed by the Unicode Consortium. By using
more than one byte to represent each character, Unicode enables
almost all of the written languages in the world to be represented
by using a single character set.
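To make the "more than one byte" point concrete, here is a minimal sketch that counts how many bytes UTF-8 (one common Unicode encoding) uses for a few sample characters. The helper name and the sample characters are chosen only for illustration.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


# Sample characters, written as escapes so the file itself stays ASCII.
SAMPLES = {
    "Latin capital A": "\u0041",
    "Latin e with acute": "\u00e9",
    "Hebrew alef": "\u05d0",
    "CJK ideograph (zhong)": "\u4e2d",
    "Hangul syllable (han)": "\ud55c",
}

if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        print(f"U+{ord(ch):04X} {name}: {utf8_length(ch)} byte(s)")
```

Plain Roman letters take one byte, accented Latin and Hebrew letters take two, and the Chinese and Korean characters take three.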
If the "universal font" is not visible (i.e., you cannot see non-Roman
characters), you will need to set the Latin-based font to Arial
Unicode MS (if it is not available on your system, you
may need to install it -- it will be called "Universal Font" on
the installation disks.)
Installing the Unicode Font in Windows 2000
To display non-Roman characters:
- Display the Windows Start menu
- Select Settings > Control Panel > Regional Options
- Within the General tab, check
all of the languages you may want to display;
the more languages you select, the more multilingual
data you will be able to process
through all your applications, including your browser. This
adds fonts as well as system support for these languages.
- Select OK to accept your selections
- You will have to reboot your system for the changes to take effect.
To install the "universal" font:
If you have Microsoft Office 2000 and newer versions,
you can install the Arial Unicode MS font, which supports display of most of the non-Roman characters.
If you don't already have this font:
- Insert the Office CD, and select "custom install."
- Choose Add or Remove Features.
- Click the (+) next to Office Tools, then
International Support, then the Universal Font icon, and
choose the installation option you want.
More Information for Windows 2000 Users:
Displaying Non-Roman Characters in Web Browsers
Setting up the browser to display non-Roman characters is a 2-step
process. Begin by setting the default font to Arial Unicode MS.
To set the font in Internet Explorer, from the Tools pulldown menu:
- Select Internet Options --> Fonts
- Select "Latin based" from the Language script: list
- Select "Arial Unicode MS" from the Web page font: list
- Select "OK" to save the change.
To set the font in Firefox, from the Tools pulldown menu:
- Select Options --> General
- Select the button for Fonts & Colors
- Select "Western" in the Fonts for: drop-down menu.
- Select "Arial Unicode MS" in the Sans-serif: drop-down menu.
- Select "OK" to save the change.
When viewing catalog records, you will also need to make
sure that the character encoding for the page you are looking at
is set to Unicode (UTF-8). Often, the browser
sets the encoding automatically. You may
also have to choose the setting yourself, and this setting doesn't
always "stick" (so you may have to reset it).
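The effect of a wrong encoding setting can be sketched in a few lines. The example below takes UTF-8 bytes and reads them as Latin-1, used here as an assumed stand-in for a browser's "Western" setting, which turns two Chinese characters into six unrelated symbols. Re-reading the same bytes as UTF-8 restores the original text.

```python
def misread_as_western(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as Latin-1.

    Latin-1 is an assumed stand-in for a browser's "Western" encoding;
    every byte maps to some character, so this never raises an error,
    it just produces the wrong characters.
    """
    return text.encode("utf-8").decode("latin-1")


def reread_as_utf8(garbled: str) -> str:
    """Undo misread_as_western by recovering the bytes and decoding as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    title = "\u5317\u4eac"  # two Chinese characters
    garbled = misread_as_western(title)
    print(len(title), len(garbled))          # character counts differ
    print(reread_as_utf8(garbled) == title)  # switching back to UTF-8 recovers the text
```

This is why resetting the page encoding to Unicode (UTF-8) fixes a garbled display without reloading different data: the bytes are the same, only their interpretation changes.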
To set UTF-8 encoding in Internet Explorer, from the View pulldown menu
select Encoding > Unicode (UTF-8).
Also, make sure that Auto-Select is
unmarked. Please note, if "Unicode (UTF-8)" is not currently displaying
in your Encoding menu, select "More" to find it.
To set UTF-8 encoding in Firefox, from the View pulldown menu
select Character Encoding --> Unicode (UTF-8).
In this case, you can set the Auto-Detect option to "Universal." Please
note, if "Unicode
(UTF-8)" is not currently displaying in your Character
Installing IMEs for Entering Non-Roman Characters
If you want to search using non-Roman characters, you
will need to install the appropriate input keyboard layouts for
the languages you wish to search in. These keyboard layouts are called Input
Method Editors (IMEs).
No rebooting or special privileges are needed to add them to your
system. In order to install them, go to:
Start Menu > Settings > Control Panel > Regional Options
Then put a check mark next to all of the languages you want to use.
The Library of Congress currently uses the following IMEs:
Arabic (Egypt) -- Arabic (101)
Chinese (PRC) -- Chinese (Simplified) - US Keyboard
Chinese (Taiwan) -- Chinese (Traditional) - US Keyboard
Hebrew -- Hebrew
Japanese -- Japanese Input System (MS-IME2000)
Korean -- Korean (Hangul) (MS-IME-98)
* Note that there are no Cyrillic characters in the LC Online
Catalog. All records in languages that use Cyrillic are romanized
and can only be retrieved using Roman (Latin) characters in searches.
The Library uses the Russian keyboard to search other electronic resources.
For more information on IMEs, please see:
Instructions for Macintosh
Your operating system should be OS 10.3.x or higher. Additionally,
the most current version of your browser should be used; check
the Web to be sure there are no updates available. As of May
2005 this includes Firefox 5, Safari 1.3, and Netscape. None of these
will work perfectly for all purposes, but all three are adequate
for most purposes.
Be sure the Lucida Grande font is installed (a default font within Mac OS X).
When viewing catalog records, make sure that the character
encoding for the page you are looking at is set to Unicode (UTF-8).
Often, the browser sets the encoding automatically. You may occasionally
have to choose the setting yourself.
In Firefox - From the View pulldown menu
Select Character Encoding --> Unicode (UTF-8)
In Safari - From the View pulldown menu
Select Text Encoding --> Unicode (UTF-8)
In Netscape - From the View pulldown menu
Select Character Coding --> Unicode (UTF-8)
Support for bidirectional scripts in Mac OS X browsers depends
upon the functionality provided by the version of the browser you are using.
Copying and pasting of catalog records with special diacritics,
characters, or scripts into other applications depends on the ability
of those applications to handle Unicode and bidirectional scripts.
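For readers curious about what "bidirectional" means at the character level, the sketch below uses Python's standard unicodedata module to look up each character's Unicode bidirectional class and to flag text that contains right-to-left letters. The function name is hypothetical, and this is only a way to spot such text, not a description of how any browser lays it out.

```python
import unicodedata

# Bidirectional classes for strongly right-to-left characters:
# "R" covers Hebrew letters, "AL" covers Arabic letters.
RTL_CLASSES = {"R", "AL"}


def contains_right_to_left(text: str) -> bool:
    """Return True if any character in text is a right-to-left letter."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


if __name__ == "__main__":
    print(unicodedata.bidirectional("A"))       # Latin letter
    print(unicodedata.bidirectional("\u05d0"))  # Hebrew alef
    print(unicodedata.bidirectional("\u0627"))  # Arabic alef
    print(contains_right_to_left("Title: \u05e9\u05dc\u05d5\u05dd"))
```

A record that mixes Roman text with Hebrew or Arabic contains characters of both directions, which is why the receiving application needs bidirectional support to display a pasted record in the right order.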
Input Methods for Non-Roman Characters
Built into Mac OS X is support for inputting non-Roman
characters. To enable these features, go to System Preferences
and select "International."
In the screen that displays, select the "Input Menu" tab. Select
input methods for the languages you need. Make sure to check the box
to "Show input menu in menu bar."
In order to use a language-specific input method, look for the
"flag" icon in the menu bar for the application you are using.
More on Non-Roman Characters on Macintosh Computers
This information comes from the Unicode Web site:
On Mac OS X, the Safari Web browser includes Unicode
support as does OmniWeb. OmniWeb,
however, does not currently provide support for all of Unicode
(it can, however,
take advantage of Unicode fonts for Windows if properly installed).
There are currently no Web browsers which provide direct Unicode
drawing (font support) on the Mac OS 9.x or earlier. All the browsers
use Apple Language Kits and WorldScript to varying degrees to support
multilingual and international text.
Language Kits are installed using your Mac OS 9.x installation
CD. Launch the Mac OS Install application. Proceed through the
initial screens, selecting the appropriate boot disk. When you
reach the "Install Software" screen, click on the "Customize" button.
This opens up the Custom Installation and Removal dialog box.
Scroll down to "Language Kits." Click on the check box,
and then select "Customized Installation" from the installation
popup to the right. (It will say "None selected" at first.)
This brings up a dialog box with a list of all the available language
kits. Select the ones you want, or use the menu at the top of the
dialog box to select all of them. Proceed with the installation.
If you already have Mac OS 9.0 installed, you will be asked if
you want to add or remove software after you select the installation
disk. Click on the "Add/Remove" button. This will bring
you to the Custom Installation and Removal dialog box.
The installation procedure is the same for Mac OS 8.6, except
that you will be installing "Multilingual Internet Access" instead
of Language Kits. For Mac OS 8.5.5 and earlier, it will be necessary
to purchase the individual language kits.