Displaying and Searching Non-Roman Characters in the Online Catalog (Unicode)
Bibliographic records for items in most languages will display
correctly without changing any settings in your preferred
Web browser.
The instructions below may be helpful if you want to view records
containing non-Roman characters (encoded in Unicode) in any
of the JACKPHY languages (Japanese, Arabic, Chinese, Korean,
Persian, Hebrew, and Yiddish) or in Russian.*
View Special Help Screens for the following languages:
Japanese - Chinese - Korean - Arabic - Persian - Hebrew/Yiddish
* Note that there are only a small number of records with Cyrillic characters in the LC Online Catalog. All records in languages that use Cyrillic are romanized and can only be retrieved using Roman (Latin) characters in searches. The Library uses the Russian keyboard to search other electronic resources.
Special Note on Printing, Saving and Emailing: It is possible to
print or save non-Roman characters using the "Save, Print or Email Records" function of the Online Catalog (found at the bottom of search results and single
record displays). Best results occur if you have Unicode fonts
installed and use Unicode character encoding with automatic character
encoding activated.
When printing or saving records, please make sure your browser's character encoding
(on the View pulldown menu) is set as follows for each record format:
- Text (Brief Information): Unicode (UTF-8)
- Text (Full Information): Unicode (UTF-8)
- MARC (non-Unicode/MARC-8): Western or Western European
- MARC (Unicode/UTF-8): Unicode (UTF-8)
At this time, it is not possible to Email records
containing non-Roman characters or words including any diacritic
marks.
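As a rough way to tell in advance whether a record is likely to be affected by the Email limitation above, the sketch below checks whether a piece of text contains anything outside plain ASCII. The function names are hypothetical, and treating "plain ASCII" as the safe subset is an assumption; the catalog's actual rule is not documented here.

```python
def is_plain_ascii(record_text: str) -> bool:
    """Return True if the text has no non-Roman characters or diacritics.

    Assumption: only plain ASCII text is safe to send with the Email function.
    """
    return record_text.isascii()


def problem_characters(record_text: str) -> list:
    """List the distinct non-ASCII characters found in the text, sorted."""
    return sorted({ch for ch in record_text if not ch.isascii()})


plain_title = "Moby Dick / Herman Melville"
accented_title = "Les Mis\u00e9rables"  # contains e with acute accent

print(is_plain_ascii(plain_title))
print(is_plain_ascii(accented_title))
print(problem_characters(accented_title))
```

Records that fail this check can still be printed or saved using the settings listed above.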
CJK Compatibility Database
Use the CJK Compatibility Database to quickly
and conveniently replace non-MARC21 characters with MARC21 equivalents
or with a missing-character symbol.
- Link to the CJK Compatibility database
- More about the CJK Compatibility database
Instructions for Windows
Installing the Unicode Font in Windows XP
If you are using Microsoft Windows XP, the "universal
font" for Unicode should be automatically installed.
Arial Unicode MS font is a "full Unicode font"
-- it contains all of the characters, ideographs, and symbols defined
in the Unicode 2.1
standard.*
* Unicode is a character
encoding standard developed by the Unicode Consortium. By using
more than one byte to represent each character, Unicode enables
almost all of the written languages in the world to be represented
by using a single character set.
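To make the "more than one byte" point concrete, here is a minimal Python sketch that counts how many bytes the UTF-8 encoding uses for one sample character from several scripts. The sample characters and the helper name `utf8_length` are illustrative choices, not part of the catalog.

```python
# One sample character per script, written as escapes so the file itself
# stays plain ASCII.
samples = {
    "Latin A": "A",
    "Hebrew alef": "\u05d0",
    "Arabic alef": "\u0627",
    "Cyrillic zhe": "\u0436",
    "CJK ideograph": "\u4e2d",
    "Hangul syllable": "\ud55c",
}


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


byte_counts = {label: utf8_length(ch) for label, ch in samples.items()}

for label, count in byte_counts.items():
    print(f"{label}: {count} byte(s)")
```

In UTF-8, the basic Latin letter takes one byte, while the Hebrew, Arabic, and Cyrillic samples take two and the CJK and Hangul samples take three.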
If the "universal font" is not visible (i.e., you cannot see non-Roman
characters), you will need to set the Latin-based font to Arial
Unicode MS. If it is not available on your system, you
may need to install it; it is called "Universal Font" on
the installation disks.
Installing the Unicode Font in Windows 2000
To display non-Roman characters:
- Display the Windows Start menu
- Select Settings > Control Panel > Regional Options
- Within the General tab, check all of the languages you may want
to display. The more you set, the more you will be able to process
multilingual data through all your applications, including your
browser. This adds fonts as well as system support for these languages.
- Select OK to accept your selections
- You will have to reboot your system for the changes to take effect.

To install the "universal" font: If you have Microsoft Office 2000 or a newer version,
you can install the Arial Unicode MS font, which supports display of most non-Roman characters.
If you don't already have this font:
- Insert the Office CD, and select "custom install."
- Choose Add or Remove Features.
- Click the (+) next to Office Tools, then International Support,
then the Universal Font icon, and choose the installation option you want.
More Information for Windows 2000 Users:
http://office.microsoft.com/en-us/assistance/HP052558401033.aspx
Displaying Non-Roman Characters in Web Browsers
Setting up the browser to display non-Roman characters is a two-step
process. Begin by setting the default font to Arial Unicode MS:
STEP 1:
To set the font in Internet Explorer, from the Tools pulldown menu:
- Select Internet Options --> Fonts
- Select "Latin based" from the Language script: menu
- Select "Arial Unicode MS" from the Web page font: menu
- Select "OK" to save the change.

To set the font in Firefox, from the Tools pulldown menu:
- Select Options --> General
- Select the button for Fonts & Colors
- Select "Western" in the Fonts for: drop-down menu
- Select "Arial Unicode MS" in the Sans-serif: drop-down menu
- Select "OK" to save the change.

STEP 2:
When viewing catalog records, you will also need to make
sure that the character encoding for the page you are looking at
is set to Unicode (UTF-8). Often, the browser
sets the encoding automatically. You may
also have to choose the setting yourself, and this setting doesn't
always "stick" (so you may have to reset it).
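The reason the encoding setting matters can be illustrated outside the browser. The Python sketch below takes a short Japanese string, encodes it as UTF-8 (as a catalog page would be sent), and then decodes the same bytes two ways: correctly as UTF-8, and incorrectly as Western (ISO-8859-1). The sample string and variable names are illustrative assumptions only.

```python
title = "\u65e5\u672c\u8a9e"  # three Japanese characters meaning "Japanese language"

# The bytes as they would travel over the network in a UTF-8 page.
raw_bytes = title.encode("utf-8")

# Decoding with the matching encoding recovers the original text.
decoded_as_utf8 = raw_bytes.decode("utf-8")

# Decoding with a Western encoding treats every byte as its own character,
# which produces a run of unrelated accented letters and symbols.
decoded_as_western = raw_bytes.decode("latin-1")

print(len(raw_bytes), "bytes on the wire")
print("UTF-8 view:  ", decoded_as_utf8)
print("Western view:", repr(decoded_as_western))

# Nothing is lost: re-reading the same bytes as UTF-8 restores the text,
# which is what resetting the browser's encoding does.
repaired = decoded_as_western.encode("latin-1").decode("utf-8")
```

This is why a record can look garbled under one encoding setting and display correctly as soon as the setting is switched to Unicode (UTF-8).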
To set UTF-8 encoding in Internet Explorer, from the View pulldown menu
select Encoding > Unicode (UTF-8).

Also, make sure that Auto-Select is
unmarked. Please note: if "Unicode (UTF-8)" is not currently displayed
in your Encoding menu, select "More" to find it.
To set UTF-8 encoding in Firefox, from the View pulldown menu
select Character Encoding --> Unicode (UTF-8).
In this case, you can set the Auto-Detect option to "Universal." Please
note: if "Unicode (UTF-8)" is not currently displayed in your Character
Encoding menu, select "More Encodings" to find it.
Installing IMEs for Entering Non-Roman Characters
If you want to search using non-Roman characters, you
will need to install the appropriate input keyboard layouts for
the languages you wish to search in. These keyboard layouts are
called Input Method Editors (IMEs).
No rebooting or special privileges are needed to add them to your
system. To install them, go to:
Start Menu > Settings > Control Panel > Regional
Options (General)
Then put a check mark next to all of the languages you want to
use.
The Library of Congress currently uses the following IMEs:
Arabic (Egypt) -- Arabic (101)
Chinese (PRC) -- Chinese (Simplified) - US Keyboard
Chinese (Taiwan) -- Chinese (Traditional) - US Keyboard
Hebrew -- Hebrew
Japanese -- Japanese Input System (MS-IME2000)
Korean -- Korean (Hangul) (MS-IME-98)
Russian*
* Note that there are no Cyrillic characters in the LC Online
Catalog. All records in languages that use Cyrillic are romanized
and can only be retrieved using Roman (Latin) characters in searches.
The Library uses the Russian keyboard to search other electronic
resources. For more information on IMEs, please see:
Instructions for Macintosh
Your operating system should be OS 10.3.x or higher. Additionally,
the most current version of your browser should be used; check
the Web to be sure there are no updates available. As of May
2005 this includes Firefox 1.0.x, Safari 1.3, and Netscape
7.0. None
will work perfectly for all purposes, but all three are adequate
for most purposes.
Be sure the Lucida Grande font is installed (a default font within
OS X).

When viewing catalog records, make sure that the character
encoding for the page you are looking at is set to Unicode
(UTF-8).
Often, the browser sets the encoding automatically. You may occasionally
have to choose the setting yourself.
In Firefox - From the View pulldown menu
Select Character Encoding --> Unicode (UTF-8)
In Safari - From the View pulldown menu
Select Text Encoding --> Unicode (UTF-8)
In Netscape - From the View pulldown menu
Select Character Coding --> Unicode (UTF-8)
Support for bidirectional scripts in Mac OS X browsers depends
upon the functionality provided by the version of the browser you
use.
Copying and pasting catalog records with special diacritics,
characters, or scripts into other applications depends on the ability
of those applications to handle Unicode and bidirectional scripts.
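For readers curious about what makes a script "bidirectional," the Unicode standard assigns every character a directional class, and Python's standard `unicodedata` module exposes it. The sketch below is only an illustration: the helper name `contains_right_to_left` and the sample words are assumptions, and it does not reproduce how any particular browser lays out text.

```python
import unicodedata


def contains_right_to_left(text: str) -> bool:
    """Return True if any character has a right-to-left directional class.

    "R" is used for Hebrew letters and "AL" for Arabic-script letters.
    """
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


hebrew_word = "\u05e9\u05dc\u05d5\u05dd"  # Hebrew letters
arabic_word = "\u0643\u062a\u0627\u0628"  # Arabic letters
latin_word = "catalog"

for word in (hebrew_word, arabic_word, latin_word):
    classes = [unicodedata.bidirectional(ch) for ch in word]
    print(repr(word), classes, contains_right_to_left(word))
```

An application that ignores these classes may display pasted Hebrew, Yiddish, Arabic, or Persian text in the wrong order, which is the kind of problem the paragraph above warns about.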
Input Methods for Non-Roman Characters
Built into Mac OS X is support for inputting non-Roman
characters. To enable these features, go to System Preferences
and select "International."

Select the "Input Menu" tab, then select
input methods for the languages you need. Make sure to check the box
to "Show input menu in menu bar."

In order to use a language-specific input method, look for the
"flag" icon in the menu bar for the application you are using.
More on Non-Roman Characters on Macintosh Computers
This information comes from the Unicode Web site:
http://www.unicode.org/help/display_problems.html
On Mac OS X, the Safari Web browser includes Unicode
support, as does OmniWeb. OmniWeb does not currently provide
support for all of Unicode; it can, however, take advantage of
Unicode fonts for Windows if properly installed.
Earlier Versions
There are currently no Web browsers that provide direct Unicode
drawing (font support) on Mac OS 9.x or earlier. All the browsers
use Apple Language Kits and WorldScript to varying degrees to support
Unicode and international text.
Language Kits are installed using your Mac OS 9.x installation
CD. Launch the Mac OS Install application. Proceed through the
initial screens, selecting the appropriate boot disk. When you
reach the "Install Software" screen, click on the "Customize" button.
This opens up the Custom Installation and Removal dialog box.
Scroll down to "Language Kits." Click on the check box,
and then select "Customized Installation" from the installation
popup to the right. (It will say "None selected" at first.)
This brings up a dialog box with a list of all the available language
kits. Select the ones you want, or use the menu at the top of the
dialog box to select all of them. Proceed with the installation.
If you already have Mac OS 9.0 installed, you will be asked if
you want to add or remove software after you select the installation
disk. Click on the "Add/Remove" button. This will bring
you to the Custom Installation and Removal dialog box.
The installation procedure is the same for Mac OS 8.6, except
that you will be installing "Multilingual Internet Access" instead
of Language Kits. For Mac OS 8.5 and earlier, it will be necessary
to purchase the individual language kits.