Library of Congress >> Standards

ISO 639-2

Codes for the Representation of Names of Languages - Part 2: Alpha-3 Code


Normative Text

1 Scope

This part of ISO 639 provides two sets of three-letter alphabetic codes for the representation of names of languages, one for terminology applications and the other for bibliographic applications. The code sets are the same except for twenty-three languages that have variant language codes because of the criteria used for formulating them (see 4.1). The language codes were devised originally for use by libraries, information services, and publishers to indicate language in the exchange of information, especially in computerized systems. These codes have been widely used in the library community and may be adopted for any application requiring the expression of language in coded form by terminologists and lexicographers. The alpha-2 code set was devised for practical use for most of the major languages of the world that are most frequently represented in the total body of the world's literature. Additional language codes are created when it becomes apparent that a significant body of literature in a particular language exists. Languages designed exclusively for machine use, such as computer programming languages, are not included in this code.

2 Normative reference

The following standard contains provisions which, through reference in this text, constitute provisions of this part of ISO 639. At the time of publication, the edition indicated was valid. All standards are subject to revision, and parties to agreements based on this part of ISO 639 are encouraged to investigate the possibility of applying the most recent edition of the standard indicated below. Members of IEC and ISO maintain registers of currently valid International Standards.

ISO 3166-1:1997, Codes for the representation of names of countries and their subdivisions - Part 1: Country codes.

3 Definitions

For the purpose of this part of ISO 639, the following definitions apply:

3.1
code

data representation in different forms according to a pre-established set of rules

3.2
language code

combination of characters used to represent a language or languages

3.3
collective language code

language code used to represent a group of languages

4 Language codes

4.1 Form of the language codes

The language codes consist of three Latin-alphabet characters in lowercase. No diacritical marks or modified characters are used. Implementors should be aware that these codes are not intended to be an abbreviation for the language, but to serve as a device to identify a given language or group of languages. The language codes are derived from the language name.
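The form rule above (three lowercase Latin-alphabet characters, no diacritical marks) can be checked mechanically. The following is a minimal Python sketch; the function name is_well_formed_code is hypothetical, and passing this check says nothing about whether a code is actually assigned in the tables.

```python
import re

# Exactly three characters from the basic Latin alphabet, lowercase only.
# Accented or otherwise modified letters are outside [a-z] and are rejected.
_CODE_FORM = re.compile(r"[a-z]{3}")


def is_well_formed_code(code: str) -> bool:
    """Return True if `code` has the form of an ISO 639-2 language code.

    This only tests the form; it does not test whether the code is assigned.
    """
    return _CODE_FORM.fullmatch(code) is not None
```

For instance, is_well_formed_code("dan") holds, while "Dan", "en" and "fré" are rejected.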

Two code sets are provided, one for bibliographic applications (ISO 639-2/B), and one for terminology applications (ISO 639-2/T). Criteria for selecting the form of a language code for code set B were:

  • preference of the countries using the language
  • established usage of codes in national and international bibliographic databases, and
  • the vernacular or English form of the language.

Code set T was based on:

  • the vernacular form of the language, or
  • preference of the countries using the language.

There are twenty-three language names that have variant codes assigned depending on the code set chosen.

Future development of language codes will be based whenever possible on the vernacular form of the language, unless another language code is requested by the country or countries using the language.

The bibliographic or terminology code set must be used in its entirety, and the choice of the set used must be made clear by exchanging partners prior to information interchange. Users shall refer to ISO 639-2/B for the code set for bibliographic applications and ISO 639-2/T for the code set for terminology applications.
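Because one code set must be used in its entirety, a system that receives codes from a partner using the other set may need to convert them before storing. The Python sketch below illustrates one way to do that. The function names are hypothetical, and the mapping is deliberately an illustrative subset of four B/T pairs from the published code tables, not the full list of variant codes.

```python
# Illustrative subset of languages whose B and T codes differ.
# Keys are ISO 639-2/B codes, values are the ISO 639-2/T codes.
B_TO_T = {
    "chi": "zho",  # Chinese
    "dut": "nld",  # Dutch
    "fre": "fra",  # French
    "ger": "deu",  # German
}
T_TO_B = {t: b for b, t in B_TO_T.items()}


def to_terminology(code: str) -> str:
    """Convert a code to set T; codes shared by both sets pass through."""
    return B_TO_T.get(code, code)


def to_bibliographic(code: str) -> str:
    """Convert a code to set B; codes shared by both sets pass through."""
    return T_TO_B.get(code, code)
```

Codes that are identical in both sets, such as dan, are returned unchanged by either function.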

To ensure continuity and stability, codes shall only be changed for compelling reasons. After a change in codes, the previous code shall not be reassigned for at least five years. To accommodate large applications that build continuously, the codes in ISO 639-2/B shall not be changed if a language name or its abbreviation is changed.

When adapting this part of ISO 639 to languages using other alphabets (e.g. Cyrillic), codes shall be formed according to the principles of this part of ISO 639.

4.1.1 Collective language codes

Collective language codes are provided for languages where a relatively small number of documents exist or are expected to be written, recorded or created. The words "languages" or "(other)" as part of a language name in the following tables may be taken to indicate that a language code is a collective language code. A collective language code is not intended to be used when an individual language code or another more specific collective language code is available.

4.1.2 Special situations

The language code mul (for multiple languages) should be applied when several languages are used and it is not practical to specify all the appropriate language codes.

The language code und (for undetermined) is provided for those situations in which a language or languages must be indicated but the language cannot be identified.
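The two special codes can be sketched as a small decision rule in Python. The function name codes_for_item and the max_listed cutoff are assumptions for illustration only: the text above leaves "not practical" to the judgement of the application and does not define a numeric limit.

```python
def codes_for_item(languages, max_listed=3):
    """Choose the language codes to record for an item.

    languages:  the language codes identified for the item (may be empty).
    max_listed: hypothetical local cutoff above which listing every code
                is considered impractical (not defined by the standard).
    """
    if not languages:
        # A language must be indicated but none could be identified.
        return ["und"]
    if len(languages) > max_listed:
        # Several languages are used and listing them all is impractical.
        return ["mul"]
    return list(languages)
```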

4.1.3 Scripts and dialects

A single language code is normally provided for a language even though the language is written in more than one script. A separate standard may be developed for the purpose of designating information concerning the script or writing system of a language.

The dialect of a language is usually represented by the same language code as that used for the language. If the language is assigned to a collective language code, the dialect is assigned to the same collective language code. If the language has an individual language code, the dialect is also included in that code rather than the code for the group to which both belong. In a few instances, however, both the language and a dialect of that language have their own individual language codes.

4.1.4 Local Codes

Codes qaa through qtz are reserved for local use, including for local treatment of dialects. These codes may only be used locally, and may not be exchanged internationally.
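An exchange system might want to detect local-use codes before sending records abroad. The Python sketch below tests whether a code falls in the reserved range; the function name is_local_use_code is hypothetical.

```python
def is_local_use_code(code: str) -> bool:
    """Return True if `code` lies in the reserved range qaa through qtz."""
    well_formed = (
        len(code) == 3
        and code.isascii()
        and code.isalpha()
        and code.islower()
    )
    # For three lowercase ASCII letters, string comparison is alphabetical,
    # so the range test can be written directly.
    return well_formed and "qaa" <= code <= "qtz"
```

Codes such as qaa, qmx and qtz fall inside the range, while qua and eng do not.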

4.1.5 Ancient languages

Ancient languages that are not given individual language codes are assigned the code for the major language collective to which each belongs, rather than the code for the modern language which evolved from the ancient language. For example, Old Frisian is assigned the language code gem for the language group Germanic (Other) instead of the language code fry for the modern language Frisian.
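The assignment rule for ancient languages can be sketched as a two-step lookup in Python. The two dictionaries hold only the example given above, and the names INDIVIDUAL_CODES, COLLECTIVE_FOR_ANCIENT and assign_code are hypothetical.

```python
# Languages that have their own code (only the example from the text).
INDIVIDUAL_CODES = {"Frisian": "fry"}

# Ancient languages without an individual code, mapped to the collective
# code of the group they belong to (only the example from the text).
COLLECTIVE_FOR_ANCIENT = {"Old Frisian": "gem"}


def assign_code(language_name):
    """Return the code for a language name, or None if it is not listed."""
    if language_name in INDIVIDUAL_CODES:
        return INDIVIDUAL_CODES[language_name]
    # No individual code: use the collective code of the language group,
    # not the code of the modern descendant.
    return COLLECTIVE_FOR_ANCIENT.get(language_name)
```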

In cases of doubt, assistance may be sought from the Registration Authority.

4.2 Registration of new language codes

The Registration Authority for this part of ISO 639 shall be the Library of Congress, Washington, D.C. 20540-4102 USA (c/o Network Development and MARC Standards Office)*1. The Registration Authority for ISO 639-1 is Infoterm, Simmeringer Hauptstrasse 24, A-1110 Vienna, Austria.

4.3 Application of language codes

Language codes can be used in the following specific instances. Examples of how language codes may be used follow each.

4.3.1

To indicate the languages in which documents are or have been written or recorded.

EXAMPLE
UNIMARC Format *2, Field 101 Language of the item
A pamphlet is issued in Danish (language code dan used for Danish)
Field 101: 0#$adan
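To show how the field content above breaks down, here is a minimal Python sketch that splits the display form into its indicators and subfields. It assumes the display convention used in the example, where "#" stands for a blank indicator and "$" introduces a one-character subfield identifier. The function name parse_display_field is hypothetical, and this is not a parser for real exchange-format records.

```python
def parse_display_field(text):
    """Split a field shown in display form into (indicators, subfields).

    Each subfield is returned as an (identifier, value) pair.
    """
    indicators, _, rest = text.partition("$")
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split("$") if chunk]
    return indicators, subfields


indicators, subfields = parse_display_field("0#$adan")
# indicators is "0#"; the single subfield is ("a", "dan"),
# so the language code carried by the field is "dan".
```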

4.3.2

To indicate the languages in which document-handling records (order records, bibliographic records, and the like) have been created.

EXAMPLE
UNIMARC Format *2, Field 100 Language of cataloguing; positions 22-24
A book is written in Dutch, but the catalog record is in English (language code eng used for English)
Field 100 positions 22-24: eng
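Reading a code from fixed character positions amounts to a string slice. The Python sketch below assumes that positions are counted from 0, so that positions 22-24 correspond to the slice [22:25]. The sample value is placeholder data in which "#" fills every other position; it is not a real record, and the function name is hypothetical.

```python
def language_of_cataloguing(field_100_data: str) -> str:
    """Return the three characters at positions 22-24 (counted from 0)."""
    if len(field_100_data) < 25:
        raise ValueError("field data is too short to contain positions 22-24")
    return field_100_data[22:25]


# Placeholder data: only positions 22-24 carry a meaningful value here.
sample = "#" * 22 + "eng" + "#" * 11
code = language_of_cataloguing(sample)  # "eng"
```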

4.3.3

To indicate the language-speaking capabilities of delegates to a meeting. (Alternatively, another international standard list may be used, for example ISO 639-1.)

EXAMPLE - In a list of delegates issued at an ISO meeting, the codes eng, fre and rus indicate whether the delegates spoke in (eng) English, (fre) French, or (rus) Russian.

4.3.4

To indicate the original language of a document.

EXAMPLE - Document of the United Nations:
ST/DCS/1/Rev. 2 eng fre
[Document is bilingual.]

4.4 Application of the country code

Country codes from ISO 3166 may be combined with language codes to denote the area in which a term, phrase, or language is used.

EXAMPLES

  • a spool of thread (eng US)
  • a bobbin of cotton (eng GB)
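The combined notation in the examples above can be split into its two parts with a short Python sketch. It assumes the space-separated form shown in the examples (a three-letter lowercase language code followed by a two-letter uppercase country code) and checks only the form of each part, not whether the codes are assigned. The names LanguageArea and parse_language_area are hypothetical.

```python
from typing import NamedTuple


class LanguageArea(NamedTuple):
    language: str  # ISO 639-2 language code, e.g. "eng"
    country: str   # ISO 3166 two-letter country code, e.g. "GB"


def parse_language_area(tag: str) -> LanguageArea:
    """Parse a tag such as "eng US" into its language and country parts."""
    parts = tag.split()
    if len(parts) != 2:
        raise ValueError(f"expected 'language country', got {tag!r}")
    language, country = parts
    if not (len(language) == 3 and language.isascii()
            and language.isalpha() and language.islower()):
        raise ValueError(f"not a well-formed language code: {language!r}")
    if not (len(country) == 2 and country.isascii()
            and country.isalpha() and country.isupper()):
        raise ValueError(f"not a well-formed country code: {country!r}")
    return LanguageArea(language, country)
```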

5 Structure of the list of language codes

The list of language codes is presented in three tables:

Table 1: Alpha-3 codes arranged alphabetically by English name of language
Table 2: Alpha-3 codes arranged alphabetically by French name of language
Table 3: Alpha-3 codes arranged alphabetically by ISO 639-2 code
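The three tables are the same list of entries in three different orders. A minimal Python sketch, using five sample entries with their ISO 639-2/B codes, shows the idea. The variable names are hypothetical, and the plain string sort used here is a simplification: it orders by character code, which is not a proper alphabetical collation for French names in general.

```python
# (code, English name, French name) for a few sample entries.
LANGUAGES = [
    ("fre", "French", "français"),
    ("dan", "Danish", "danois"),
    ("rus", "Russian", "russe"),
    ("eng", "English", "anglais"),
    ("dut", "Dutch", "néerlandais"),
]

by_english_name = sorted(LANGUAGES, key=lambda entry: entry[1])  # Table 1 order
by_french_name = sorted(LANGUAGES, key=lambda entry: entry[2])   # Table 2 order
by_code = sorted(LANGUAGES, key=lambda entry: entry[0])          # Table 3 order
```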

*1 Subject to approval of ISO Council.

*2 The UNIversal MAchine Readable Cataloging format is used for exchange of bibliographic data.


Comments on this document: iso639-2@loc.gov



June 2, 2006