Codes for the Representation of Names of Languages-Part 2: Alpha-3 Code
1 Scope
This part of ISO 639 provides two sets of three-letter alphabetic codes
for the representation of names of languages, one for terminology applications
and the other for bibliographic applications. The code sets are the same
except for twenty-three languages that have variant language codes because
of the criteria used for formulating them (see 4.1).
The language codes were devised originally for use by libraries, information
services, and publishers to indicate language in the exchange of information,
especially in computerized systems. These codes have been widely used
in the library community and may be adopted for any application requiring
the expression of language in coded form by terminologists and lexicographers.
The alpha-2 code set was devised for practical use for most of the major
languages of the world that are most frequently represented in the total
body of the world's literature. Additional language codes are created
when it becomes apparent that a significant body of literature in a particular
language exists. Languages designed exclusively for machine use, such
as computer programming languages, are not included in this code.
2 Normative reference
The following standard contains provisions which, through reference
in this text, constitute provisions of this part of ISO 639. At the time
of publication, the edition indicated was valid. All standards are subject
to revision, and parties to agreements based on this part of ISO 639
are encouraged to investigate the possibility of applying the most recent
edition of the standard indicated below. Members of IEC and ISO maintain
registers of currently valid International Standards.
ISO 3166-1:1997, Codes for the representation of names of countries
and their subdivisions-Part 1: Country codes.
3 Definitions
For the purpose of this part of ISO 639, the following definitions apply:
code
data representation in different forms according to a pre-established
set of rules
language code
combination of characters used to represent a language or languages
collective language code
language code used to represent a group of languages
The language codes consist of three Latin-alphabet characters in lowercase.
No diacritical marks or modified characters are used. Implementors should
be aware that these codes are not intended to be an abbreviation for
the language, but to serve as a device to identify a given language or
group of languages. The language codes are derived from the language name.
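The form described above (three lowercase Latin letters, no diacritical marks) can be checked mechanically. The following is a minimal sketch; the function name is hypothetical, and it tests only the form of a code, not whether the code is actually assigned in the tables.

```python
import re

# Three lowercase Latin letters, no diacritics or modified characters.
_CODE_FORM = re.compile(r"[a-z]{3}")


def is_well_formed(code: str) -> bool:
    """Return True if `code` has the form of an ISO 639-2 language code.

    This checks the form only; it does not consult the code tables.
    """
    return _CODE_FORM.fullmatch(code) is not None
```

A well-formed string is not necessarily an assigned code, so a check of this kind would normally be followed by a lookup in the code list.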
Two code sets are provided, one for bibliographic applications (ISO
639-2/B), and one for terminology applications (ISO 639-2/T). Criteria
for selecting the form of a language code for code set B were:
- preference of the countries using the language,
- established usage of codes in national and international bibliographic
databases, and
- the vernacular or English form of the language.
Code set T was based on:
- the vernacular form of the language, or
- preference of the countries using the language.
There are twenty-three language names that have variant codes assigned
depending on the code set chosen.
Future development of language codes will be based whenever possible
on the vernacular form of the language, unless another language code
is requested by the country or countries using the language.
The bibliographic or terminology code set must be used in its entirety,
and the choice of the set used must be made clear by exchanging partners
prior to information interchange. Users shall refer to ISO 639-2/B for
the code set for bibliographic applications and ISO 639-2/T for the code
set for terminology applications.
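Because the two code sets differ only for the variant languages, conversion between them can be sketched as a small lookup table with a pass-through for all other codes. The table below is a partial, illustrative subset of the variant pairs, not the full list, and the function names are hypothetical.

```python
# Partial, illustrative subset of the variant pairs (B code -> T code).
# The complete list must be taken from the code tables.
B_TO_T = {
    "chi": "zho",  # Chinese
    "cze": "ces",  # Czech
    "dut": "nld",  # Dutch
    "fre": "fra",  # French
    "ger": "deu",  # German
    "gre": "ell",  # Greek, Modern
    "wel": "cym",  # Welsh
}
T_TO_B = {t: b for b, t in B_TO_T.items()}


def to_terminology(code: str) -> str:
    """Map a bibliographic (B) code to its terminology (T) form."""
    return B_TO_T.get(code, code)


def to_bibliographic(code: str) -> str:
    """Map a terminology (T) code to its bibliographic (B) form."""
    return T_TO_B.get(code, code)
```

Codes absent from the table pass through unchanged. With a partial table that is correct only for codes that are identical in both sets, so a real application would load every variant pair.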
To ensure continuity and stability, codes shall only be changed for
compelling reasons. After a change in codes, the previous code shall
not be reassigned for at least five years. To accommodate large applications
that build continuously, the codes in ISO 639-2/B shall not be changed
if a language name or its abbreviation is changed.
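The five-year rule can be expressed as a simple check. This sketch assumes, for illustration only, that dates are tracked at the granularity of calendar years; the names are hypothetical.

```python
# Minimum period before a changed code may be assigned to another language.
MIN_YEARS_BEFORE_REASSIGNMENT = 5


def may_reassign(year_changed: int, current_year: int) -> bool:
    """Return True once at least five years have passed since the change.

    Assumption: whole calendar years are a sufficient granularity.
    """
    return current_year - year_changed >= MIN_YEARS_BEFORE_REASSIGNMENT
```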
When adapting this part of ISO 639 to languages using other alphabets
(e.g. Cyrillic), codes shall be formed according to the principles of
this part of ISO 639.
4.1.1 Collective language codes
Collective language codes are provided for languages where a relatively
small number of documents exist or are expected to be written, recorded
or created. The words "languages" or "(other)" as part
of a language name in the following tables may be taken to indicate that
a language code is a collective language code. A collective language
code is not intended to be used when an individual language code or another
more specific collective language code is available.
4.1.2 Special situations
The language code mul (for multiple languages) should
be applied when several languages are used and it is not practical to
specify all the appropriate language codes.
The language code und (for undetermined) is provided
for those situations in which a language or languages must be indicated
but the language cannot be identified.
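The choice between individual codes, mul and und can be sketched as follows. The threshold `max_listed` is a hypothetical stand-in for "not practical to specify all the appropriate language codes"; this part of ISO 639 does not fix a number, and an empty input is used here to represent a language that could not be identified.

```python
from typing import List, Sequence

MULTIPLE = "mul"       # several languages, too many to list
UNDETERMINED = "und"   # language required but not identifiable


def codes_to_record(identified: Sequence[str], max_listed: int = 3) -> List[str]:
    """Choose the codes to record for an item.

    `identified` holds the codes of the languages that were identified.
    `max_listed` is an assumed local limit, not a value from the standard.
    """
    if not identified:
        return [UNDETERMINED]
    if len(identified) > max_listed:
        return [MULTIPLE]
    return list(identified)
```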
4.1.3 Scripts and dialects
A single language code is normally provided for a language even though
the language is written in more than one script. A separate standard
may be developed for the purpose of designating information concerning
the script or writing system of a language.
The dialect of a language is usually represented by the same language
code as that used for the language. If the language is assigned to a
collective language code, the dialect is assigned to the same collective
language code. If the language has an individual language code, the dialect
is also included in that code rather than the code for the group to which
both belong. In a few instances, however, both the language and a dialect
of that language have their own individual language codes.
4.1.4 Local codes
Codes qaa through qtz are reserved for local use, including for local
treatment of dialects. These codes may only be used locally, and may
not be exchanged internationally.
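An application that exchanges data internationally may wish to detect local-use codes before export. The range qaa through qtz corresponds to a first letter q, a second letter from a to t, and any third letter. A minimal sketch, with a hypothetical function name:

```python
import re

# qaa..qtz: 'q', then a letter a-t, then any letter a-z.
_LOCAL_USE = re.compile(r"q[a-t][a-z]")


def is_local_use(code: str) -> bool:
    """Return True if `code` lies in the range reserved for local use."""
    return _LOCAL_USE.fullmatch(code) is not None
```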
4.1.5 Ancient languages
Ancient languages that are not given individual language codes are assigned
the code for the major language collective to which each belongs, rather
than the code for the modern language which evolved from the ancient
language. For example, Old Frisian is assigned the language
code gem for the language group Germanic (Other) instead
of the language code fry for the modern language Frisian.
In cases of doubt, assistance may be sought from the Registration Authority.
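The assignment rule for ancient languages can be sketched as a two-step lookup: an individual code is used when one exists, otherwise the code of the collective to which the language belongs. The two dictionaries below contain only the example given above and are hypothetical stand-ins for the full tables.

```python
from typing import Optional

# Illustrative entries only, taken from the example in the text.
INDIVIDUAL_CODES = {"Frisian": "fry"}
COLLECTIVE_FOR = {"Old Frisian": "gem"}  # Germanic (Other)


def assign_code(language_name: str) -> Optional[str]:
    """Return the individual code if one exists, else the collective code.

    Returns None when neither table has an entry; such cases are the ones
    in which the Registration Authority may be consulted.
    """
    if language_name in INDIVIDUAL_CODES:
        return INDIVIDUAL_CODES[language_name]
    return COLLECTIVE_FOR.get(language_name)
```

Note that Old Frisian resolves to gem and not to fry: the lookup never falls back to the code of the modern language.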
4.2 Registration of new language codes
The Registration Authority for this part of ISO 639 shall be the Library
of Congress, Washington, D.C. 20540-4102 USA (c/o Network Development
and MARC Standards Office)*1. The Registration
Authority for ISO 639-1 is Infoterm, Simmeringer Hauptstrasse 24, A-1110
4.3 Application of language codes
Language codes can be used in the following specific instances. Examples
of how language codes may be used follow each.
To indicate the languages in which documents are or have been written
UNIMARC Format *2, Field 101 Language of the
A pamphlet is issued in Danish (language code dan used for Danish)
Field 101: 0#$adan
To indicate the languages in which document-handling records (order
records, bibliographic records, and the like) have been created.
UNIMARC Format *2, Field 100 Language of cataloguing;
A book is written in Dutch, but the catalog record is in English
(language code eng used for English)
Field 100 positions 22-24: eng
To indicate the language-speaking capabilities of delegates to a meeting.
(Alternatively, another international standard list may be used, for
example ISO 639-1.)
EXAMPLE - In a list of delegates issued at an ISO meeting,
the codes eng, fre and rus indicate whether
the delegates spoke in (eng) English, (fre) French, or (rus) Russian.
To indicate the original language of a document.
EXAMPLE - Document of the United Nations:
ST/DCS/1/Rev. 2 eng fre
[Document is bilingual.]
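The Field 101 example above (0#$adan) can be read programmatically. The sketch below assumes the textual notation used in that example: indicator characters first, then subfields each introduced by "$" and a one-character subfield code. It is not a parser for the UNIMARC exchange format itself, and the function name is hypothetical.

```python
from typing import List, Tuple


def parse_field(text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split a field written as '<indicators>$<code><value>...'.

    Returns the indicator string and a list of (subfield code, value) pairs.
    Assumes '$' appears only as the subfield delimiter.
    """
    indicators, _, rest = text.partition("$")
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split("$") if chunk]
    return indicators, subfields


indicators, subfields = parse_field("0#$adan")
# Language codes carried in subfield $a of the field.
languages = [value for code, value in subfields if code == "a"]
```

For the pamphlet in Danish, `languages` contains the single code dan.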
4.4 Application of the country code
Country codes from ISO 3166 may be combined with language codes to denote
the area in which a term, phrase, or language is used.
- a spool of thread (eng US)
- a bobbin of cotton (eng GB)
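The combination shown in these examples can be sketched as follows. The space-separated layout copies the examples above; the use of two-letter uppercase country codes follows ISO 3166-1 alpha-2. The function name and the validation are illustrative assumptions, since this part of ISO 639 does not prescribe a particular syntax for the combination.

```python
import re


def language_with_country(language_code: str, country_code: str) -> str:
    """Combine a language code and a country code as in 'eng US'.

    Checks form only: three lowercase letters, then two uppercase letters.
    """
    if re.fullmatch(r"[a-z]{3}", language_code) is None:
        raise ValueError(f"not a three-letter language code: {language_code!r}")
    if re.fullmatch(r"[A-Z]{2}", country_code) is None:
        raise ValueError(f"not a two-letter country code: {country_code!r}")
    return f"{language_code} {country_code}"
```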
5 Structure of the list of language codes
The list of language codes is presented in three tables:
Table 1: Alpha-3 codes arranged alphabetically
by English name of language
Table 2: Alpha-3 codes arranged alphabetically
by French name of language
Table 3: Alpha-3 codes arranged alphabetically
by ISO 639-2 code
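The three tables are the same set of entries under three sort orders. A minimal sketch with four sample entries (bibliographic codes are used; the type and variable names are hypothetical):

```python
from typing import NamedTuple


class Entry(NamedTuple):
    code: str     # alpha-3 code (bibliographic form used here)
    english: str  # English name of language
    french: str   # French name of language


ENTRIES = [
    Entry("ger", "German", "allemand"),
    Entry("eng", "English", "anglais"),
    Entry("fre", "French", "français"),
    Entry("dan", "Danish", "danois"),
]

# Plain string comparison is an approximation of alphabetical order;
# accented names may need locale-aware collation in a full list.
table_1 = sorted(ENTRIES, key=lambda e: e.english)  # by English name
table_2 = sorted(ENTRIES, key=lambda e: e.french)   # by French name
table_3 = sorted(ENTRIES, key=lambda e: e.code)     # by code
```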
*1 Subject to approval of ISO Council.
*2 The UNIversal MAchine Readable Cataloging format is
used for exchange of bibliographic data.