| ISO_639 {ISOcodes} | R Documentation |
ISO 639 Language Codes
Description
International Organization for Standardization (ISO) codes for the representation of languages. The standard consists of four parts, with further parts in progress. ISO 639-1 consists of 185 two-letter (alpha-2) codes used to identify the world's major languages. ISO 639-2 has three-letter (alpha-3) codes for 485 languages. ISO 639-3 extends the ISO 639-2 alpha-3 codes with the aim of covering all known natural languages. ISO 639-5 defines alpha-3 codes for language families.
Usage
ISO_639_2
ISO_639_3
ISO_639_3_Retirements
ISO_639_5
Format
ISO_639_2 is a character data frame with variables
Alpha_3_B and Alpha_3_T (the ISO 639-2 bibliographic and
terminological codes), Alpha_2 (the corresponding ISO 639-1
alpha-2 code if available), and Name (the English name of the
language).
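As a minimal sketch of how the four columns relate, the block below looks up a language by its ISO 639-1 code. The three rows are a hand-typed excerpt with the same column names as ISO_639_2, and lookup_alpha2 is a hypothetical helper, not part of the package; with ISOcodes installed, the same call should work on ISOcodes::ISO_639_2.

```r
# Hand-typed excerpt with the same columns as ISO_639_2 (illustration only).
# With the package installed, use ISOcodes::ISO_639_2 instead.
iso2 <- data.frame(
  Alpha_3_B = c("eng", "fre", "ger"),
  Alpha_3_T = c("eng", "fra", "deu"),
  Alpha_2   = c("en", "fr", "de"),
  Name      = c("English", "French", "German"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the rows whose ISO 639-1 code matches.
# which() drops NA comparisons, because Alpha_2 is only given "if available".
lookup_alpha2 <- function(tab, code) {
  tab[which(tab$Alpha_2 == code), , drop = FALSE]
}

german <- lookup_alpha2(iso2, "de")
print(german)
```

Wrapping the comparison in which() matters on the full table, where rows without an alpha-2 code would otherwise produce NA row indices.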
ISO_639_3 is a data frame with the following variables:
Id: a character vector with the ISO 639-3 3-letter (alpha-3) identifiers.
Part2B: a character vector with the equivalent ISO 639-2 B-code identifiers of the bibliographic applications code set (if existent).
Part2T: a character vector with the equivalent ISO 639-2 T-code identifiers of the terminology applications code set (if existent).
Part1: a character vector with the equivalent ISO 639-1 2-letter (alpha-2) identifiers (if existent).
Scope: a factor with levels "I" (Individual), "M" (Macrolanguage) and "S" (Special).
Type: a factor with levels "L" (Living languages), "E" (Extinct languages), "A" (Ancient languages), "H" (Historic languages), "C" (Constructed languages), and "S" (Special).
Name: a character vector with the reference language names.
Comment: a character vector with a comment relating to one or more of the other variables.
Family: a character vector with the generic English names of the languages' family or macrolanguage.
eng: a character vector with the language names in English.
fra: a character vector with the language names in French (if available).
spa: a character vector with the language names in Spanish (if available).
zho: a character vector with the language names in Chinese (if available).
rus: a character vector with the language names in Russian (if available).
deu: a character vector with the language names in German (if available).
Variables Family and eng to deu are extracted
from the Wikipedia ISO 639-3 language codes pages.
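The Scope and Type factors can be combined to select subsets of ISO_639_3. The sketch below uses a hand-typed four-row excerpt with a subset of the columns described above; the object names are illustrative, and with the package installed ISOcodes::ISO_639_3 would take the place of iso3.

```r
# Hand-typed excerpt with a subset of the ISO_639_3 columns (illustration only).
iso3 <- data.frame(
  Id    = c("ara", "deu", "epo", "zho"),
  Part1 = c("ar", "de", "eo", "zh"),
  Scope = factor(c("M", "I", "I", "M"), levels = c("I", "M", "S")),
  Type  = factor(c("L", "L", "C", "L"),
                 levels = c("L", "E", "A", "H", "C", "S")),
  Name  = c("Arabic", "German", "Esperanto", "Chinese"),
  stringsAsFactors = FALSE
)

# Identifiers of macrolanguages.
macro <- iso3$Id[iso3$Scope == "M"]

# Identifiers of individual living languages.
living_individual <- iso3$Id[iso3$Scope == "I" & iso3$Type == "L"]

# Cross-tabulation of scope against type; unused levels appear as zero counts.
counts <- table(Scope = iso3$Scope, Type = iso3$Type)
print(counts)
```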
ISO_639_3_Retirements is a data frame giving the languages
retired from ISO 639-3, with variables:
Id: a character vector with the retired codes.
Ret_Reason: a factor with levels "C" (change), "D" (duplicate), "N" (non-existent), "S" (split), and "M" (merge).
Change_To: a character vector which in the cases of C, D, and M gives the identifier to which all instances of the Id should be changed.
Ret_Remedy: a character vector with instructions for updating an instance of the retired (split) identifier.
Effective: a Date object giving the date the retirement became effective.
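One possible use of the retirements table is to update old identifiers in existing data. The sketch below is an assumption about how that could be done, not a function provided by the package: update_codes is a hypothetical helper, and the rows are invented, using identifiers from the reserved local-use range so that no claim is made about real retirements.

```r
# Invented rows with the column layout of ISO_639_3_Retirements.
# The identifiers come from the local-use range and are not real retirements.
ret <- data.frame(
  Id         = c("qaa", "qab", "qac"),
  Ret_Reason = factor(c("C", "S", "M"), levels = c("C", "D", "N", "S", "M")),
  Change_To  = c("qba", NA, "qbb"),
  Ret_Remedy = c(NA, "Split into qbc and qbd", NA),
  Effective  = as.Date(c("2010-01-01", "2012-01-01", "2015-01-01")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace retired codes that have a single successor
# (reasons C, D and M). Split (S) and non-existent (N) codes are left as they
# are, because they need a manual decision based on Ret_Remedy.
update_codes <- function(codes, ret) {
  idx <- match(codes, ret$Id)
  direct <- !is.na(idx) &
    ret$Ret_Reason[idx] %in% c("C", "D", "M") &
    !is.na(ret$Change_To[idx])
  codes[direct] <- ret$Change_To[idx[direct]]
  codes
}

updated <- update_codes(c("qaa", "qab", "qac", "deu"), ret)
print(updated)
```

The helper applies a single replacement step; if a successor code were itself retired later, the call would have to be repeated.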
ISO_639_5 is a data frame with the following variables:
Id: a character vector with the 3-letter (alpha-3) ISO 639-5 identifiers.
English_Name: the family names in English.
French_Name: the family names in French.
Part2: a factor indicating how the family relates to 639-2, with levels "g" (group: consists of several related languages), "r" (rest group: a group of several related languages, from which some specific languages have been excluded), or "" (no 639-2 code).
Hierarchy: an indication of which other language families or groups the current language family or group is a member of (given as 639-5 ids separated by ‘ : ’).
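The Hierarchy strings can be split on the ‘ : ’ separator to recover the chain of enclosing families. The following sketch works on a hand-typed three-row excerpt and assumes that each path ends with the family's own identifier, as in those rows; ancestors is a hypothetical helper, and ISOcodes::ISO_639_5 would replace iso5 when the package is installed.

```r
# Hand-typed excerpt with a subset of the ISO_639_5 columns (illustration only).
iso5 <- data.frame(
  Id           = c("ine", "gem", "gmw"),
  English_Name = c("Indo-European languages", "Germanic languages",
                   "West Germanic languages"),
  Hierarchy    = c("ine", "ine : gem", "ine : gem : gmw"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: identifiers of the families that enclose `id`,
# ordered from the outermost family inwards.
ancestors <- function(tab, id) {
  path <- strsplit(tab$Hierarchy[match(id, tab$Id)], " : ", fixed = TRUE)[[1]]
  setdiff(path, id)  # drop the family's own identifier if it is in the path
}

west_germanic_parents <- ancestors(iso5, "gmw")
print(west_germanic_parents)
```

Using fixed = TRUE treats the separator as a literal string rather than a regular expression.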
Details
While most languages are given one code by the ISO 639-2 standard, twenty-two of the languages described have two three-letter codes: a “bibliographic” code (ISO 639-2/B, B-code), which is derived from the English name for the language and was a necessary legacy feature, and a “terminological” code (ISO 639-2/T, T-code), which is derived from the native name for the language. The range ‘qaa’ to ‘qtz’ is reserved for local use.
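To illustrate the B-code and T-code distinction and the local-use range, the sketch below converts bibliographic codes to terminological ones and tests whether a code lies between ‘qaa’ and ‘qtz’. Both helpers are hypothetical, and the data frame is a hand-typed excerpt with the ISO_639_2 column names rather than the full table.

```r
# Hand-typed excerpt with the B-code and T-code columns of ISO_639_2.
iso2 <- data.frame(
  Alpha_3_B = c("eng", "fre", "ger"),
  Alpha_3_T = c("eng", "fra", "deu"),
  stringsAsFactors = FALSE
)

# Rows whose bibliographic and terminological codes differ.
two_codes <- iso2[which(iso2$Alpha_3_B != iso2$Alpha_3_T), ]

# Hypothetical helper: map B-codes to T-codes; codes that are not found
# among the B-codes are returned unchanged.
to_terminological <- function(codes, tab) {
  idx <- match(codes, tab$Alpha_3_B)
  ifelse(is.na(idx), codes, tab$Alpha_3_T[idx])
}

# Hypothetical helper: TRUE for codes in the local-use range qaa to qtz,
# i.e. "q", then a letter from a to t, then any letter from a to z.
is_local_use <- function(codes) {
  grepl("^q[a-t][a-z]$", codes)
}

converted <- to_terminological(c("ger", "fre", "eng", "deu"), iso2)
local <- is_local_use(c("qaa", "qtz", "qua", "deu"))
```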
ISO 639-3 is a superset of ISO 639-1 and of the individual languages in ISO 639-2. ISO 639-1 and ISO 639-2 focus on major languages, those most frequently represented in the total body of the world's literature. Because ISO 639-2 also includes language collections, which ISO 639-3 does not, ISO 639-3 is not a superset of ISO 639-2 as a whole. Where B and T codes exist in ISO 639-2, ISO 639-3 uses the T-codes.
ISO 639-2 contains codes for some individual languages and some language groups, so any code in it is in either ISO 639-3 or ISO 639-5. Conversely, not every ISO 639-5 family has an ISO 639-2 code.
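These relationships between the parts can be expressed with set operations on the identifier columns. The vectors below are small hand-typed excerpts, so the results only illustrate the technique; on the real data the inputs would be, under the column layout described in Format, ISO_639_2$Alpha_3_T, ISO_639_3$Id and ISO_639_5$Id.

```r
# Hand-typed identifier excerpts (illustration only, far from complete).
iso2_ids <- c("deu", "fra", "gem", "ine")  # ISO 639-2 T-codes
iso3_ids <- c("bar", "deu", "fra")         # ISO 639-3 identifiers
iso5_ids <- c("gem", "gmw", "ine")         # ISO 639-5 identifiers

# Every ISO 639-2 code should be found in ISO 639-3 or ISO 639-5.
covered <- all(iso2_ids %in% union(iso3_ids, iso5_ids))

# ISO 639-2 codes that denote language collections (shared with ISO 639-5).
collections <- intersect(iso2_ids, iso5_ids)

# Identifiers without an ISO 639-2 code.
only_part3 <- setdiff(iso3_ids, iso2_ids)
only_part5 <- setdiff(iso5_ids, iso2_ids)
```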
Source
https://www.loc.gov/standards/iso639-2/ for ISO 639-2;
https://iso639-3.sil.org/code_tables/download_tables for ISO 639-3;
https://www.loc.gov/standards/iso639-5/ for ISO 639-5.
References
https://en.wikipedia.org/wiki/ISO_639