An unclassified language is a language whose genetic affiliation to other languages has not been established. A language may be unclassified for several reasons: most often for lack of reliable data, but sometimes because of the confounding influence of language contact, when different layers of its vocabulary or morphology point in different directions and it is not clear which layer represents the ancestral form of the language. Some poorly known extinct languages, such as Gutian, are simply unclassifiable, and it is unlikely the situation will ever change.
A supposedly unclassified language may turn out not to be a language at all, or even a distinct dialect, but merely a family, tribal or village name, or an alternative name for a people or language that is classified.
If a language's genetic relationship has not been established after significant documentation of the language and comparison with other languages and families, as in the case of Basque in Europe, it is considered a language isolate – that is, it is classified as a language family of its own. An 'unclassified' language therefore is one which may still turn out to belong to an established family once better data is available or more thorough comparative research is done. Extinct unclassified languages for which little evidence has been preserved are likely to remain in limbo indefinitely, unless lost documents or a surviving speaker population is discovered.
Classification challenges
Mimi of Decorse, in Chad, is an example of a language that has posed several problems for classification. It is attested only in a single word list collected around 1900. At first it was thought to be a Maban language, because of similarities to Maba, the first Maban language to be described. However, as other languages of the Maban family were described, it became clear that the similarities were solely with Maba itself, and that any genetic relationship would be too distant for Mimi to be closer to Maba than to the other Maban languages. The obvious similarities are therefore now thought to be due to borrowings from Maba, which is the socially dominant language in the area. When such loans are discounted, far less data remains with which to classify Mimi, and what does remain is not particularly similar to any other language or language family.
- Guale and Yamasee (US)
- Himarimã (Brazil) – a living presumed language of an uncontacted people; assumed to be Arawan
- Nagarchal (India) – assumed to have been Dravidian
- Kwisi (Angola)
- Ancient Cappadocian (Asia Minor) – possibly Anatolian
- Lycaonian (Asia Minor) – possibly Anatolian
- Zapotec (Jalisco) (Mexico)
- Otomi (Jalisco) (Mexico)
- Moksela (Indonesia) – possibly one of the Central Maluku languages
- Gomba (Ethiopia)
- Palumata (Maluku) – perhaps a dialect of Hukumina
- Giyug (Australia) – possibly Wagaydyic
- Karranga (Australia) – likely Pama–Nyungan
- Yugul (Australia) – likely Marran
- Aguano (Peru) – may be Arawakan
- Chachapoya (Peru) – known only from possible toponyms and family names, not any actual source of vocabulary
- Alagüilac (Guatemala) – may be related to Xinca
- Avoyel (Louisiana)
- Flecheiro (Brazil) – assumed to be Katukinan
- Janambre (Mexico)
- Jumanos (Texas and New Mexico)
- Majena (Bolivia)
- Moneton (West Virginia) – likely Siouan
- Opelousa (Louisiana)
- Pedee (South Carolina) – possibly Siouan
- Tremembé (Brazil)
- Truká (Brazil)
- Wakoná (Brazil)
- Wasu (Brazil)
Scarcity of data
Many unclassified languages are also considered unclassifiable because some data survive, but not enough to reveal their close relatives. For others there may be enough data to show that the language belongs to a particular family, but not where within it, or to show that the language has no close relatives, but not enough to conclude that it is a language isolate.
- Solano (Mexico) – possibly a language isolate
- Quimbaya (Colombia) – if it existed; only 10 known words
- Nam (Chinese–Tibetan border) – data remains undeciphered; probably Sino-Tibetan
- Kujargé (Chad) – possibly Afroasiatic
- Quingnam (Peru) – known only from a list of unlabeled numerals
- Bung (Cameroon) – most likely Niger–Congo
- Luo (Cameroon)
- Komta (Nigeria)
- Wawu (Ghana or possibly the Ivory Coast)
- Kambojan (South Asia and Central Asia)
- (West Africa)
- Dima (Ethiopia)
- Philistine (Southern Levant) – perhaps either Afroasiatic or Indo-European
- Iberian (Spain and southern France)
- Minoan (ancient Crete) – existence attested through inscriptions in the undeciphered Linear A script. Attempts to link Minoan to the nearby Afroasiatic and Indo-European language families have been inconclusive.
- Eteocretan (ancient Crete)
- Hattic (Anatolia) – probably a language isolate
- Kaskian (Anatolia) – possibly related to Hattic
- Kassite (Iraq) – possibly Hurro-Urartian
- Gutian (Zagros borderlands)
- Hunnic (Eastern Europe and Central Asia) – no known written records and minimal attested vocabulary apart from given names. Because the Hunnic Empire was multi-ethnic, the majority of its inhabitants were non-Hunnic tribes speaking Indo-European and Turkic languages.
- Xiongnu (Mongolia) – possibly Para-Yeniseian, Turkic or an isolate
- Tuoba (China) – possibly Para-Mongolic or an isolate
- Rouran (Mongolia) – possibly Para-Mongolic or an isolate
- Beothuk (Newfoundland) – assumed to have been related to Algonquian languages
- Meroitic (Sudan) – possibly Nilo-Saharan or Afroasiatic
- Guanahatabey (Cuba) – known only from toponyms
- Macorix (Dominican Republic and possibly Haiti)
- Pankararú (Brazil)
- Bagua (Peru)
- Chirino (Peru) – probably related to Candoshi-Shapra
- Copallín (Peru)
- Ramanos (Bolivia)
- Tartessian (southwest Iberian Peninsula)
- Ligurian (ancient) (Liguria) – probably Indo-European
- Rutulian (central Italy)
- Elymian (western Sicily) – likely Indo-European
- Sicanian (central Sicily)
- Eteocypriot (Cyprus)
- Tambora (Indonesia) – possibly a language isolate
- Karami (Papua New Guinea)
- Makolkol (New Britain)
- Ambermo (Indonesia)
- Xocó (Brazil) – not clear if it was a single language
Unrelated to nearby languages and not commonly examined
- Bangime (Mali)
- Jalaa (Nigeria)
- Kwaza (Brazil)
- Mpre (Ghana)
Basic vocabulary unrelated to other languages
- Bayot (Senegal)
- Laal (Chad)
Not closely related to other languages and no academic consensus
- Ongota (Ethiopia)
- Shabo (Ethiopia)
- Omaio (Tanzania)
- Kenaboi (Malaysia)
Languages of dubious existence
- Oropom (Uganda) – extinct, if it existed
- Imeraguen (Mauritania) – a Hassaniya Arabic variety with Berber words for fishing
- Nemadi (Mauritania)
- Rer Bare (Ethiopia) – extinct, if it existed
- Wutana (Nigeria) – extinct, if it existed
- Trojan (Anatolia) – extinct and as yet unattested, if it existed; possibly a Luwian dialect or related language
- North Picene (Italy) – extinct, if it existed; attested in inscriptions that have been alleged to be fabricated
Some 'languages' turn out to be fabricated, such as the Kukurá language of Brazil.
See also
- List of unclassified languages according to the Ethnologue
- List of unclassified languages of North America
- List of unclassified languages of South America
- Language isolate
- List of language families (including isolates and unclassified languages)
External links
- Ethnologue: Unclassified languages
