Eurasiatic languages

Eurasiatic is a proposed language macrofamily that would include many language families historically spoken in northern, western, and southern Eurasia. The hypothesis holds that most languages spoken in northern Eurasia descend from a common language spoken around the end of the last ice age.

The idea of a Eurasiatic superfamily dates back more than 100 years. Joseph Greenberg's proposal, dating to the 1990s, is the most widely discussed version. In 2013, Mark Pagel and three colleagues published what they believe to be statistical evidence for a Eurasiatic language family.

The branches of Eurasiatic vary between proposals, but typically include the highly controversial Altaic macrofamily (composed in part of Mongolic, Tungusic and Turkic), Chukchi-Kamchatkan, Eskimo–Aleut, Indo-European, and Uralic, although Greenberg uses the controversial Uralic-Yukaghir classification instead. Other branches sometimes included are the Kartvelian and Dravidian families, as proposed by Pagel et al., in addition to the language isolates Nivkh, Etruscan and Greenberg's "Korean–Japanese–Ainu". Some proposals group Eurasiatic with even larger macrofamilies, such as Nostratic; here too, many professional linguists regard the methods used as invalid.

It is often contested whether sufficient phonetic traces can survive the large time depths at which long-range proposals such as Eurasiatic are situated, although proponents maintain that small traces can still be recovered. The hypothesis has fallen out of favour and retains only limited acceptance, predominantly among a minority of Russian linguists. Most linguists worldwide reject Eurasiatic and many other macrofamily hypotheses.

History of the concept

In 1994, Merritt Ruhlen asserted that a Eurasiatic language family is supported by a specific grammatical pattern involving distinct suffixes for plural and dual noun forms, which he said does not appear in languages outside the proposed Eurasiatic superfamily. The pattern itself had been observed long before: Rasmus Rask described it in 1818 within the Uralic and Eskimo–Aleut groups.

In 1998, Joseph Greenberg extended his work in mass comparison, a methodology he first proposed in the 1950s to categorize the languages of Africa, to suggest a Eurasiatic language family. In 2000, he expanded his argument for Eurasiatic into a full-length book, Indo-European and Its Closest Relatives: The Eurasiatic Language Family, in which he outlines both phonetic and grammatical evidence that he feels demonstrates the validity of the language family. The heart of his argument is a set of 72 morphological features that he judges to be common across the various language families he examines. Of the many variant proposals, Greenberg's has attracted the most academic attention.

Stefan Georg and Alexander Vovin, who, unlike many of their colleagues, do not stipulate a priori that attempts to find ancient relationships are bound to fail, examined Greenberg's claims in detail. They state that Greenberg's morphological arguments are the correct approach to determining families, but doubt his conclusions. They write "[Greenberg's] 72 morphemes look like massive evidence in favour of Eurasiatic at first glance. If valid, few linguists would have the right to doubt that a point has been made [...] However, closer inspection [...] shows too many misinterpretations, errors and wrong analyses [...] these allow no other judgement than that [Greenberg's] attempt to demonstrate the validity of his Eurasiatic has failed."

In the 1980s, the Russian linguist Andreev's Boreal hypothesis linked the Indo-European, Uralic, Turkic, Mongolic, and Tungusic language families, with Koreanic added in his later papers. Andreev also proposed 203 lexical roots for his hypothesized Boreal macrofamily. After Andreev's death in 1997, the Boreal hypothesis was further expanded by Sorin Paliga (2003, 2007).

Pagel et al.

In 2013, Mark Pagel, Quentin D. Atkinson, Andreea S. Calude, and Andrew Meade published statistical evidence that attempts to overcome these objections. According to their earlier work, most words exhibit a "half-life" of between 2,000 and 4,000 years, consistent with existing theories of linguistic replacement. However, they also identified some words – numerals, pronouns, and certain adverbs – that exhibit a much slower rate of replacement with half-lives of 10,000 to 20,000 or more years. Drawing from research in a diverse group of modern languages, the authors were able to show the same slow replacement rates for key words regardless of current pronunciation. They conclude that a stable core of largely unchanging words is a common feature of all human discourse, and model replacement as inversely proportional to usage frequency.
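The half-life figures above can be made concrete with a short calculation. The following is a minimal sketch, assuming a constant replacement rate (simple exponential decay), which is a simplification and not necessarily the model the authors fitted. The function names are hypothetical and chosen for illustration only.

```python
import math


def survival_probability(years: float, half_life: float) -> float:
    """Probability that a word has not been replaced after `years`,
    assuming a constant replacement rate (simple exponential decay)."""
    return 0.5 ** (years / half_life)


def replacement_rate(half_life: float) -> float:
    """Per-year replacement rate implied by a half-life: ln(2) / half_life."""
    return math.log(2) / half_life


if __name__ == "__main__":
    # Half-lives quoted in the text: 2,000-4,000 years for most words,
    # 10,000-20,000 years for the slowest-changing ones.
    for half_life in (2_000, 4_000, 10_000, 20_000):
        p = survival_probability(15_000, half_life)
        print(f"half-life {half_life:>6} years: "
              f"P(unreplaced after 15,000 years) = {p:.3f}")
```

Under this simplified model, a word with a 2,000-year half-life has well under a 1% chance of surviving 15,000 years unreplaced, while a word with a 20,000-year half-life has roughly a 59% chance. This illustrates why only the slowest-changing words could plausibly carry a signal at the proposed time depth.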

Words were separated into groupings based on how many language families appeared to be cognate for the word. Among the 188 words, cognate class sizes ranged from 1 (no cognates) to 7 (all language families cognate) with a mean of 2.3 ± 1.1. The distribution of cognate class size was positively skewed (many more small groups than large ones), as predicted by their hypothesis of variant decay rates.
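The idea of a cognate class size and a positively skewed distribution can be illustrated with a toy calculation. This is only a sketch: the judgements below are invented for illustration and are not taken from Pagel et al. or from LWED, and taking the class size to be the largest set of families sharing a cognate is one simple reading of the description above. All names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

# The seven families used by Pagel et al., in a fixed order.
FAMILIES = ("Altaic", "Chukchi-Kamchatkan", "Dravidian", "Indo-European",
            "Inuit-Yupik", "Kartvelian", "Uralic")

# HYPOTHETICAL data: one string per word meaning, one letter per family
# (same order as FAMILIES). Families sharing a letter are judged cognate.
TOY_JUDGEMENTS = {
    "meaning_1": "ABCDEFG",
    "meaning_2": "AABCDEF",
    "meaning_3": "ABCDEFG",
    "meaning_4": "AABBCDE",
    "meaning_5": "AAABCDE",
    "meaning_6": "AAAAABC",
    "meaning_7": "ABACDEF",
    "meaning_8": "ABCDEFG",
}


def cognate_class_size(labels: str) -> int:
    """Size of the largest group of families sharing a cognate label."""
    return Counter(labels).most_common(1)[0][1]


def skewness(values: list) -> float:
    """Population skewness; a positive value means a long right tail."""
    m, s = mean(values), pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)


if __name__ == "__main__":
    sizes = [cognate_class_size(s) for s in TOY_JUDGEMENTS.values()]
    print("class sizes:", sizes)
    print("histogram:", dict(sorted(Counter(sizes).items())))
    print(f"mean = {mean(sizes):.3f}, skewness = {skewness(sizes):.3f}")
```

In this toy set most meanings have a class size of 1 or 2 and only one reaches 5, so the skewness is positive, the same qualitative shape the authors report for their 188 meanings.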

Twenty-three word meanings had cognate class sizes of four or more. The authors write "Our ability to predict these words independently of their sound correspondences dilutes the usual criticisms leveled at such long-range linguistic reconstructions, that proto-words are unreliable or inaccurate, or that apparent phonetic similarities among them reflect chance sound resemblances." On the first point, they argue that inaccurate reconstructions should weaken, not enhance, the signals. On the second, they argue that chance resemblances should be equally common across all word usage frequencies, in contrast to what the data shows.

The team then created a Markov chain Monte Carlo simulation to estimate and date the phylogenetic trees of the seven language families under examination. Five separate runs produced the same (unrooted) tree, with three sets of language families: an eastern grouping of Altaic, Inuit–Yupik, and Chukchi–Kamchatkan; a central and southern Asia grouping of Kartvelian and Dravidian; and a northern and western European grouping of Indo-European and Uralic. Two rootings of this tree were then considered. The first roots the tree to the midpoint of the branch leading to proto-Dravidian and yields an estimated origin for Eurasiatic of 14450 ± 1750 years ago. The second roots the tree to the proto-Kartvelian branch and yields 15610 ± 2290 years ago. Internal nodes have less certainty, but exceed chance expectations, and do not affect the top-level age estimate. The authors conclude "All inferred ages must be treated with caution but our estimates are consistent with proposals linking the near concomitant spread of the language families that comprise this group to the retreat of glaciers in Eurasia at the end of the last ice age ~15,000 years ago."

Writing on the University of Pennsylvania blog Language Log, Sarah Thomason questions the accuracy of the LWED (Languages of the World Etymological Database) data on which the paper was based. She notes that LWED lists multiple possible proto-word reconstructions for most words, increasing the possibility of chance matches. Pagel et al. anticipated this criticism and state that since infrequently used words generally have more proposed reconstructions, such errors should "produce a bias in the opposite direction" of what the statistics actually show (i.e. that infrequently used words should have larger cognate groups if chance alone was the source). Thomason also argues that since the LWED is contributed to primarily by believers in Nostratic, a proposed superfamily even broader than Eurasiatic, the data is likely to be biased towards proto-words that can be judged cognate.

Pagel et al. also examined two other possible objections to their conclusions. They rule out linguistic borrowing as a significant factor in the results, on the basis that for a word to appear cognate in many language families solely because of borrowing would require frequent swapping back and forth. This is deemed unlikely because of the large geographical area covered by the language groups and because frequently used words are the least likely to be borrowed in modern times.

Classification

According to Greenberg, the language family that Eurasiatic is most closely connected to is Amerind. He states that "the Eurasiatic-Amerind family represents a relatively recent expansion (circa 15,000 years ago) into territory opened up by the melting of the Arctic ice cap". In contrast, "Eurasiatic-Amerind stands apart from the other families of the Old World, among which the differences are much greater and represent deeper chronological groupings". Like Eurasiatic, Amerind is not a generally accepted proposal.

Eurasiatic and another proposed macrofamily, Nostratic, often include many of the same language families. Vladislav Illich-Svitych's Nostratic dictionary did not include the smaller Siberian language families listed in Eurasiatic, but this was only because protolanguages had not been reconstructed for them; Nostraticists have not attempted to exclude these languages from Nostratic. Many Nostratic theorists have accepted Eurasiatic as a subgroup within Nostratic alongside Afroasiatic, Kartvelian, and Dravidian. LWED likewise views Eurasiatic as a subfamily of Nostratic.

Subdivisions

Figure: Eurasiatic family tree in order of first attestation.

The subdivision of Eurasiatic varies by proposal, but usually includes Turkic, Tungusic, Mongolic, Chukchi-Kamchatkan, Eskimo–Aleut, Indo-European, and Uralic.

Greenberg enumerates eight branches of Eurasiatic, as follows: Altaic [Turkic, Mongolic, Tungusic], Chukchi-Kamchatkan, Eskimo–Aleut, Etruscan, Indo-European, "Korean-Japanese-Ainu", Nivkh, and Uralic–Yukaghir. He then breaks these families into smaller sub-groups, some of which are themselves not widely accepted as phylogenetic groupings.

Pagel et al. use a slightly different branching, listing seven language families: Altaic [Turkic, Mongolic, Tungusic], Chukchi-Kamchatkan, Dravidian, "Inuit-Yupik" (a name given to the LWED grouping of Inuit (Eskimo) languages that does not include Aleut), Indo-European, Kartvelian, and Uralic.

Regardless of version, these lists cover the languages spoken in most of Europe, Central and Northern Asia and (in the case of Eskimo-Aleut) on either side of the Bering Strait.

The branching of Eurasiatic is roughly (following Greenberg):

Eurasiatic
  Indo-European (unity undisputed)
  Uralic–Yukaghir (hypothetical)
    Uralic (unity undisputed)
    Yukaghir (unity undisputed)
  Nivkh (unity undisputed)
  Chukotko-Kamchatkan (unity undisputed)
  Eskaleut (unity undisputed)
  Altaic (controversial)
    Turkic (unity undisputed)
    Mongolic (unity undisputed)
    Tungusic (unity undisputed)
  Korean–Japanese–Ainu (hypothetical)
    Koreanic (unity undisputed)
    Japonic (unity undisputed)
    Ainu (unity undisputed)
  Tyrsenian (grouping of three closely related extinct languages; their affiliation with Eurasiatic, based primarily on "mi" first person singular, is highly speculative given lack of attestation)

Jäger (2015)

A computational phylogenetic analysis by Jäger (2015) provided the following phylogeny of language families in Eurasia:

Roots

Ruhlen presents the following roots for Eurasiatic: kʷi (who?), mi (what?), pälä (two), akʷā (water), tik (one or finger), konV (arm 1), bhāghu(s) (arm 2), bük(ä) (bend or knee), punče (hair), p'ut'V (vagina or vulva), snā (smell or nose), kamu (seize or squeeze), and parV (the verb to fly). Other such proposed words include the Indo-European *tu 'you', hypothesized proto-Altaic *ti 'you' and turi 'you' in proto-Chukchi-Kamchatkan.

In 1994 Merritt Ruhlen claimed Eurasiatic is supported by the existence of a grammatical pattern "whereby plurals of nouns are formed by suffixing -t to the noun root ... whereas duals of nouns are formed by suffixing -k." Rasmus Rask noted this grammatical pattern in the groups now called Uralic and Eskimo–Aleut as early as 1818, but it can also be found in Tungusic, Nivkh (also called Gilyak) and Chukchi–Kamchatkan—all of which Greenberg placed in Eurasiatic. According to Ruhlen, this pattern is not found in language families or languages outside Eurasiatic.

The existence of a Dené–Caucasian family is disputed or rejected by most linguists, including Lyle Campbell, Ives Goddard, and Larry Trask.

The last common ancestor of Eurasiatic was estimated by phylogenetic analysis of ultraconserved words at roughly 15,000 years old, suggesting that these languages spread from a "refuge" area at the Last Glacial Maximum.