Overview of World Languages 2016
Languages of the world (August 2016 draft for Oxford Research Encyclopedia of
Linguistics)
1 author:
W R Leben
Stanford University
All content following this page was uploaded by W R Leben on 08 August 2016.
Summary
About 7,000 languages are spoken around the world today. The actual number depends on where
the line is drawn between language and dialect—an arbitrary decision because languages are
always in flux. But specialists applying a reasonably uniform criterion across the globe count
well over 2,000 languages each in Asia and Africa, while Europe has just shy of 300. In between are
the Pacific region, with over 1,300 languages, and the Americas, with just over 1,000.
Languages spoken natively by over a million speakers number around 250, but the vast majority
have very few speakers. Something like half are thought likely to disappear over the next few
decades, as speakers of endangered languages turn to more widely spoken ones.
The languages of the world are grouped into 430 language families, based on their origin, as
determined by comparing similarities among languages and deducing how they evolved from
earlier ones. As with languages, there’s quite a lot of disagreement about the number of language
families, reflecting our inability to know for sure how closely one group of languages is related
to another, due to our meager knowledge of many present-day languages and even sparser
knowledge of their history. The figure 430 comes from [Link], which actually lists them
all. While the world’s language families may well go back to a smaller number of original
languages, even to a single mother tongue, scholars disagree on how far back current methods
permit us to trace the history of languages.
While it is normal for languages to borrow from other languages, occasionally a totally new
language is created by mixing elements of two distinct languages to such a degree that we would
not want to identify one of the source languages as the mother tongue. This is what led to the
development of Media Lengua, a language of Ecuador formed through contact among speakers
of Spanish and speakers of Quechua. In this language practically all the word stems are from
Spanish, while all of the endings are from Quechua. Just a handful of languages have come into
being in this way, but less extreme forms of language mixture have resulted in over a hundred
pidgins and creoles currently spoken in many parts of the world. Most arose during Europe’s
colonial era, when European colonists used their language to communicate with local
inhabitants, who in turn blended vocabulary from the European language with grammar largely
from their native language.
Also among the languages of the world are about 300 sign languages used mainly in
communicating with the deaf. The structure of sign languages typically has little historical
connection to the structure of nearby spoken languages.
Some languages have been constructed expressly, often by a single individual, to meet
communication demands among speakers with no common language. Esperanto, designed to
serve as a universal language and used as a second language by some two million, according to
some estimates, is the prime example, but it is only one among several hundred would-be
international auxiliary languages.
This essay surveys the languages of the world continent by continent, ending with descriptions of
sign languages and of pidgins and creoles. Each section ends with a set of references. The main
source for data on language classification, numbers of languages, and speakers is the nineteenth
edition of Ethnologue ([Link]), except where a different source is cited.
1. Europe
1.1 Indo-European
Most of Europe’s languages belong to the Indo-European family, which has the following
branches: Celtic, Germanic, Italic, Greek, Albanian, Balto-Slavic, Armenian, Indo-Iranian,
Anatolian, and Tocharian.
1.1.1 Celtic
Celtic, which extended across much of Europe as far east as present-day Turkey 2,000 years ago,
has undergone gradual contraction since the ascendance of the Romans in Europe, and with the
spread of English and French the Celtic languages are now confined to parts of Britain, Ireland,
and western France. The two main branches of modern Celtic are Brythonic and Goidelic. In the
Brythonic branch are Welsh, Cornish, and Breton; the Goidelic branch includes Irish, Scottish
Gaelic, and Manx.
Gaulish, a third branch, went extinct but has recently undergone restoration attempts, as have
Manx and Cornish, which also were extinct. In fact, all present-day Celtic languages have seen
revitalization efforts. This is happening even with Welsh—hardly an endangered language with
562,000 speakers in the 2011 census. Currently, Wales has school programs aimed at getting a
greater proportion of ethnic Welsh, who number nearly 2,400,000, to learn to speak the
language. The same is happening with Breton, spoken by over 200,000 in Brittany in
northwestern France, but “no longer exclusively, predominately, or even commonly used by the
population in any city, town, or village in Brittany,” according to Adkins (2013). As in Wales,
school programs in Brittany since at least the 1970s have aimed to get young people speaking a
variety of their ethnic tongue.
1.1.2 Germanic
Germanic’s two branches, North and West, were once grouped into a superbranch called
Northwest Germanic, which was paired with the Gothic branch. Gothic went extinct, largely in
the Middle Ages, though isolated traces of Crimean Gothic remained until the late eighteenth
century. The North Germanic languages are Swedish, Danish, Norwegian, Icelandic, and
Faroese. West Germanic includes English, German, and Dutch. Each of these is paired with
a sister language that is also spoken by significant numbers. English is paired with Western
Frisian, Dutch with Afrikaans, and German with Yiddish.
1.1.3 Italic
This is the ancestral branch of the modern Romance languages, all descended from a
colloquial form of Latin. About 2500 years ago, the Italic branch included not just Latin but
also Oscan, Umbrian, and Faliscan, but these languages have no modern descendants. The
modern descendants of Latin include French, Catalan, Spanish, Portuguese, Italian, Romanian,
Sardinian, Romansch, Ladin, Friulian, Occitan, and Judeo-Spanish.
1.1.5 Balto-Slavic
This group has Baltic and Slavic subbranches. The official languages of Baltic countries
Lithuania and Latvia make up the Baltic subbranch. Slavic has three divisions: Eastern
(Russian, Ukrainian, and Belarusian), Southern (Serbo-Croatian, Macedonian, Slovenian,
and Bulgarian), and Western (Polish, Czech, Slovak, and Sorbian).
1.1.6 Indo-Iranian
The languages of this branch are spoken in Asia. See section 3.1.
1.1.7 Armenian
Like Greek and Albanian, the Armenian branch has just one language, with a major division
between Eastern and Western dialects. The standard language of Armenia is in the Eastern
Armenian group, which also includes the dialects of Armenian communities in Iran, Russia,
Georgia, and their environs. Texts from Armenian Cilicia from the eleventh to the
fourteenth centuries CE are the first to show a differentiated Western dialect. Many dialects
of Western Armenian were obliterated by the Armenian genocide, but the Western
Armenian standard and its dialects are found in Turkey (especially Istanbul), the Levant, and
émigré communities in the West. Armenian is of special interest to linguists because of
retentions from Indo-European, notably all seven of its noun cases and the irregular
retention of initial laryngeals.
1.2 Uralic
Three important languages in this family are Finnish, Estonian, and Hungarian. These three
were once grouped into a branch called Finno-Ugric. But while Finnish and Estonian are
closely related members of the Finnic branch of Uralic, Hungarian is a separate Uralic entry,
and Finno-Ugric is no longer regarded as a branch. The remaining languages of Uralic are
small languages spoken in northern parts of Europe and Asia. One branch, Sami, has ten
small languages, each one a variant of Saami. By far the largest is Lule Saami (pejoratively
called Lapp), with close to 2,000 speakers, mostly in Sweden.
1.4 Basque
Basque is an isolate spoken in the Western Pyrenees by about a million, some in France but
most in Spain. Its history is widely thought to go back several millennia, antedating the more
recent Indo-European migrations to the region. There have been attempts to identify Basque with a
wide variety of groups, including Kartvelian, Afro-Asiatic, and Iberian, but without attracting
much support. Recent DNA evidence reinforces the notion of Basque descent from an ancient
population of farmers and hunters (Günther et al. 2015).
1.5 Turkish
Turkish, a language of Europe and Asia, belongs to the Turkic group, described in the
section on Asia.
1.6 References
Adkins, M. (2013). Will the real Breton please stand up? Language revitalization and the problem
of authentic language. International Journal of the Sociology of Language 2013(223): 55-70.
Günther, T., Valdiosera, C., Malmström, H., Ureña, I., Rodriguez-Varela, R., Sverrisdóttir, Ó. O.,
& de Castro, J. M. B. (2015). Ancient genomes link early farmers from Atapuerca in Spain to
modern-day Basques. Proceedings of the National Academy of Sciences, 112(38), 11917-11922.
2. Africa
Africa’s extraordinary linguistic diversity is under threat: some predict that half or more of its
languages will go extinct by the end of the century due to competition from other
languages. The current count exceeds 2,000 languages, grouped into just a few families.
The most revolutionary aspects of Greenberg’s (1955, 1963) classification of African language
families largely stand today, though with many adjustments by later experts in the different
languages. Many other questions still remain open. For example, Greenberg recognized Khoisan
as a family, but today’s scholars tend to set a higher bar for establishing genetic relationships,
leading many to defer judgment on whether this truly is a family. The unity of Nilo-Saharan is
occasionally called into question, but the detailed comparative work of Bender (1996-7) and Ehret
(2001) has gained this family wide acceptance as a valid genetic unit. Niger-Congo and Afro-
Asiatic remain uncontested as genetic units.
For Afro-Asiatic, there are debates about subgrouping. For example, do Semitic, Berber, and
Cushitic together form a separate branch, as Bender (1997) contends? Within Cushitic,
Greenberg’s classification included Omotic, which many now regard as a distinct branch, while
Glottolog fails to recognize Omotic as an established group at all. Within Niger-Congo, there are
a number of unanswered questions, many revolving around the constituency of its most complex
branch, Benue-Congo, which uncontroversially includes all the Bantu languages and many more.
Among the changes, the Kwa languages are now reduced to what Greenberg called Western
Kwa, and the remaining languages have been moved from Greenberg’s Kwa into distinct
branches, including Yoruboid as a major example. Ijoid is quite possibly not in Benue-Congo,
where Greenberg placed it, but is instead a sister branch. For details and references, see Bendor-
Samuel & Hartell (1989) and the references in Nordhoff et al. (2013).
2.1 Afro-Asiatic
This is the northernmost family, with 376 languages spanning all of North Africa and the Middle
East, as well as two smaller areas of sub-Saharan Africa. The six branches of Afro-Asiatic are
Semitic, Berber, Chadic, Cushitic, Omotic, and Egyptian. The Semitic branch has seventy-eight
languages, including Arabic, the first language of up to 300 million throughout North Africa and
widely spoken in the Middle East. Among the world’s languages, Arabic ranks fourth in the
number of speakers. Other important Semitic languages are Hebrew, which shares official status
in Israel with Arabic, and several Ethiopic languages. Amharic, the official language of Ethiopia
and the first language of 21 million, is a South Ethiopic language. In the North Ethiopic branch is
Tigrigna, an official language of Eritrea spoken by 7 million.
The term Afro-Asiatic was used by Joseph Greenberg to replace the designation Hamito-Semitic,
which posited a division between the Semitic branch (named for Biblical figure Shem) and a
putative branch named for Biblical figure Ham. The notion that Hamitic languages formed a
unified branch seemingly reflected factors like speakers’ typical occupations and a lighter skin
color than black Africans to the south. Greenberg argued that extraneous factors like these had
no place in language classification, which should be based solely on linguistic data. Comparing
languages from the different groups classed as Hamitic, Greenberg concluded that the evidence
did not support their grouping into a single branch.
The Berber branch of Afro-Asiatic is spoken in the foothills of the Atlas Mountains in Morocco
and Algeria and, spottily, in neighboring countries. Cushitic gets its name from Cush, the son of
Ham. The forty-five languages of this group are spoken mainly in Ethiopia and Somalia, with a
few in Kenya and Tanzania. Chadic languages are mainly spoken in the countries surrounding
Lake Chad and are dominant in northern Nigeria, numbering close to 200 in all. By far the most
widely spoken is Hausa, with twenty-five million native speakers. The thirty-one languages of
the Omotic branch are all spoken in southwestern Ethiopia. The Egyptian branch, thanks to
hieroglyphs, can be traced back before 3000 B.C. Ancient Egyptian was the ancestor of Coptic,
which was spoken in Egypt but was gradually replaced by Arabic until it died out as a spoken
language roughly 400 years ago. Since then Coptic has survived as a liturgical language.
2.2 Nilo-Saharan
The 205 languages of the Nilo-Saharan family occupy a band extending from the Sahara desert
to the Nile region. For a relatively small family, they are quite diverse typologically, and as
already noted, some doubt whether the Nilotic and Saharan branches really deserve to be
grouped into a family. Reflecting this, Glottolog divides them into two separate families, Nilotic
and Saharan.
2.3 Niger-Congo
The great majority of languages in sub-Saharan Africa are members of the Niger-Congo
family. Its 1,538 languages make it the world’s largest language family, and only the Indo-
European and Sino-Tibetan language families have more speakers than Niger-Congo. Ideas
about the respective genetic affiliations of well-known groups within Niger-Congo have changed
substantially over the last half-century. This has been the case with Kwa, Mande, Gur, Atlantic,
and Benue-Congo, among others. To date, the truly remarkable event in the classification of this
family remains Greenberg’s (1955, 1963) demonstration that Bantu, a group of 538 languages
covering most of Central and Southern Africa, was, along with other languages called Bantoid, a
subgroup within a group now called East Benue-Congo, most of whose other languages are
spoken in Nigeria and Cameroon. This discovery—which took ten years before gaining the wide
acceptance it has today—not only challenged earlier assumptions about linguistic classification
but also opened the door to hypotheses about Bantu origins. The currently accepted view is that
Bantu originated in southeastern Nigeria and expanded east and south from there.
2.4 Khoisan
Among the languages of the world, some are poorly studied and go back so far in time that it is
hard to trace their genetic origins. This is the case with Khoisan, which is generally not
recognized as an established family but as a set of twenty-seven languages—some with just a
handful of speakers—that are likely not to belong to the other three established families of
African languages. Ermisch (2008) presents what is known, along with the residual problems.
2.5 Austronesian
Off the southeastern coast of Africa is the island of Madagascar, home to Malagasy, a Malayo-
Polynesian language brought over by the island’s earliest settlers over 2,000 years ago. For more
on Malayo-Polynesian, see the Austronesian subsection (4.1) in the section on Oceania.
2.6 References
Bender, M. L. (1996-7). Nilo-Saharan languages: An essay in classification. (Lincom
Handbooks in Linguistics). Munich: Lincom Europa.
Bendor-Samuel, J. T., & Hartell, R. L. (1989). The Niger-Congo languages: A classification and
description of Africa's largest language family. University Press of America.
Ermisch, S. (Ed.) (2008). Khoisan languages and linguistics: Proceedings of the 2nd
International Symposium, January 8–12, 2006. Cologne: Rüdiger Köppe.
Greenberg, J. H. (1955). Studies in African language classification. New Haven, CT: Compass.
Nordhoff, S., Hammarström, H., Forkel, R., & Haspelmath, M. (Eds.) (2013). Benue–
Congo. Glottolog. Jena: Max Planck Institute for the Science of Human History. (Accessed
online at [Link] on 8/4/2016.)
3. Asia
Asia is home to 60% of the world’s population and nearly 30% of the world’s languages. These
group into just a handful of major families, leaving out several important isolates, and due to
long periods of contact, there’s less diversity than one might expect. The downside is that the
contact situation has made it difficult to classify genetic relationships with certainty in some
important cases. And it’s worth mentioning some areal features for various subregions:
3.1 Indo-Iranian
Indo-Iranian is not a family but a branch of Indo-European, whose other branches were
listed in the section on Europe. Among Indo-Iranian languages, Hindi and Urdu are official
languages of India and Pakistan, respectively, and many consider them dialects of a single
language. Kachru’s (2008) linguistic sketch describes Hindi and Urdu as closely related,
mentioning the special case of Hindustani, an essentially colloquial language that has been called
a co-dialect of Hindi and Urdu. Hindustani is the language once promoted by Gandhi and the
Indian National Congress as a tool of national unity. For the Hindustani controversy, see
Kachru (2008).
3.2 Turkic
The forty-one languages of this family extend from Macedonia to Siberia, Central Asia, and
western China. Despite the vastness of this area, the languages themselves are typologically
quite similar: agglutinative, with vowel harmony involving both backness and rounding.
3.3 Mongolic
The thirteen Mongolic languages are spoken in Mongolia and in adjacent areas of the
Russian Federation and China. Mongolian, with over six million speakers, is by far the
largest language in the family and the official language both of Mongolia and of the Inner
Mongolian Autonomous Region of China.
3.4 Tungusic
The eleven languages of this family are scattered through Siberia, the Far East of Russia,
and northwestern China, but most are endangered and some are nearly extinct. That includes
Manchu, the language of the founders of the Qing Dynasty, which ruled China for nearly
three centuries up to 1912. The 2016 edition of Ethnologue lists only twenty speakers for
Manchu, though over ten million are ethnically Manchu.
The Altaic hypothesis has inspired a good deal of writing. Here is some relevant work on
various sides of the question:
Georg, S., Michalove, P. A., Manaster Ramer, A., & Sidwell, J. (1999). Telling general
linguists about Altaic. Journal of Linguistics 35: 65–98.
Greenberg, J. H. (2002). Indo-European and its closest relatives: The Eurasiatic language
family, vol. 2: Lexicon. Stanford: Stanford University Press.
Greenberg, J. H. (2000). Indo-European and its closest relatives: The Eurasiatic language
family, vol. 1: Grammar. Stanford: Stanford University Press.
Greenberg, J. H. (1997). Does Altaic exist? In: I. Hegedus, P.A. Michalove, & A. Manaster
Ramer (Eds.), Indo-European, Nostratic and beyond: A festschrift for Vitaly V. Shevoroshkin,
(pp. 88-93). Washington, DC: Institute for the Study of Man. Reprinted in J. H. Greenberg
(2005). Genetic Linguistics. Oxford: Oxford University Press (pp. 325–330).
Miller, R. A. (1991). Genetic connections among the Altaic languages. In: S. M. Lamb and E.
D. Mitchell (Eds.), Sprung from some common source: Investigations into the prehistory of
languages. Stanford: Stanford University Press.
Starostin, S. A., Dybo, A. V., & Mudrak, O. A. (2003). Etymological dictionary of the
Altaic languages, 3 volumes. Leiden: Brill.
Unger, J. M. (1990). Summary report of the Altaic panel. In: P. Baldi (Ed.) Linguistic change
and reconstruction methodology. Berlin: Mouton de Gruyter.
3.6 Dravidian
Dravidian languages are spoken primarily in southern India, though some are also found
further north in the Indian subcontinent. The major literary languages are Tamil, Malayalam,
Kannada, and Telugu, each one the first language of tens of millions. We are able to trace
the history of Dravidian better than that of many other language families thanks to the long
literary traditions of these four languages.
Questions have been raised about Dravidian similarities to Uralic and Altaic, among several
others. Austerlitz (1971) dismissed these, and Krishnamurti (2003), briefly surveying
archeological and DNA literature along with linguistic evidence in his foundational work on
Dravidian, seconds the conclusion that the linguistic arguments behind the proposed genetic
relationships are tenuous and speculative.
For Dravidian morphology and word order, Krishnamurti’s (2003: 6) brief statement
captures the basic pattern:
… mostly agglutinative in type, but without the elaborate chains of affixes found in, say,
Turkish. Some languages have also developed a number of fusional traits. The word order
is relatively fixed, usually SOV. Some Dravidian languages exhibit a three-way contrast
in coronal stops: dental, alveolar, and retroflex. Dravidian is the likely source of the
retroflex consonants of Sanskrit.
3.7 Sino-Tibetan
The languages of this family are spoken in China, the Himalayas, and Burma. The division
into Chinese and Tibeto-Burman branches is customary, as espoused by Matisoff (2003), though
a few experts, including van Driem (2007), still question the grouping of Sinitic as a separate
sister branch to Tibeto-Burman, along with many particulars. Tibeto-Burman, with 441
languages, is especially problematic because of the inaccessibility of many of its languages in the
Himalayas, not to mention that van Driem (2015: 141) finds them “endangered with imminent
extinction.” Overall, the lower-level groupings within Tibeto-Burman are more certain than the
higher-level ones, leading van Driem (2001) to posit a “Fallen Leaves” model that recognizes
clumps of closely related languages without identifying where on the family tree they fell from.
Still, Ethnologue offers a full family tree. Sino-Tibetan was at one time thought to include
languages farther south, such as the Tai-Kadai languages and the Hmong-Mien (Miao-Yao)
languages, but the similarities among these languages are probably better attributed to areal
diffusion, including massive lexical borrowing from Chinese.
3.7.1 Chinese
Member languages of the Chinese (or Sinitic) branch are sometimes called dialects,
especially in China, but this stretches the normal meaning of the term “dialect” too far, since
the fourteen languages that make up Chinese are far from mutually intelligible, even though
they share the same writing system and many grammatical properties. Each of the fourteen
Chinese languages of course has dialects. Ethnologue lists five major dialects for Mandarin
(which also goes by the name Guanhua): Huabei Guanhua (Northern Mandarin), Xibei
Guanhua (Northwestern Mandarin), Xinan Guanhua (Southwestern Mandarin), Jinghuai
Guanhua (Eastern Mandarin), and Jiangxia Guanhua (Lower Yangtze Mandarin). Other sources
divide the dialects differently, due not only to differences of linguistic and geographical criteria
but also to centuries of diffusion of linguistic features. For discussion, see Kurpaska (2010) and
Yan (2006). With over a billion speakers total, Mandarin’s dialects have many subdialects as
well.
Linguistic diffusion is the general pattern in the historical development of Chinese, due to over a
dozen massive population movements going back to the seventh century BCE and continuing to
the present, each migration involving hundreds of thousands and often millions of people.
Complicating these scenarios is the fact that in most cases, the migrations were to areas already
settled by speakers of Chinese or other languages, often resulting in language mixture. The
history of these migrations and their linguistic effects is traced by LaPolla (2001).
3.7.2 Tibeto-Burman
As already noted, most of the 441 languages of this branch are endangered. As a group, they
have many linguistic traits in common, including SOV order and agglutinative verb structure.
Two word order exceptions are the Karenic languages (Myanmar) and Bai (China), which have
the SVO order characteristic of Sinitic, though unlike Sinitic, Karen and Bai are also relatively
agglutinative. Karen and Bai both stand out enough from the rest of Tibeto-Burman to inspire
attempts to classify them outside of Tibeto-Burman proper. Benedict’s (1976) proposed sister to
Sinitic, labeled Tibeto-Karenic, with Tibeto-Burman as a daughter, has been ruled out, while
more recently several scholars have taken up the case for linking Bai with Sinitic. See Wang
(2005) for a brief survey with references.
3.8 Austro-Asiatic
The Austro-Asiatic family extends across south Asia from India to Vietnam. The Munda
branch is found in Northeastern India, surrounded by Indo-European and Dravidian
languages which have influenced them greatly over the ages. Typologically they are
agglutinative, with SOV word order. The Munda languages are typologically very different
from the other major branch of Austro-Asiatic, Mon-Khmer, which includes two important
Ket, an isolate in Central Siberia with 210 speakers, is unlike the rest of Paleosiberian in several
respects. It is tonal and has a highly agglutinative verbal system with complex agreement patterns—
features that make it look like Na-Dene in North America. The case for a genetic relationship
between the two has been made by Vajda (2010, 2011). For arguments pro and con, see Kari &
Potter (2010), Campbell (2011) and Kiparsky (2014: 65-67). Implications of this finding for
Beringian migrations are pursued by Sicoli & Holton (2014).
3.12 References
Austerlitz, R. (1971). Long-range comparisons of Tamil and Dravidian with other
language. In: R. E. Asher (Ed.), Proceedings of the Second International Conference-Seminar
of Tamil Studies. Vol. 2. (pp. 254-61). Madras: Association of Tamil Research.
Benedict, P. K. (1976). Sino-Tibetan: Another look. Journal of the American Oriental Society
96(2): 167–197.
Campbell, Lyle. (2011). Review of The Dene-Yeniseian connection (Kari and Potter).
International Journal of American Linguistics 77: 445-451.
Kiparsky, P. (2014). New perspectives in historical linguistics. In: C. Bowern & B. Evans
(Eds.), The Routledge handbook of historical linguistics (pp. 64-102). New York: Routledge.
Kurpaska, M. (2010). Chinese language(s): A look through the prism of the great dictionary of
modern Chinese dialects. Vol. 215. Berlin: de Gruyter.
LaPolla, R. J. (2001). The role of migration and language contact in the development of the Sino-
Tibetan language family. In: A. Y. Aikhenvald & R. M. W. Dixon (Eds.), Areal diffusion and
genetic inheritance: Case studies in language change (pp. 225-254). Oxford: Oxford University
Press.
Matisoff, J. A. (1991). Sino-Tibetan linguistics: Present state and future prospects. Annual
Review of Anthropology 20: 469–504.
Rai, A. (1984). A house divided: The origin and development of Hindi/Hindavi. New Delhi:
Oxford University Press.
Sicoli, M. A., & Holton, G. (2014). Linguistic phylogenies support back-migration from Beringia
to Asia. PLoS ONE 9(3): e91722.
Thurgood, G, & LaPolla, R. J. (Eds.). (2003). The Sino-Tibetan languages. London: Routledge.
Ting Pang-Hsin [Ding Bangxin], & Hóngkai Sun. (2000). Hàn-Zàngyu yánjiu de lìshi huígù
[Retrospective history of Sino-Tibetan studies]. Hàn-Zàngyu tóngyuáncí yánjiu [Cognate
words in Sino-Tibetan languages], 1. Nanning: Guangxi Mínzú Chubanshè [Guangxi
Nationalities Press].
Vajda, E. J. (2010). Siberian Link with Na-Dene Languages. In: J. Kari & B. Potter (Eds.).
Anthropological Papers of the University of Alaska. (New series, Special issue) 5.1: 33-99.
van Driem, G. (2001). Languages of the Himalayas: An ethnolinguistic handbook of the Greater
Himalayan region, containing an introduction to the Symbiotic Theory of Language (2 vols.).
Leiden: Brill.
van Driem, G. (2015). Tibeto-Burman. In W. S-Y. Wang & C. Sun (Eds.) The Oxford handbook
of Chinese linguistics (pp. 135-148). New York: Oxford University Press.
Wang, Feng. (2005). On the genetic position of the Bai language. Cahiers de Linguistique Asie
Orientale, 34(1): 101-127.
4. Oceania
Oceania, which includes Australia and most of the island territories of the central and southern
Pacific and Indian oceans, is home to the Austronesian family and to two very large language
groups, the Australian and the Papuan groups.
4.1 Austronesian
The 1250+ languages of this family are distributed across Oceania from Madagascar to the
Pacific Islands and total well over 350 million speakers. All but twenty-five of these languages
are Malayo-Polynesian; the rest are aboriginal languages of Taiwan.
The dominant category, Central-Eastern Malayo-Polynesian, has well over half of the languages
classified as Malayo-Polynesian but only a few million speakers total and is not generally
accepted as a valid linguistic grouping. The remaining Malayo-Polynesian languages are found
in seventeen smaller groups, some of whose languages are widely spoken and highly important
politically. Among these are:
Blust (2013) offers a recent and comprehensive account of the linguistic and anthropological
aspects of this family, including internal linguistic groupings, the linguistic structure of its
languages, sociolinguistic considerations, and archeological evidence backing up the linguistic
groupings. Adelaar & Himmelmann (2005) cover a similar range of topics.
4.3 Australia
This continent has been inhabited for 50,000 years, but the time frame for language classification
is limited to just the last 5,000 or so. As a result, we know very little about the historical
connections among Australia’s languages. Worse, the number of vigorous Aboriginal languages
today is a fraction of what it was before Europeans settled there in the eighteenth century. Of the
250-odd languages of Australia in 1788, more than half are extinct, and of the remainder, fewer
than two dozen are used and learned by the youngest generation.
Beginning with Hale (1966), many sources divide the continent’s original languages into two
groups, Pama-Nyungan and Non-Pama-Nyungan, but even this rudimentary grouping is
complicated by large-scale phonological and grammatical diffusion. Dixon, author of many
standard reference works on Australian languages, among them Dixon (2002), diverges from the
others by simply dividing the languages into fifty groups representing different areas, though
among them some genetic clusters may be found. For Dixon, Pama-Nyungan “cannot be
supported as a genetic group. Nor is it a useful typological grouping.” (Dixon 2002: 53). The
problem with applying standard methods toward reconstructing a language tree for Australia, as
Dixon sees it, is that Australia is unique, in part due to widespread diffusion, whereby a language
“will tend to become more like its neighbors” (Dixon 2002: 448). For alternative studies from a
vantage point that differs markedly, see Bowern & Koch (2004).
4.4 References
Adelaar, K. A., & Himmelmann, N. (2005). The Austronesian languages of Asia and
Madagascar. Routledge Language Family Series. New York: Routledge.
Bowern, C., & H. Koch. (Eds.) (2004). Australian languages: Classification and the
comparative method. Amsterdam: John Benjamins.
Foley, W. A. (1986). The Papuan languages of New Guinea. Cambridge University Press.
Foley, W. A. (2000). The languages of New Guinea. Annual Review of Anthropology 29: 357–404.
Hale, K. L. (1966). The Paman group of the Pama-Nyungan phylic family. Appendix to G. N.
O’Grady, C. F. Voegelin, & F. M. Voegelin, Languages of the world: Indo-Pacific. Fascicle
6. Anthropological Linguistics 8.2: 162–197.
Mushin, I., & Baker, B. (Eds.) (2008). Discourse and grammar in Australian languages. Vol.
104. Amsterdam: Benjamins.
Ross, Malcolm. (2005). Pronouns as a preliminary diagnostic for grouping Papuan languages. In
A. Pawley, R. Attenborough, R. Hide, & J. Golson (Eds.), Papuan pasts: Cultural, linguistic
and biological histories of Papuan-speaking peoples (pp. 15–65). Canberra: Pacific Linguistics.
5. The Americas
The past and present states of indigenous languages in the Americas are entirely different as a
result of colonization by Europeans. North America is estimated to have been host at one time to
nearly 300 distinct languages (Mithun 1999: 1). Since then, over a hundred have gone extinct,
and practically all of the rest are endangered. The 2010 U.S. Census Bureau report found 169
Native North American languages to be spoken in the home, with a total speaking population of
less than half a million. By far the largest is Navajo, with nearly 170,000. In second and third
place with roughly 19,000 each are Yupik and Dakota.
Central and South America are home to a few much larger languages, spoken by several million.
Still, language endangerment is also the rule there. Of perhaps 1,700 pre-Columbian languages,
fewer than 700 remain (Campbell 1997) and of these most are spoken by populations of several
thousand or fewer.
The languages of the Americas are often divided into three geographical areas: North America,
Mexico and Central America, and South America. Greenberg’s (1987) classification grouped the
languages into three “super-families” that he called Eskimo-Aleut, Na-Dene, and Amerind. Of
these, the most controversial is Amerind, though there is some physical evidence for this
grouping (Cavalli-Sforza). But the grouping has been widely contested, for reasons summarized
by Campbell (2012: 19), referring to Paul Rivet, who worked on a classification of South
American languages in the first half of the twentieth century: “Greenberg’s subgroups have been
met with skepticism for a number of reasons, including the underanalyzed nature of the
presented data, the perpetuation of old misunderstandings [ …], and the fact that recent findings
may suggest entirely different groupings.”
5.1.1 Eskimo-Aleut
The Aleut branch has just one language, variously called Aleut or Unangax̂ and spoken by 155 in
the Aleutian and Pribilof islands (Alaska) and the Commander Islands (Siberia). Eskimo has two
branches, Inuit and Yupik. Because the term Eskimo is deemed offensive by many, especially in
Canada and Greenland, Yupik-Inuit is sometimes used instead.
5.1.2 Na-Dene
The name Na-Dene is absent from the latest Ethnologue listing, having been replaced by Eyak-
Athabaskan. At one time Na-Dene was thought to include Haida (Sapir 1915), but this view has
been abandoned by most (Schoonmaker 1997).
Navajo belongs to the Apachean group of Athabaskan, a branch of Na-Dene with forty-two
languages widely distributed across the western U.S. and western Canada. Its morphology is
interesting because of a complex prefix system that might lead it to be classified as agglutinative,
were it not for complex, overlapping dependencies that are more characteristic of fusional
languages. Like many Athabaskan languages, Navajo is tonal, yet proto-Athabaskan lacked tone,
and tone seems to have developed independently in many Athabaskan languages from
constricted vowels (Campbell 1997: 113).
5.1.3 Algic
This family has forty-two languages, all but two in the Algonquian branch, distributed across a
wide expanse of Eastern Canada and the northeastern United States.
5.1.4 Wakashan
Wakashan, a family of seven languages in British Columbia, was assigned by Edward Sapir (in a
1929 Encyclopaedia Britannica entry) to a putative stock called Mosan that also included the
Salishan family (below). Sapir’s conjecture was based on a long list of shared grammatical
similarities. But Beck (2000), echoing Campbell (1997), finds little lexical similarity and concludes
that one is dealing with a Sprachbund (Thomason & Kaufman 1992), a set of languages
whose common features have arisen from contact rather than from shared genetic origins.
5.1.5 Salishan
The twenty-six languages of this family are spoken in the coastal regions and in the region
immediately to the east in British Columbia and in nearby areas in the U.S. One of the typological
distinctions of Salishan languages is an extremely rich set of consonant contrasts—up to six
pharyngeal consonants, contrasting velars and uvulars, and a full set of ejectives.
5.1.6 Utian
Approximately a dozen languages in the Utian family of central and northern California are
divided into two branches, Miwok and Costanoan.
5.1.7 Plateau
Also known as Plateau Penutian, this group of four languages in the Pacific Northwest includes
Klamath and Nez Percé.
5.1.8 Cochimi-Yuman
Also called Yuman, this group of eight small languages plus the extinct Cochimi is spoken in Arizona
and neighboring parts of California and Mexico.
5.1.9 Uto-Aztecan
Sixty-one languages make up this family. The thirteen languages of the Northern branch are
spoken in the western United States. Among them is Hopi, spoken by 6,700 in and around
northeastern Arizona. The Southern branch has forty-eight languages, almost all of them in
Mexico.
5.1.10 Kiowa-Tanoan
Speakers of the five languages making up this family live in the southwestern U.S.
5.1.11 Siouan-Catawba
This family, also called Siouan, includes Catawba, a language of South Carolina, which lost its
last native speaker in the twentieth century but is being revived as a second language by ethnic
Catawbas. Total speakers for the Siouan family are under 35,000, but among its fourteen
languages is Dakota, spoken in North and South Dakota and neighboring areas. As noted earlier,
after Navajo, Dakota is the third largest indigenous language of North America and nearly tied
for second place with Yupik, with close to 19,000 speakers.
5.1.12 Caddoan
This group of five languages, each with just a handful of speakers, may form a super-family
with Iroquoian and Siouan, based on comparative work (Chafe 1976), but the relationship
is not considered established (Mithun 1999: 305).
5.1.13 Muskogean
Traces of this family of six languages, roughly estimated at around 150,000 speakers, are still
found in the southeastern U.S., but forced relocations by the U.S. government in the 1830s
drove many Muskogean tribes from their homeland. Included were the Choctaw and Chickasaw
Nations, now situated in Oklahoma.
5.2.1 Uto-Aztecan
The Southern branch of this family includes twenty-eight varieties of Nahuatl in Mexico and
one in El Salvador, with 1.5 million speakers altogether according to the 2010 census. Nahuatl
traces its origins to the Aztecs, who dominated the area for many centuries.
5.2.2 Mayan
The thirty-one languages comprising Mayan are spoken mainly in Guatemala and Mexico, also
in Belize and Honduras. Estimates of the number of speakers of Mayan languages run to six
million, with well over half that number in Guatemala. The most important Mayan languages of
Guatemala are K’iche’, with 2,330,000 speakers, Q’eqchi’ with 800,000, Mam with 530,000, and
Kaqchikel with 451,000. In Mexico, Yucatec Maya is spoken by 736,000, and a few others are
spoken by well over a hundred thousand. The languages are still centered around the original
Maya homeland in Guatemala and on the Yucatan peninsula.
Among the noteworthy achievements of early Maya civilization were temples, pyramids, and
the only writing system developed in the Americas before the coming of the European
explorers. Decipherment of the writing system has offered a direct glimpse into the Mayan
protolanguage and makes a fascinating story, recounted by Coe (1999).
5.2.3 Otomanguean
This is a large family of 177 languages spoken in central and southern Mexico. In the Eastern
Otomanguean branch are the Mixtecan languages, including Trique and fifty-two varieties of
Mixtec listed in Ethnologue, and sixty-three Zapotecan languages, including Chatino and fifty-
seven varieties of Zapotec listed in Ethnologue. Recent census estimates for both Mixtec and
Zapotec are in the area of 500,000 speakers. The Western Otomanguean branch numbers
thirty-seven languages, among them fourteen distinct varieties of Chinantec and nine varieties of
Otomi. The 2010 census gives 130,000 native speakers for Chinantec and 290,000 for Otomi.
5.2.4 Totonacan
This is a family of twelve small languages spoken in and around Puebla State in Mexico. The
largest is Sierra.
5.2.5 Mixe-Zoquean
This family groups the ten Mixean languages with the seven Zoquean languages. All are
spoken on the narrow strip of Southern Mexico between the Gulf of Mexico and the Pacific
Ocean.
Among the 108 language families Campbell (2012) finds in South America, larger groupings still
remain to be firmly established. Of the hypotheses advanced to date, including Greenberg’s
(1987) classification that puts them all in Amerind, none have been proved to general
satisfaction.
5.3.2 Arawakan
The family with the greatest geographical reach, spreading from Honduras down to Bolivia and
as far east as Suriname, is Arawakan, with forty languages not including about two dozen extinct
ones. Some reserve the name Arawakan for a slightly larger group with eleven additional
languages, but their genetic connection to the core family is unproven (Campbell 2012: 71). For
this reason Campbell uses Arawakan (which includes the language Arawak) for the core group
that also goes by the names Maipurean and Maipuran, as listed in Ethnologue.
5.3.3 Arawan
The Arawan family of western Brazil, with six languages, and Guajiboan, with five languages in
Eastern Colombia and southwestern Venezuela, comprise the group of eleven mentioned above
sometimes classed with Arawakan.
5.3.4 Cariban
Cariban is a family of thirty-one languages (as well as around two dozen extinct ones) in Brazil
and Venezuela as well as in Guyana, Suriname, and Colombia. Most have just a few hundred
speakers; some have a few thousand. The largest is Macushi, with 18,000 speakers in Brazil.
5.3.5 Tucanoan
Yet another family is Tucanoan, with twenty-five languages in Colombia, Ecuador, Peru, and
Brazil. A few are extinct or very severely endangered. The two largest, with just over 6,000
speakers each, are Cubeo (Colombia) and Tucano (Brazil).
5.3.6 Aymaran
Aymaran has just two languages. One of them is Aymara, spoken by a million in Bolivia and
several hundred thousand in Peru.
5.3.7 Quechuan
Quechuan languages are spoken natively by a greater number than any other language family
indigenous to the Americas, a result of the spread of the Inca Empire in pre-Columbian times.
The total speaking population is 8.5 million, mainly in Peru, Ecuador, and Bolivia. The
designations of all but two of the forty-four Quechuan languages include the name Quechua
along with a geographical identifier, reflecting a close relationship, though in most cases not
mutual intelligibility. Most are small, with a few thousand speakers. About a dozen others range
from the tens of thousands to around 100,000, and a few more are spoken by several hundred
thousand. Larger than these are South Bolivian Quechua (1,600,000 speakers in Bolivia),
Ayacucho Quechua (900,000 speakers in Peru, including Lima), and Chimborazo Highland
Quichua (800,000 in Ecuador). All three belong to what is known as Peripheral Quechua, a
sister branch to Central Quechua. These two branches constitute the major break in the
Quechuan family. Quechua is, along with Spanish, an official language of Peru.
Phonological, structural, and lexical similarities between Quechua and Aymara have raised the
possibility that the two are related, as discussed by Orr & Longacre (1968) and Kaufman (2007),
but Adelaar (1992, 2012) argues instead that the many similarities must have resulted from intense
contact predating the protolanguages along with subsequent diffusion. Part of the reasoning is that
the shared lexical items are in fact too similar where they occur and extend to only about a quarter
of the vocabulary, while the rest is highly different.
5.3.8 Tupian
Jensen & Grimes (2003), Kaufman (2007), and Rodrigues & Cabral (2014) regard the Tupian
languages of Central Amazonia as a language stock—a grouping of language families not fully
established but thought to be distantly related. Here we list it as an established family, following
Kaufman (1990), Campbell (2012), and Ethnologue.
This set of seventy-six languages is grouped into eleven small branches and isolates and one
major branch, Tupi-Guarani, which some recognize as a family in and of itself (Michael et al.
2015). Its fifty-one languages are found in parts of Paraguay, Brazil, and Bolivia but once
covered a much larger expanse of South America, from the eastern coast to the west and
from northern Argentina up to French Guiana. Ten languages of this group are varieties of
Guaraní that together are spoken by five million, principally in Paraguay, where it is an
official language (along with Spanish) and is widely used as a second language as well.
Beyond what is presented here, Campbell (2012) discusses many plausible and possible genetic
relationships within South America. Campbell & Grondona (2012: 29) cite a dozen other works
on this topic.
5.4 References
Adelaar, W. (1992). Quechuan Languages. In W. Bright (Ed.), Oxford international
encyclopedia of linguistics 3 (pp. 303–10). Oxford: Oxford University Press.
Asher, R. E., & Moseley, C. (Eds.) (2007). Atlas of the world's languages. 2nd edn. London:
Routledge.
Beck, D. (2000). Grammatical convergence and the genesis of diversity in the Northwest Coast
Sprachbund. Anthropological Linguistics 42(2), 147-213.
Campbell, L. (1997). American Indian languages: The historical linguistics of native America.
New York: Oxford University Press.
Campbell, L. (2012). Classification of the indigenous languages of South America. In: Campbell
& Grondona 2012, 59–166.
Campbell, L. & Grondona, V. (Eds.) (2012). The indigenous languages of South America: A
comprehensive guide. Berlin: Mouton de Gruyter. Accessed online
8/4/2016. [Link]
Chafe, W. L. (1976). Siouan, Iroquoian, and Caddoan. In: T. A. Sebeok (Ed.), Native languages
of the Americas (pp. 527-572). New York: Springer US.
Coe, M. D. (1999). Breaking the Maya code. Rev. edn. London: Thames and Hudson.
Fabre, A. (1998). Manual de las lenguas indígenas sudamericanas, Vol. 1. Lincom Europa.
Golla, V., Goddard, I, Campbell, L., Mithun, M., Mixco, M. (2007). North America. In: R. E.
Asher & C. Moseley (pp. 5-44).
Jensen, C. J. & Grimes, B. (2003). International encyclopedia of Linguistics. 2nd edn. Oxford:
Oxford University Press.
Kaufman, T. (1990). Language history in South America: What we know and how to know
more. In D. L. Payne (Ed.), Amazonian linguistics: Studies in lowland South American
languages (pp. 13–67). Austin: University of Texas Press.
Kaufman, T., with help from B. Berlin. (2007). South America. In: R. E. Asher & C. Moseley (pp.
59-93).
Mithun, M. (1999). The languages of native North America. Cambridge: Cambridge University
Press.
Sapir, E. (1915). The Na-Dene languages: A preliminary report. American Anthropologist 17(3):
534–558.
Schoonmaker, P. K., Von Hagen, B., & Wolf, E. C. (1997). The rain forests of home: Profile of a
North American bioregion. Island Press.
Thomason, S. G., & Kaufman, T. (1992). Language contact, creolization, and genetic linguistics.
University of California Press.
U.S. Census Bureau. (2011). Native North American languages spoken at home in the United
States and Puerto Rico: 2006-2010. American Community Survey Briefs. Accessed online
8/4/2016: [Link]
6. Sign languages
As with spoken languages, we’re unable to trace back to the time when the first sign languages
were used. Still, McBurney (2012) documents early reports on signing by the deaf, including an
Ancient Egyptian text from around 1200 BCE: “Thou are one who is deaf and does not hear, to
whom men make (signs) with the hand.” From Plato’s Cratylus she quotes, “… should we not,
like the deaf and dumb, make signs with the hands and head and the rest of the body?” And
from a collection on Jewish oral law from the late second century CE: “A deaf-mute may
communicate by signs and be communicated with by signs.”
Signing systems developed into languages as communities of users grew and the communicative
needs of the deaf were recognized by governments, educators, and the general public. In parts of
Europe, emerging deaf communities were developing sign languages well before the eighteenth
century, and in 1817 Thomas Gallaudet established the first permanent deaf school in the U.S.,
basing his methods on practices already in place in France and Britain.
Ethnologue lists 138 sign languages for the deaf, each one named for the location where it is
used. Many are adaptations of signing systems already used in other regions, as illustrated by
American Sign Language (ASL), which Thomas Gallaudet directly based on French Sign
Language. ASL has become the most widely used sign language of the deaf, with 250,000 users
in North America, the Caribbean, the Philippines, and Africa. ASL and other sign languages are
not closely connected to the spoken languages of the regions where they are used. For example,
British Sign Language and American Sign Language are not mutually intelligible.
Sign languages also develop in response to other needs. A famous case is Plains Indian Sign
Language, once used as a lingua franca by Native Americans over a vast expanse of North
America and still in use in some regions (Davis 2010). Sign languages that have arisen in
Aboriginal Australia in response to speech taboos and ritual observance have been described by
Kendon (1988).
6.1 References
Davis, J. E. (2010). Hand talk: Sign language among American Indian nations. Cambridge:
Cambridge University Press.
McBurney, S. (2012). History of sign languages and sign language linguistics. In R. Pfau, M.
Steinbach, & E. Woll (Eds.), Sign language: An international handbook (pp. 909–948). Berlin:
de Gruyter.
7.1 Pidgins are simplified languages that arise out of a need to communicate among speakers
lacking a common language, typically in colonial situations where one group is dominant.
Members of the dominated group fuse grammatical features, often simplified, of their native
language (called the substrate) with vocabulary from the dominant, or superstrate, language. The
resulting language serves restricted purposes, such as trade.
There are not many pidgins. Ethnologue lists only sixteen, six of them in Africa and five in
Oceania, if Indonesia is included. Hiri Motu, an official language of Papua New Guinea, is
noteworthy because it goes against some typical views of pidgins. This language developed
between the Motu and their trading partners nearby before any European contact. After
colonization, its use spread, though the colonizers themselves had little if any knowledge of it.
More usual are the cases of the original Chinese Pidgin English, once known as Pigeon English,
which arose in seventeenth-century China for trade with the British, and Nigerian Pidgin, which
developed in the same era, again due to trade contact with the British, notably the slave trade.
Hiri Motu, Chinese Pidgin English, and Nigerian Pidgin illustrate three different types of
situation. Hiri Motu and Chinese Pidgin English exemplify pidgins that originate when trade
partners are equal (Hiri Motu) or unequal (Chinese Pidgin English). The two had similar
outcomes, eventually fading away—Hiri Motu in favor of Tok Pisin, a widely spoken creole of
New Guinea, and Chinese Pidgin English in favor of Standard English, which came to be
commonly taught in schools. (Since then, a different language called Chinese Pidgin English has
arisen on the Pacific island of Nauru, for communicating with Chinese-speaking merchants and
traders.) By contrast, Chinese Pidgin English and Nigerian Pidgin had analogous origins (for
communicating with traders in a dominant position), yet different outcomes, since the first has
died out, while the second has vastly expanded its uses and its speaking population. Currently
Nigerian Pidgin is learned by many children at an early age for communication with peers in
virtually any informal situation.
7.2 This takes us to creoles, which are first languages of members of speech communities but
originate from types of language contact resembling if not always identical to situations that give
rise to pidgins. Being acquired as a first language gives creoles a stability that pidgins lack, and
so it is not surprising that many more creoles than pidgins are in current use: Ethnologue lists
ninety-three. Thirty-two creoles are spoken around Latin America and the
Caribbean, twenty-six in Oceania, and twenty-two in Africa. Like pidgins, creoles have a
substrate and a superstrate. English is the superstrate for thirty-three creoles, Malay for fourteen,
Portuguese for thirteen, and French for eleven.
Probably the most vigorously debated topic in current pidgin and creole studies is how creoles
form and evolve. Bickerton (1981, 1988) interpreted creolization in terms of what is known as
the bioprogram hypothesis. This would see creoles as developing from a pidgin that learners
were exposed to at an early age. The hypothesis was that acquisition is guided by an innate
bioprogram that supplies structure to complement and modify the pidgin’s substrate and
superstrate. This idea excited those who saw its potential to shed light on the human language
faculty in general. At the same time, among creolists, the bioprogram hypothesis gave rise to a
literature that almost universally sought to disprove it. Viewed more positively, it engendered
lots of new thinking on how creoles come about.
Veenstra (2008) surveys some of the progress made during this period. Early commenters found
reason to assign a greater role to the superstrate language than would be the case under
Bickerton’s hypothesis, which leaned heavily on universal grammar. Another criticism cited the
fact that some creoles develop without having a pidgin as a source. Bickerton’s explanation,
relying on acquisition by a generation of speakers with no other first language, implied that a
creole would always develop in a single generation, yet this has been falsified by Nicaraguan
Sign Language, which took two generations (Kegl, Senghas, & Coppola 1999). For many more
counterproposals and refinements, see DeGraff (1999), Mufwene (1996), and Singler (1996).
One area of agreement is that neither pidgins nor creoles are homogeneous types, as earlier work
seemed to assume. There are many varieties, as is found with the rest of the languages covered in
this essay.
7.3 References
Bickerton, D. (1981). Roots of language. New York: Karoma Publishers.
Bickerton, D. (1988). Creole languages and the bioprogram. In: F. J. Newmeyer (Ed.), Linguistics:
The Cambridge survey 2, Cambridge: Cambridge University Press.
DeGraff, M. (Ed.) (1999). Language change and creation: Creolization, diachrony, and
development. Cambridge: MIT Press.
Kegl, J., Senghas, A., & Coppola, M. (1999). Creation through contact: Sign language
emergence and sign language change in Nicaragua. In: DeGraff (pp. 179–238).
Mufwene, S. (1996). The founder principle in creole genesis. Diachronica 13: 83–134.
Veenstra, T. (2008). Creole genesis: The impact of the language bioprogram hypothesis. In: S.
Kouwenberg & J. V. Singler (Eds.), The handbook of pidgin and creole studies (pp. 219-241).
Malden: Wiley-Blackwell.
8. Resources
An online database of scholarly hypotheses about possible language families and their
membership is Multitree ([Link]). A pronouncing dictionary of selected words from
over world languages is at Forvo ([Link]). Audio pronunciations for over 100,000
words are available for some languages, down to around 500 for others; the pronunciations are
collected from users of the site.
Austin, P. K. (Ed.). (2008). One thousand languages: Living, endangered, and lost. Berkeley:
University of California Press.
Campbell, G. L., & King, G. (2011). The Routledge concise compendium of the world’s
languages, 2nd edn. New York: Routledge.
Comrie, B. (2001). Languages of the world. In M. Aronoff & J. Rees-Miller (Eds.), The
handbook of linguistics (pp. 19–42). Malden: Blackwell.
Comrie, B. (Ed.). (2009). The world’s major languages. 2nd edn. New York: Routledge.
Lyovin, A., Kessler, B., & Leben, W. R. (2016). Introduction to the languages of the world. New
York: Oxford University Press.