This paper presents an internal classification of Tupí-Guaraní based on a Bayesian phylogenetic analysis of lexical data from 30 Tupí-Guaraní languages and 2 non-Tupí-Guaraní Tupian languages, Awetí and Mawé. A Bayesian phylogenetic analysis using a generalized binary cognate gain and loss model was carried out on a character table based on the binary coding of cognate sets, which were formed with attention to semantic shift. The classification shows greater internal structure than previous ones, but is congruent with them in several ways.
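The binary coding of cognate sets mentioned above can be illustrated with a minimal sketch. The language labels, cognate-set identifiers, and the function name `binary_cognate_matrix` are hypothetical placeholders, not data or code from the paper. The sketch assumes each language is described simply by the set of cognate sets in which it has a reflex, and it ignores missing data.

```python
from typing import Dict, List, Set, Tuple


def binary_cognate_matrix(
    reflexes: Dict[str, Set[str]]
) -> Tuple[List[str], Dict[str, str]]:
    """Turn per-language cognate-set membership into a 0/1 character table.

    Each cognate set becomes one binary character: '1' if the language
    has a reflex in that set, '0' otherwise.
    """
    # Fix a stable column order so every row lines up with the same characters.
    cognate_sets = sorted(set().union(*reflexes.values()))
    rows = {
        lang: "".join("1" if cs in attested else "0" for cs in cognate_sets)
        for lang, attested in reflexes.items()
    }
    return cognate_sets, rows


# Toy input (hypothetical): three languages, two meanings, two sets per meaning.
toy = {
    "LangA": {"hand_1", "stone_1"},
    "LangB": {"hand_1", "stone_2"},
    "LangC": {"hand_2", "stone_2"},
}
columns, table = binary_cognate_matrix(toy)
for lang, row in table.items():
    print(lang, row)
```

A table of this shape is the kind of input that a binary gain and loss model operates on; a real analysis would also need an explicit symbol for missing data.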
LIAMES: Línguas Indígenas Americanas
2020
Files from the phylogenetic analysis and commented presentation for Linguistweets (https://www.linguistweets.org/programa/h/6).
Tupí-Guaraní is one of the largest branches of the Tupían language family, but despite its relevance there is no consensus about its origins in terms of age, homeland, and expansion. Linguistic classifications vary significantly, with archaeological studies suggesting incompatible date ranges while ethnographic literature confirms the close similarities as a result of continuous inter-family contact. To investigate this issue, we use a linguistic database of cognate data, employing Bayesian phylogenetic methods to infer a dated tree and to build a phylogeographic expansion model. Results suggest that the branch originated around 2500 BP in the area of the upper course of the Tapajós-Xingu basins, with a split between Southern and Northern varieties starting around 1750 BP. We analyse the difficulties in reconciling archaeological and linguistic data for this group, stressing the importance of developing an interdisciplinary unified model that incorporates evidence from both discipli...
2016
Recent phylogenetic studies in historical linguistics have focused on lexical data. However, the way that such data are coded into characters for phylogenetic analysis has been approached in different ways, without investigating how coding methods may affect the results. In this paper, we compare three different coding methods for lexical data (multistate meaning-based characters, binary root-meaning characters, and binary cognate characters) in a Bayesian framework, using data from the Tupí-Guaraní and Chapacuran language families as case studies. We show that, contrary to prior expectations, different coding methods can have a significant impact on the topology of the resulting trees.
Keywords: Bayesian phylogenetic inference, cognate coding, historical linguistics, South American indigenous languages
Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 2015
At least 40 spoken languages form the large Tupi family in its subfamilies Tupi-Guaraní, Mawé, Aweti, Arikém, Juruna, Mondé, Tupari, Mundurukú, Ramarama and Puruborá, providing a wealth of data for linguistic studies about variation, whether explained by genetic relations (common origin, ultimately from the presumed language 'Proto-Tupi') or by contact relations with other indigenous or non-indigenous languages. The interest in indigenous languages has increased in recent years, evinced by the publication of a growing number of descriptive, historical-comparative and other studies. Most studies published in this volume originate from a linguistic symposium organized by Wolf Dietrich and Sebastian Drude at the 54th International Congress of Americanists in Vienna, in 2012. The symposium was dedicated to "Historical variation and variation by contact among the Tupian languages". Those studies that deal with problems of linguistic genealogy, genetics, language change, and syntactic typology across several Tupi languages or a single Tupi language are published in this special "dossier". The first six papers deal with problems of the whole Tupi family or at least one of its sub-families. Three of them investigate evolutionary topics; three are cross-linguistic synchronic studies. Eduardo dos Santos and colleagues from the area of human genetics, in their paper "Origins and demographic dynamics of Tupi expansion: a genetic tale", use recent genetic data in order to show that the Madeira-Guaporé region may indeed be considered to be the Tupi homeland. Ancient Tupi expansion within the Madeira-Guaporé region and dispersion to other South American areas seems to be related to patrilocal practices. This outcome allows for new interpretations of archaeological and linguistic data, for instance the dispersion of female-associated technologies like ceramics and terminologies related to ceramics.
Ana Vilacy Galucio and colleagues from the Tupi Comparative Project, in their paper "On the genetic relationship and degree of relatedness with the Tupi linguistic family", present the first lexicostatistical and phylogenetic attempt at a genetic classification of most languages of the Tupi family, including four languages of the Tupi-Guarani branch. Based on all relevant previous studies of particular branches of the family, and applying lexicostatistics to a semantically based word list, the article demonstrates that the two major branches of the state of Rondônia, Mondé and Tuparí, have high percentages of shared cognates. This supports the results of the first article of this volume. The paper by Sérgio Meira and Sebastian Drude gives an overview of their reconstruction of the segmental phonology of Proto-Maweti-Guarani, the hypothetical proto-language from which modern Mawé, Aweti, and the
In this paper we present the first results of the application of computational methods, inspired by the ideas in McMahon & McMahon (2005), to a dataset collected from languages of every branch of the Tupian family (including all living non-Tupí-Guaraní languages) in order to produce a classification of the family based on lexical distance. We used both a Swadesh list (with historically more stable terms) and a list of animal and plant names so that the results could be compared. In addition, we selected more stable (HiHi) and less stable (LoLo) terms from the Swadesh list to form sublists for independent treatment. We compared the resulting NeighborNet networks and neighbor-joining cladograms and drew conclusions about their significance for the current understanding of the classification of Tupian languages. One important result is the lack of support for the currently discussed idea of an Eastern-Western division within Tupí.
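As a rough illustration of what a lexical distance can look like, the sketch below computes, for each pair of languages, the proportion of comparable meanings whose words fall into different cognate classes. This particular measure, the function names, and the toy word lists are assumptions made for illustration; the abstract does not specify the exact distance the authors used.

```python
from itertools import combinations
from typing import Dict, Optional, Tuple

# A word list maps each meaning to a cognate-class label, or None if missing.
WordList = Dict[str, Optional[int]]


def lexical_distance(a: WordList, b: WordList) -> float:
    """Share of comparable meanings where the two languages are not cognate."""
    comparable = [
        m for m in a
        if m in b and a[m] is not None and b[m] is not None
    ]
    if not comparable:
        raise ValueError("no comparable meanings between the two word lists")
    differing = sum(1 for m in comparable if a[m] != b[m])
    return differing / len(comparable)


def distance_matrix(wordlists: Dict[str, WordList]) -> Dict[Tuple[str, str], float]:
    """Pairwise distances, keyed by alphabetically ordered language pairs."""
    return {
        (x, y): lexical_distance(wordlists[x], wordlists[y])
        for x, y in combinations(sorted(wordlists), 2)
    }


# Toy data (hypothetical): cognate classes per meaning, with one missing entry.
toy_lists = {
    "LangA": {"hand": 1, "stone": 1, "water": 1, "fire": 1},
    "LangB": {"hand": 1, "stone": 2, "water": 1, "fire": None},
    "LangC": {"hand": 2, "stone": 2, "water": 1, "fire": 2},
}
distances = distance_matrix(toy_lists)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A pairwise matrix like this is the usual input to distance-based methods such as neighbor-joining or NeighborNet, which are run in dedicated software rather than reimplemented here.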
Diachronica 30, 2013
Encouraged by ongoing discussion of the classification of the Uralic languages, we investigate the family quantitatively using Bayesian phylogenetics and basic vocabulary from seventeen languages. To estimate the heterogeneity within this family and the robustness of its subgroupings, we analyse ten divergent sets of basic vocabulary, including basic vocabulary lists from the literature, lists that exclude borrowing-susceptible meanings, lists with varying degrees of borrowing-susceptible meanings and a list combining all of the examined items. The results show that the Uralic phylogeny has a fairly robust shape from the perspective of basic vocabulary, and is not dramatically altered by borrowing-susceptible meanings. The results differ to some extent from the ‘standard paradigm’ classification of these languages; for example, they provide no firm evidence for Finno-Permian.
Proceedings of the ACM India Joint International Conference on Data Science and Management of Data
Cognates are present in multiple variants of the same text across different languages. Computational phylogenetics uses algorithms and techniques to analyze these variants and infer phylogenetic trees as a hypothesized representation based on the output of the computational algorithm used. In our work, we detect cognates among a few Indian languages, namely Hindi, Marathi, Punjabi, and Sanskrit, to help build cognate sets for phylogenetic inference. Cognate detection supports phylogenetic inference by helping to isolate diachronic sound changes and thus to detect words of a common origin. A cognate set manually annotated with the help of a lexicographer is generally used to automatically infer phylogenetic trees. Our work creates cognate sets for each language pair and infers phylogenetic trees based on a Bayesian framework using the maximum likelihood method. We also implement our work in an online interface and infer phylogenetic trees based on automatically detected cognate sets. The online interface creates phylogenetic trees from textual data provided as input. It lets a lexicographer enter data manually, edit the data based on their expert opinion, and eventually create phylogenetic trees using various algorithms, including our work on automatically creating cognate sets. We go on to discuss the nuances of detecting cognates in these Indian languages and also discuss the categorization of cognate words, i.e., "Tatasama" and "Tadbhava" words.