This article presents a corpus-based study of the metaphorical and metonymical use of the words "head" and "heart," together with the Norwegian correspondents "hode" and "hjerte." The continuum between metaphor and metonymy is explored,...
      Metaphor, Metonymy, Parallel Corpora, ENPC
This paper proposes a new approach for collecting lexical and grammatical data: one that meets the need to control the features to be elicited, while ensuring a fair level of idiomaticity. The method, called conversational questionnaires,...
      Languages and Linguistics, Language Documentation, Endangered Languages, Lexicography
Contrastive methods have long been employed in lexicography, in particular in bi- and multilingual dictionary projects. The main rationale for this is the necessity to comprehensively study, i.e. compare and contrast, two or more...
      Lexicology, Vocabulary, Terminology, Conceptual Modelling
We present HindEnCorp, a parallel corpus of Hindi and English, and HindMonoCorp, a monolingual corpus of Hindi in their release version 0.5. Both corpora were collected from web sources and preprocessed primarily for the training of...
      Computational Linguistics, Machine Translation, Corpora, Parallel Corpora
This essay presents the ERC project ‘Transmission of Classical Scientific and Philosophical Literature from Greek into Syriac and Arabic’ (HUNAYNNET) based at the Institute for Medieval Research of the Austrian Academy of Sciences. The...
      Late Antique and Byzantine Studies, Translation Studies, Digital Humanities, History of Medicine
In this thesis we describe and evaluate a tool for automatic generation of translations for multiword English terms into Spanish from a monolingual specialized Spanish corpus, compiled by means of web crawling. The resulting translations...
      Machine Translation, Terminology, Lexical Semantics, Lexicography
      Spanish Linguistics, Italian (Languages And Linguistics), Modern Greek Language, Parallel Corpora
We report on a project to annotate biblical texts in order to create an aligned multilingual Bible corpus for linguistic research, particularly computational linguistics, including automatically creating and evaluating translation...
      Cognitive Science, Computational linguistic phylogenetics, Parallel Corpora, Corpus Annotation
This paper describes the first phase of the CEXI project at the University of Bologna in Forlì, involving the selection of the texts to be included in the corpus and decisions about the processing of these texts. The aim of the project is...
      Translation, Parallel Corpora, Corpus-Based Translation
      Terminology, Bologna Process, Corpus Linguistics, Specialized translation
Automatic extraction of bilingual lexicons from parallel corpora has been recently exploited to overcome the knowledge acquisition bottleneck in a number of research areas in natural language processing, such as machine translation (MT)...
      Natural Language Processing, Computational Linguistics, Parallel Corpora, Arabic Machine Translation
Feature selection or extraction is the most important task in opinion mining and sentiment analysis (OSMA) for calculating the polarity score. These scores are used to determine the positive, negative, and neutral polarity about...
      Information Retrieval, Natural Language Processing, Machine Learning, Data Mining
The paper presents an attempt to propose an exact method for identifying the so-called "language-specific" lexicon, a controversial notion often reasonably questioned. An aligned bilingual parallel corpus is chosen as an instrument for...
      Lexical Semantics, Parallel Corpora, Lexical Typology
The aim of this paper is to investigate Polish equivalents of English phrasal verbs as found in an English-Polish (E-P) parallel corpus PHRAVERB. Given the semantic idiosyncrasy exhibited by phrasal verbs, it is assumed that the...
      Lexicography, Phrasal Verbs, Parallel Corpora, Equivalence in Translation
This paper concentrates on the verbal moods used after Spanish adverbs expressing potentiality (quizá(s), tal vez, probablemente, posiblemente). With the use of the corpus CREA, we sought to determine whether there is a preference for...
      Translation Studies, Spanish, Modality, Corpus Linguistics
Translation is a profession highly connected to technology, and for this reason, most of today's translators are in contact with a variety of tools, services and programs, such as word processors, e-mail, electronic dictionaries, among...
      Usability, Ergonomics, Corpus Linguistics and Translation Studies, Corpus-Based Translation Studies
In this paper, syntactic annotation is used to reveal linguistic properties of translations. We employ the Universal Dependencies framework to represent learner and professional translations of English mass-media texts into Russian (along...
      Translation Studies, Translation Quality, Parallel Corpora, Dependency Parsing
The sentences in the RNC are aligned sentence-by-sentence. The texts kindly offered for use in the RNC by Adrian Barentsen and included in the multilingual Amsterdam Slavic Parallel Aligned Corpus are already aligned...
      Slavic Languages, Corpus Linguistics, Russian Language, Parallel Corpora
The present paper describes the Russian Learner Translator Corpus project, which is currently under development. The paper discusses the feasibility of such a corpus and existing analogues, describes the current status of corpus...
      Translation Studies, Corpus Linguistics, Learner corpora, Parallel Corpora
      Machine Translation, Parallel Corpora
(Draft with minor differences to published version)
      Historical Linguistics, Slavic Languages, Corpus Linguistics, Slavic Historical Linguistics
The paper describes three studies concerned with inner-Slavic variation in the use of different functional categories, two of which involve verbal aspect and one of which involves reflexive coding. The leading interest behind this...
      Corpus Linguistics, Slavic Linguistics, Linguistic Typology, Parallel Corpora
In this paper we present a method for term extraction that can be used in the classroom with translation students. The terms are extracted from a multilingual parallel corpus with the aid of a parallel concordancer, AntPConc. Our work is...
      Translation Studies, Terminology, Parallel Corpora, Medical Terminology
reading and commenting on a draft of this paper. 2 There is no published account on this corpus; for an example of work with it, see 3 See
      Translation Studies, Corpus Linguistics, Slavic Linguistics, Parallel Corpora
This study investigates formal and functional variation of analytic causatives (ACs) in eighteen European languages from the Indo-European and Uralic language families. Employing the comparative concept approach to language comparison,...
      Languages and Linguistics, Multivariate Statistics, Linguistics, Functional Linguistics
      Word alignment, Parallel Corpora, Hybrid Approach, Edit Distance
      Corpus Linguistics, Parallel Corpora
We propose a method for discovering and compiling the translation norms for specialized concepts used in simple and complex terms attested in a bilingual parallel corpus. The translation norms brought to light...
      Translation Studies, Corpus Linguistics, Corpus Linguistics and Discourse Analysis, Corpus Linguistics and Translation Studies
      Machine Translation, Parallel Corpora
Historical linguistics, on its way to establishing itself as an autonomous discipline, has been unable, or unwilling, to distance itself from the neighbouring currents that run and evolve within a more general and...
      Translation Studies, Historical Linguistics, Corpus Linguistics, Traducción
Polysemy is a key issue in theoretical semantics and lexicography as well as in computational linguistics. When words have several senses, it is important to describe them properly in the dictionary (a lexicographic task) and to be able...
      Semantics, English language, Word Sense Disambiguation, Lexical Semantics
The present study investigates the cross-linguistic differences in the use of so-called T/V forms (e.g. French tu and vous, German du and Sie, Russian ty and vy) in ten European languages from different language families and genera. These...
      Multivariate Statistics, Politeness, Audiovisual Translation, Parallel Corpora
In this study we examine the occurrences and correspondences of terms for blood kinship in a Bulgarian–Ukrainian parallel corpus of fiction. All instances of the terms selected for study, matching and non-matching, were located and...
      Translation Studies, Semantics, Slavic Languages, Corpus Linguistics
The Algerian Arabic dialects are under-resourced languages, which lack both corpora and Natural Language Processing (NLP) tools, although they are increasingly used in written form, especially on social media and forums. We aim through...
      Statistical Machine Translation, Arabic Dialects, Parallel Corpora
Multilingual corpora, containing the same documents in a variety of languages, are becoming an essential resource for natural language processing. Clustering multilingual corpora provides us with an insight into the differences between...
      Cross-Language Information Retrieval (CLIR), Hierarchical Agglomerative Clustering, Parallel Corpora
The merging of corpus linguistic methods and digital technology can provide new ways of representing medieval digital texts. In this paper, we introduce a multi-layered parallel Old Occitan-English corpus. We show how parallel alignment...
      Sentiment Analysis, Corpus Linguistics, Medieval Occitan Literature, Annotation
Accessing historical texts is often a challenge because readers either do not know the historical language, or they are challenged by the technological hurdle when such texts are available digitally. Merging corpus linguistic methods and...
      Digital Humanities, Historical Linguistics, Visualization, Computational Linguistics
The paper presents parallel corpora within the Russian National Corpus (RNC) featuring Circum-Baltic/Russian language pairs and describes the choice of texts, morphological annotation and possible applications. The following languages of...
      Corpus compilation and design, Parallel Corpora, Circum-Baltic languages
In this study we examine the metaphoric mentions of three wild animals considered to be most important in the Slavic popular tradition, namely the wolf, the bear and the hare, in a Bulgarian–Ukrainian parallel corpus. Our goal is to see...
      Translation Studies, Semantics, Slavic Languages, Corpus Linguistics
Demand for Chinese-to-English translation has increased over recent years. In contrast, resources for training Chinese-to-English translators remain few, although they are now increasing, relative to English-to-Chinese, for example. Corpus-based...
      Science and Technology, Quantitative analysis, Corpus linguistic, Parallel Corpora
Canonical question tags feature prominently in spoken English, where they display great versatility. At face value they are meant to elicit a response from a co-participant in the form of (dis)agreement with the proposition to which the...
      Pragmatics, Tag Questions, Parallel Corpora, Contrastive Linguistics
The connective because can express both highly objective and highly subjective causal relations. In this, it differs from its counterparts in other languages, e.g. Dutch, where two conjunctions omdat and want express more objective and...
      Pragmatics, Logistic Regression, Discourse Connectives, Parallel Corpora
      Information Retrieval, Translation Studies, Natural Language Processing, Machine Learning
This paper describes the characteristics of Brazilian Portuguese (BP) thetic sentences by means of a parallel corpus study consisting of the original dialogues of two Argentinean movies from 2004 and 2010 and the corresponding doubling...
      Information structure (Languages And Linguistics), Brazilian Portuguese, Parallel Corpora, Theticity
The paper reports on our ongoing work on the creation of a corpus of Bulgarian and Ukrainian parallel texts. We discuss some differences in the approaches and the interpretation of some concepts, as well as various problems associated...
      Translation Studies, Slavic Languages, Corpus Linguistics, Bulgarian Language
This paper presents a comparative bilingual corpus-based study of the use of several frequent temporal adverbs and adverbial expressions (‘always’, ‘sometimes’, ‘never’ and their synonyms) in Bulgarian and Ukrainian. The Ukrainian items...
      Translation Studies, Semantics, Slavic Languages, Corpus Linguistics
It is well known that word-aligned parallel corpora are valuable linguistic resources. Since many factors affect automatic alignment quality, manual post-editing may be required in some applications. While there are several...
      Machine Translation, Cross-language Processing, Word alignment, Parallel Corpora
Generating source code API sequences from an English query using Machine Translation (MT) has gained much interest in recent years. For any kind of MT, the model needs to be trained on a parallel corpus. In this paper we clean...
      Software Engineering, Parallel Corpora
Several parallel corpora built from European Union language resources are presented here. They were processed by state-of-the-art tools and made available for researchers in the Sketch Engine corpus management system. A completely...
      Corpus Linguistics, Parallel Corpora, Sketch Engine