Papers by Olga Yannoutsou
The present article presents the ontologies developed for a web service platform, which facilitates the interconnection between independent enterprise systems and specifically between an ERP (Enterprise Resource Planning) and a WFMS (Workflow Management System). The ...
In this paper we describe the METIS-II system and its evaluation on each of the language pairs: Dutch, German, Greek, and Spanish to English. The METIS-II system envisaged developing a data-driven approach in which no parallel corpus is required and in which no full parser or extensive rule sets are needed. We describe the evaluation on a development test set and on a test set taken from Europarl, and compare our results with SYSTRAN. We also provide some further analysis, namely researching the impact of the number and source of the reference translations and analysing the results according to test text type. As expected, the results are lower for the METIS system, but not at an unattainable distance from a mature system like SYSTRAN.
Proceedings of EAMT …, 2003
IEEE Transactions on Medical Imaging, 2007
METIS-II, the MT system presented in this paper, does not view translation as a transfer process between a source language (SL) and a target one (TL), but rather as a matching procedure of patterns within a language pair. More specifically, translation is considered to be an assignment problem, i.e. a problem of discovering each time the best matching patterns between SL and TL, which the system is called to solve by employing pattern-matching techniques. Most importantly, however, METIS-II is innovative because it does not need bilingual corpora for the translation process, but exclusively relies on monolingual corpora of the target language.
In this article the principles of the METIS Machine Translation system are presented. METIS employs an extensive tagged and lemmatised corpus of texts in the target language, coupled with bilingual lexica covering the desired pairs of source-target languages. To generate a high-quality translation, the METIS system is provided with statistical tools enabling it to extract linguistic knowledge from the annotated corpus of the target language. The advantage of this approach is that, although no grammars need to be provided explicitly, grammatical translations are retrieved from the corpus by using pattern matching techniques.

Using Comparable Corpora for Under-Resourced Areas of Machine Translation, 2019
The availability of parallel corpora is limited, especially for under-resourced languages and narrow domains. On the other hand, the number of comparable documents in these areas that are freely available on the Web is continuously increasing. Algorithmic approaches to identify these documents from the Web are needed for the purpose of automatically building comparable corpora for these under-resourced languages and domains. How do we identify these comparable documents? What approaches should be used in collecting these comparable documents from different Web sources? In this chapter, we firstly present a review of previous techniques that have been developed for collecting comparable documents from the Web. Then we describe in detail three new techniques to gather comparable documents from three different types of Web sources: Wikipedia, news articles, and narrow domains.
In this paper an innovative approach is presented for MT, which is based on pattern matching techniques, relies on extensive target language monolingual corpora and employs a series of similarity weights between the source and the target language. Our system is based on the notion of ‘patterns’, which are viewed as ‘models’ of target language strings, whose final form is defined by the corpus.
This paper presents the activities of Euromat (European Machine Translation) office in Greece, which has been functioning as a centre for Machine Translation Services for the Greek Public Sector since 1994. It describes the user profile, his/her attitude towards MT, strategies of promotion and the collected corpus for the first three years. User data were collected by questionnaires, interviews and corpus statistics. The general conclusions which have come out from our surveys are discussed.
Lecture Notes in Computer Science, 1998
This paper presents the activities of Euromat (European Machine Translation) office in Greece, which has been functioning as a centre for Machine Translation Services for the Greek Public Sector since 1994. It describes the user profile, his/her attitude towards MT, strategies of promotion and the collected corpus for the first three years. User data were collected by questionnaires, interviews and

Lecture Notes in Computer Science, 2006
The innovative feature of the system presented in this paper is the use of pattern-matching techniques to retrieve translations resulting in a flexible, language-independent approach, which employs a limited amount of explicit a priori linguistic knowledge. Furthermore, while all state-of-the-art corpus-based approaches to Machine Translation (MT) rely on bitexts, this system relies on extensive target language monolingual corpora. The translation process distinguishes three phases: 1) pre-processing with 'light' rule- and statistics-based NLP techniques, 2) search & retrieval, 3) synthesising. At Phase 1, the source language sentence is mapped onto a lemma-to-lemma translated string. This string then forms the input to the search algorithm, which retrieves similar sentences from the corpus (Phase 2). This retrieval process is performed iteratively at increasing levels of detail, until the best match is detected. The best retrieved sentence is sent to the synthesising algorithm (Phase 3), which handles phenomena such as agreement.
Machine Translation, 2008
METIS-II was an EU-FET MT project running from October 2004 to September 2007, which aimed at translating free text input without resorting to parallel corpora. The idea was to use 'basic' linguistic tools and representations and to link them with patterns and statistics from the monolingual target-language corpus. The METIS-II project had four partners, translating from their 'home' languages Greek, Dutch, German, and Spanish into English. The paper outlines the basic ideas of the project, their implementation, the resources used, and the results obtained. It also gives examples of how METIS-II has continued beyond its lifetime and the original scope of the project. On the basis of the results and experiences obtained, we believe that the approach is promising and offers the potential for development in various directions.
… : Proceedings of the …, 2007
In this paper, we explain why we have adopted pattern matching for MT purposes and why we have embedded it into a hybrid approach. "Patterns" here are understood as independent meaningful sub-sentential segments received in a systematic way. We describe the nature and size of the patterns used as well as the comparison algorithm developed. We discuss results obtained by matching patterns of different types and complexity in four different language pairs. Our experiments indicate that better results are obtained when matching the longest possible patterns.
MT Summit X, 2005
Monolingual Corpus-based MT using Chunks. Stella Markantonatou, Sokratis Sofianopoulos, Vassiliki Spilioti, Yiorgos Tambouratzis, Marina Vassiliou, Olga Yannoutsou, Nikos Ioannou. Machine Translation Department, Institute for Language & ...

Translating and the …, 2000
This paper describes the attitude of a Greek public administration user group towards the EC SYSTRAN machine translation (MT) system and its contribution to the system's evaluation and enhancement. EC SYSTRAN was first made available to the Greek public sector in 1994, under an initiative of the Greek Government in collaboration with the EC. Translation services are provided free of charge to the public sector and the system has improved thanks to the evaluations performed and the terminology provided by the user group. Currently, the group has two sections: the end-users themselves and a team of linguists. The presentation concludes with some statistics on the number of users willing to assist in the enhancement of the MT system together with a description of their profile and an indication of users' reactions (positive/negative comments) to using machine translation services and language technology products.
TMI 2007, Jun 1, 2006
TMI-07: 11th International Conference on Theoretical and Methodological Issues in Machine Translation. List of Authors: Alison Alvarez 1; Sara Morrissey 214; Jesus Andrés-Ferrer 11; Hermann Ney 214; Darren Scott Appling 134; Eric Nichols 134; Eiji Aramaki 21; Stephan Oepen 144; Toni Badia 132; Kazuhiko Ohe 21; Francis Bond 134; Karolina Owczarzak 221; Chris Brew 122; Alexandre Patry 104; Matthias Buch-Kromann 31; Michael Paul 154; Michael Carl 41; Aaron Phillips 163; Marine Carpuat 43; Andrei Popescu-Belis 55; Francisco Casacuberta 11, 191; Victoria Rosen 144 ...
Proceedings of the …, 2006
With this work, we further explore the ideas tested within the METIS-I system (Dologlou et al. 2003) which proved the feasibility of the innovative idea that sound translations could be received with hybrid MT that relied on monolingual corpora rather than parallel ones and flat ...

Advances in Artificial …, Jan 1, 2006
The innovative feature of the system presented in this paper is the use of pattern-matching techniques to retrieve translations resulting in a flexible, language-independent approach, which employs a limited amount of explicit a priori linguistic knowledge. Furthermore, while all state-of-the-art corpus-based approaches to Machine Translation (MT) rely on bitexts, this system relies on extensive target language monolingual corpora. The translation process distinguishes three phases: 1) pre-processing with 'light' rule- and statistics-based NLP techniques, 2) search & retrieval, 3) synthesising. At Phase 1, the source language sentence is mapped onto a lemma-to-lemma translated string. This string then forms the input to the search algorithm, which retrieves similar sentences from the corpus (Phase 2). This retrieval process is performed iteratively at increasing levels of detail, until the best match is detected. The best retrieved sentence is sent to the synthesising algorithm (Phase 3), which handles phenomena such as agreement.