Malamatenia Vlachou Efstathiou

École Pratique des Hautes Études, Histoire, textes et documents (HTD), Graduate Student

École nationale des chartes, Humanités numériques, Graduate Student

Université Paris-Sorbonne (Paris IV), Lettres Classiques (Latin), Alumnus

PhD Candidate in Digital Humanities and Latin Palaeography, IRHT-LIGM | ENC-PSL
Supervisors: Dominique Stutzmann (IRHT) and Mathieu Aubry (ENPC/LIGM)
Address: IRHT, 14 cr. des Humanités, 93300 Aubervilliers, France

Papers by Malamatenia Vlachou Efstathiou

An Interpretable Deep Learning Approach for Morphological Script Type Analysis

ICDAR 2024, Part II, LNCS 14936, pp. 1-23, 2024

Defining script types and establishing classification criteria for medieval handwriting is a central aspect of palaeographical analysis. However, existing typologies often encounter methodological challenges, such as descriptive limitations and subjective criteria. We propose an interpretable deep learning-based approach to morphological script type analysis, which enables systematic and objective analysis and contributes to bridging the gap between qualitative observations and quantitative measurements. More precisely, we adapt a deep instance segmentation method to learn comparable character prototypes, representative of letter morphology, and provide qualitative and quantitative tools for their comparison and analysis. We demonstrate our approach by applying it to the Textualis Formata script type and its two subtypes formalized by A. Derolez: Northern and Southern Textualis.

CREMMA Medii Aevi

Zenodo (CERN European Organization for Nuclear Research), Jan 6, 2023

CATMuS-Medieval: Consistent Approaches to Transcribing ManuScripts

Livret des "Journées d'étude des jeunes chercheurs in memoriam Louis Holtz", 27-28 juin 2023, École nationale des chartes, Paris

by Malamatenia Vlachou Efstathiou and Angeliki BOIKOU

Organizing committee of the "Chroniques Chartistes" study days: Malamatenia Vlachou-Efstathiou (ENC/I...

CREMMA Medii Aevi: Literary Manuscript Text Recognition in Latin

Journal of open humanities data, Nov 2, 2022

This paper presents a novel segmentation and handwritten text recognition dataset for Medieval Latin, from the 11th to the 16th century. It connects with Medieval French datasets as well as earlier Latin datasets by enforcing common guidelines, bringing 263,000 new characters and now totaling over a million characters for medieval manuscripts in both languages. We provide our own addition to Ariane Pinche's Old French guidelines to deal with specific Latin cases. We also offer an overview of how we addressed this dataset compilation through the use of pre-existing resources. With a higher abbreviation ratio and a better representation of abbreviating marks, we offer new models that outperform the Old French base model on Latin datasets, improving accuracy by 5% on unknown Latin manuscripts.

CREMMA Medieval Latin: Literary manuscript text recognition in Latin

Le Centre pour la Communication Scientifique Directe - HAL - Inria, Oct 25, 2022

This paper presents a novel segmentation and handwritten text recognition dataset for Medieval Latin, from the 11th to the 16th century. It connects with Medieval French datasets as well as earlier Latin datasets by enforcing common guidelines, bringing 263,000 new characters and now totaling over a million characters for medieval manuscripts in both languages. We provide our own addition to Ariane Pinche's Old French guidelines to deal with specific Latin cases. We also offer an overview of how we addressed this dataset compilation through the use of pre-existing resources. With a higher abbreviation ratio and a better representation of abbreviating marks, we offer new models that outperform the Old French base model on Latin datasets, improving accuracy by 5% on unknown Latin manuscripts.

Drafts by Malamatenia Vlachou Efstathiou

« Éditer les manuscrits grammaticaux glosés : solutions numériques face aux défis paléographiques : Le cas de la tradition manuscrite glosée d’Eutychès grammaticus »

Glossed grammatical manuscripts pose a significant challenge to researchers who wish to edit them. Their heterogeneous character, reflected in the structural specificities of their mise en page, together with often multiple layers of annotation and orthographical variation, makes them difficult to model. Digital tools, as recent projects attest, offer a variety of solutions that prove particularly efficient, and thus necessary, for their edition, from HTR and controlled vocabularies to XML-TEI and data visualization. Taking as a case study a dataset comprising three glossed manuscripts (Vossianus Latinus O41, Bamberg Msc 30 and BnF Latin 7499, which carries Rémi d'Auxerre's commentary in the margin) as well as a glossary (BnF Latin 14087) of Eutyches' De uerbo, we will try to apply a semi-automated pipeline towards a multifunctional documentary edition, from the acquisition to the analysis of the data.

« illa circa Latinitatem : Questions de normativité dans le Chapitre 1,15 de l’Ars Grammatica de Charisius »

Master's (M2) thesis defended at the Sorbonne in July 2021

Talks by Malamatenia Vlachou Efstathiou

"Éditer les manuscrits grammaticaux glosés : solutions numériques face aux défis paléographiques – Le cas de la tradition manuscrite glosée d’Eutychès grammaticus", at the P'tits Déj' des Humanités Numériques, IRHT, 24/11/2023

Supporting material for the talk "Éditer les manuscrits grammaticaux glosés : solutions numériques face ...
