Papers by Malamatenia Vlachou Efstathiou

ICDAR 2024, Part II, LNCS 14936, pp. 1-23, 2024
Defining script types and establishing classification criteria for medieval handwriting is a central aspect of palaeographical analysis. However, existing typologies often encounter methodological challenges, such as descriptive limitations and subjective criteria. We propose an interpretable deep learning-based approach to morphological script type analysis, which enables systematic and objective analysis and contributes to bridging the gap between qualitative observations and quantitative measurements. More precisely, we adapt a deep instance segmentation method to learn comparable character prototypes, representative of letter morphology, and provide qualitative and quantitative tools for their comparison and analysis. We demonstrate our approach by applying it to the Textualis Formata script type and its two subtypes formalized by A. Derolez: Northern and Southern Textualis.
Zenodo (CERN European Organization for Nuclear Research), Jan 6, 2023
Organizing committee of the "Chroniques Chartistes" study days: Malamatenia Vlachou-Efstathiou (ENC/IRHT), Hugo Forster (ENC), Svetlana Yatsyk (CIHAM)
Journal of open humanities data, Nov 2, 2022
This paper presents a novel segmentation and handwritten text recognition dataset for Medieval Latin, from the 11th to the 16th century. It connects with Medieval French datasets as well as earlier Latin datasets, by enforcing common guidelines, bringing 263,000 new characters and now totaling over a million characters for medieval manuscripts in both languages. We provide our own addition to Ariane Pinche's Old French guidelines to deal with specific Latin cases. We also offer an overview of how we addressed this dataset compilation through the use of pre-existing resources. With a higher abbreviation ratio and a better representation of abbreviating marks, we offer new models that outperform the Old French base model on Latin datasets, improving accuracy by 5% on unknown Latin manuscripts.
Le Centre pour la Communication Scientifique Directe - HAL - Inria, Oct 25, 2022
This paper presents a novel segmentation and handwritten text recognition dataset for Medieval Latin, from the 11th to the 16th century. It connects with Medieval French datasets as well as earlier Latin datasets, by enforcing common guidelines, bringing 263,000 new characters and now totaling over a million characters for medieval manuscripts in both languages. We provide our own addition to Ariane Pinche's Old French guidelines to deal with specific Latin cases. We also offer an overview of how we addressed this dataset compilation through the use of pre-existing resources. With a higher abbreviation ratio and a better representation of abbreviating marks, we offer new models that outperform the Old French base model on Latin datasets, improving accuracy by 5% on unknown Latin manuscripts.
Drafts by Malamatenia Vlachou Efstathiou

Glossed grammatical manuscripts generally pose a significant challenge to researchers who wish to edit them. Their heterogeneous character, reflected in the structural specificities of their mise en page, together with often multiple layers of annotation and orthographical variation, makes them difficult to model. Digital tools, as recent projects attest, offer a variety of solutions that prove particularly efficient, and thus necessary, for their edition, from HTR and controlled vocabularies to XML-TEI and data visualization. Taking as a case study a dataset of three glossed manuscripts (Vossianus Latinus O41, Bamberg Msc 30 and BnF Latin 7499, which carries Rémi d'Auxerre's commentary in the margin) and a glossary (BnF Latin 14087) of Eutyches' De uerbo, we attempt to apply a semi-automated pipeline towards a multifunctional documentary edition, from the acquisition to the analysis of the data.
Master's (M2) thesis defended at the Sorbonne in July 2021
Talks by Malamatenia Vlachou Efstathiou
https://www.irht.cnrs.fr/fr/agenda/seminaire/les-ptits-dej-humanites-numeriques-de-lirht