inproceedings by Timo Homburg
This publication proposes a best-practice digital processing pipeline for cuneiform languages. The pipeline includes the following steps: 1. annotation of cuneiform tablet 3D scans; 2. creation of transliterations in ATF, using PaleoCodage to capture cuneiform character variants; 3. conversion and subsequent annotation of the transliterations in TEI (structurally, semantically, and linguistically); 4. creation of semantic dictionaries; 5. export of the results in various formats to support the needs of many research communities. This poster shows how such a pipeline can be realized using a traditional Git versioning system and a variety of web-based tools assisting in the annotation and export.
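As a rough illustration of step 3, the sketch below wraps one ATF transliteration line in a minimal TEI-style XML element using only Python's standard library; the element and attribute names are illustrative assumptions, not the project's actual schema.

```python
# Minimal sketch of ATF-to-TEI structural conversion (step 3 above).
# Element/attribute names are illustrative, not the project's schema.
import xml.etree.ElementTree as ET

def atf_line_to_tei(atf_line: str, line_no: int) -> ET.Element:
    """Split an ATF line into word tokens and emit a TEI-like <l> element."""
    l = ET.Element("l", n=str(line_no))
    for token in atf_line.split():
        w = ET.SubElement(l, "w")
        w.text = token
    return l

line = atf_line_to_tei("um-ma {d}EN.ZU-i-din-nam-ma", 1)  # sample ATF line
print(ET.tostring(line, encoding="unicode"))
# -> <l n="1"><w>um-ma</w><w>{d}EN.ZU-i-din-nam-ma</w></l>
```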
In this publication we introduce a linked-data-powered application which assists users in finding so-called Stolpersteine, stones commemorating Jewish victims of the Second World War. We show the feasibility of a progressive web app built on linked data resources and evaluate this app against local data sources to find out whether the current linked data environment can equally and sufficiently support an application in this knowledge domain.
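A minimal sketch of the kind of linked data lookup such an app could perform, assuming the public Wikidata SPARQL endpoint and matching the Stolperstein class by its German label rather than a hard-coded QID:

```python
# Hedged sketch: query Wikidata for Stolpersteine with coordinates.
# The class is matched via its German label to avoid hard-coding a QID.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://query.wikidata.org/sparql")
endpoint.setQuery("""
SELECT ?stone ?stoneLabel ?coords WHERE {
  ?stone wdt:P31 ?class .                 # P31: instance of
  ?class rdfs:label "Stolperstein"@de .
  ?stone wdt:P625 ?coords .               # P625: coordinate location
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de,en". }
}
LIMIT 10
""")
endpoint.setReturnFormat(JSON)
for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["stoneLabel"]["value"], row["coords"]["value"])
```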

A major problem in research on new artificial intelligence methods for workflows is evaluation: large evaluation corpora are lacking. Existing methods either model workflows manually or use workflow extraction to derive workflows automatically from text. Both approaches have limitations. Manual modeling of workflows requires a lot of human effort, so creating a large test corpus would be expensive. Workflow extraction is limited by the number of existing textual process descriptions, and it is not guaranteed that the extracted workflows are semantically correct. In this paper we suggest setting up a planning domain and applying a planner to create a large number of valid plans, from which workflows can be derived. The planner uses a semantic eligibility function to determine whether an operator can be applied to a resource. We present a first concept and a prototype implementation in the cooking workflow domain.
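A minimal sketch of what a semantic eligibility function might look like in the cooking domain; the type hierarchy and operator signature are invented for illustration:

```python
# Hedged sketch of a semantic eligibility check in a cooking planning domain.
# The type hierarchy and operator signatures are invented for illustration.
SUBTYPES = {
    "tomato": "vegetable",
    "vegetable": "ingredient",
    "knife": "tool",
}

def is_a(resource: str, required_type: str) -> bool:
    """Walk the type hierarchy to test semantic compatibility."""
    while resource is not None:
        if resource == required_type:
            return True
        resource = SUBTYPES.get(resource)
    return False

def eligible(operator_input_type: str, resource: str) -> bool:
    """An operator is applicable iff the resource satisfies its input type."""
    return is_a(resource, operator_input_type)

print(eligible("vegetable", "tomato"))  # True: a planner may apply e.g. "chop"
print(eligible("vegetable", "knife"))   # False: operator not applicable
```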
We present experiments on word segmentation for Akkadian cuneiform, an ancient writing system and language used for about three millennia in the ancient Near East. To the best of our knowledge, this is the first study of its kind applied to either the Akkadian language or the cuneiform writing system. As a logosyllabic writing system, cuneiform structurally resembles East Asian writing systems, so we employ word segmentation algorithms originally developed for Chinese and Japanese. We describe results of rule-based algorithms, dictionary-based algorithms, and statistical and machine learning approaches. Our results indicate promising directions for cuneiform word segmentation that can enable and improve natural language processing in this area.
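As an illustration of the dictionary-based family, the sketch below implements forward maximum matching, a classic greedy segmenter from Chinese NLP; the toy lexicon and sign sequence are invented:

```python
# Hedged sketch: forward maximum matching over a toy sign sequence.
# The lexicon and input below are invented for illustration.
def max_match(signs: list[str], lexicon: set[tuple[str, ...]], max_len: int = 4):
    """Greedily take the longest dictionary word starting at each position."""
    words, i = [], 0
    while i < len(signs):
        for length in range(min(max_len, len(signs) - i), 0, -1):
            candidate = tuple(signs[i:i + length])
            if length == 1 or candidate in lexicon:
                words.append(candidate)  # single signs are the fallback
                i += length
                break
    return words

lexicon = {("šum", "ma"), ("a", "wi", "lum")}
print(max_match(["šum", "ma", "a", "wi", "lum"], lexicon))
# -> [('šum', 'ma'), ('a', 'wi', 'lum')]
```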
inbooks by Timo Homburg
In this paper we present a new concept of geospatial quality assurance that is currently planned to be implemented at the German Federal Agency for Cartography and Geodesy. Linked open data is enriched with Semantic Web data in order to create thematic maps relevant to the population. We evaluate the quality of such enriched maps using a standardized process and look at the possible impacts of enriching Semantic Web data with open data sets of the Federal Agency for Cartography and Geodesy.

In this paper we present a new way to evaluate geospatial data quality using semantic technologies. In contrast to non-semantic approaches to evaluating data quality, semantic technologies allow us to model situations in which geospatial data may be used and to apply customized geospatial data quality models on a broad scale using reasoning algorithms. We explain how to model data quality in various contexts using common vocabularies and ontologies, apply data quality results using reasoning in a real-world application case with OpenStreetMap as our data source, and highlight our findings using the example of disaster management planning for rescue forces. We contribute to the Semantic Web and OpenStreetMap communities by proposing a semantic framework for combining use-case-dependent data quality assignments, which can be used as reasoning rules and as data quality assurance tools for both communities.
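A toy, non-semantic sketch of the underlying idea of use-case-dependent quality models; the actual framework expresses such rules over RDF with reasoning, and the tag requirements below are invented:

```python
# Hedged sketch: a use-case-dependent completeness check over OSM tags.
# The required-tag lists are invented; a real rescue-forces model would
# be expressed as reasoning rules over RDF rather than Python dicts.
REQUIRED_TAGS = {
    "rescue_planning": {
        "amenity=hospital": ["name", "emergency", "addr:street"],
    },
}

def completeness(use_case: str, feature_class: str, tags: dict) -> float:
    """Fraction of use-case-required tags present on an OSM feature."""
    required = REQUIRED_TAGS[use_case][feature_class]
    present = sum(1 for key in required if key in tags)
    return present / len(required)

tags = {"amenity": "hospital", "name": "St. Vincenz", "emergency": "yes"}
print(completeness("rescue_planning", "amenity=hospital", tags))  # ~0.67
```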
phdtheses by Timo Homburg
miscs by Timo Homburg
This presentation covers recent efforts to transliterate, 3D scan, and annotate cuneiform texts excavated at Haft Tappeh in Iran. The history of the texts and artifacts is described, and new digital methods for annotating texts and registering character variants are introduced to support the creation of linked data results for cuneiform annotation projects.

Presentation Topic and State of the Art: On our poster we present ongoing work to create an automatic natural language processing tool for Hittite cuneiform. Hittite cuneiform texts are to this day manually transcribed by the respective experts and then published in a transliteration format (commonly ATF). Pictures of the original cuneiform tablet may be provided, and more rarely cuneiform representations in Unicode are present. Due to recent advancements in the field (such as Cuneify), an automatic conversion of many Hittite cuneiform transliterations to their respective cuneiform representation is possible.

Research Contributions: We build upon this work by creating tools that aim to automatically translate Hittite cuneiform texts to English from either a Unicode cuneiform representation or their transliteration representation.

POS Tagger: We have created a morphological analyzer to detect nouns, verbs, several kinds of pronouns, their respective declensions and suffixes, as well as structural particles. On a sample set of annotated Hittite texts from different epochs, in cuneiform and transliteration representation, we have evaluated the morphological analyzer, its advantages, problems, and possible solutions, and we intend to present the results as well as some POS tagging examples in section one of our poster.

Dictionary Creation: Dictionaries for Hittite cuneiform often exist in non-machine-readable formats and without a connection to Semantic Web concepts. We intend to change this situation by parsing digitally available non-semantic dictionaries and using matching algorithms to find concepts for the English translations of such dictionaries in the Semantic Web, e.g. DBpedia or Wikidata. Dictionaries of this kind are stored using the Lexicon Model for Ontologies (lemon); a minimal lemon-style entry is sketched after this abstract. In addition to freely available dictionaries, we intend to use expert resources developed by the Academy of Sciences in Mainz, Germany to verify and extend our generated dictionaries. We intend to present the dictionary creation process, statistics about the content of the generated dictionaries, and their impact in section two of our poster.

Machine Translation: Using the newly created dictionaries as well as the POS tagging information, we intend to test several automated machine translation approaches, of which we outline the process and possible approaches in poster section three.

Contributions for the Communities: With our approaches we intend to contribute to the archaeological community in Germany by analysing Hittite cuneiform tablets. Together with work from the University of Heidelberg on image recognition of cuneiform tablets, we want to focus on creating a natural language processing pipeline from scanning cuneiform tablets to an available translation in English.
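A minimal sketch of what a lemon-style dictionary entry could look like in rdflib; the URIs, lemma, and sense link are illustrative placeholders, not the project's actual dictionary content:

```python
# Hedged sketch: a minimal lemon/OntoLex-style entry for a Hittite noun.
# URIs, the lemma, and the sense link are invented placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

ONTOLEX = Namespace("http://www.w3.org/ns/lemon/ontolex#")
EX = Namespace("https://example.org/hittite/")  # placeholder namespace

g = Graph()
g.bind("ontolex", ONTOLEX)

entry = EX["watar"]                      # Hittite 'watar' = water
g.add((entry, RDF.type, ONTOLEX.LexicalEntry))

form = EX["watar#form"]
g.add((entry, ONTOLEX.canonicalForm, form))
g.add((form, ONTOLEX.writtenRep, Literal("wa-a-tar", lang="hit")))

# Link the sense to a Semantic Web concept (here: DBpedia's 'Water').
sense = EX["watar#sense"]
g.add((entry, ONTOLEX.sense, sense))
g.add((sense, ONTOLEX.reference, URIRef("http://dbpedia.org/resource/Water")))

print(g.serialize(format="turtle"))
```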

Introduction and Motivation: Semantic extraction mechanisms (e.g. topic modelling) have been used for many years in the Semantic Web and natural language processing fields, as well as in the digital humanities, as methods for visualizing and automatically categorizing documents. Their use often reveals new aspects of the interpretation of document collections that were not previously apparent. Machine learning algorithms, which can perform a rough classification of texts, are frequently employed in such methods. Paired with text metadata, thematic overviews of documents with a geographic reference can then be displayed automatically on maps in GIS systems, or temporal relationships can be shown using historical gazetteers. In this publication we want to use the possibilities of semantic extraction and apply them to a collection of texts in cuneiform languages.

Cuneiform Languages: Cuneiform languages have received increasing interest in the digital humanities and linguistics communities in recent years (Inglese 2015, Homburg et al. 2016, Homburg 2017, Sukhareva et al. 2017). Alongside the ongoing standardization in Unicode, part-of-speech taggers and automated translation mechanisms, among others, are being tested in order to better capture and interpret cuneiform texts with the computer. Furthermore, the learnability of cuneiform languages has been improved by digital tools such as input methods and flashcard learning programs (Homburg 2015). Despite all the progress achieved, numerous problems remain in the machine processing of cuneiform languages, related among other things to the low availability of annotated resources and the lack of machine-readable, semantically and linguistically annotated dictionaries. These limitations prevent many natural language processing and semantic extraction algorithms from achieving better results. With this publication we want to contribute to improving this situation and present the "Semantic Dictionary for Ancient Languages", an attempt to create a semantic resource in RDF for the optimization of such algorithms for the languages Hittite, Sumerian, and Akkadian, by annotating dictionary resources recognized in the research community with Unicode characters, Semantic Web concepts, etymological data, shared vocabularies, and POS tags. The dictionary is based on the lemon standard, a W3C model that also allows multilingual resources to be represented. In this way, developments of the language and shared vocabularies, such as Akkadograms and Sumerograms in Hittite, can also be captured.

Semantic Dictionary and Semantic Extraction: We test the performance of the dictionary on one of the largest collections of digital cuneiform texts, the CDLI, from which we extract representative texts in Hittite, Sumerian, and Akkadian cuneiform from different epochs and classify and tag them using machine learning.

The result of the semantic extraction is a collection of topics per cuneiform tablet, which in turn can be grouped into super-categories and placed in a temporal, linguistic, dialectal, and spatial context. Based on the various metadata of the CDLI, we were able to create a thematic map of the find spots of the cuneiform tablets and their contents per epoch, from which the relevant scholarly audience can infer which topics were relevant to the scribes of a given epoch, at which time, and at which find spot. As a further development, we want to complement this information with additional metadata, such as jurisdiction, the dates of the respective rulers, and reconstructed ancient places, in order to draw conclusions about interesting historical events.

Poster Structure: On our poster we present the construction process and structure of the semantic dictionary as well as the map resulting from our semantic extraction, in order to invite the respective domain scientists to a discussion about the development of a Semantic Web of cuneiform languages and cuneiform artifacts. Furthermore, our poster demonstrates a number of applications that can be developed on top of our semantic resource in the future, in order to contribute to a hopefully forthcoming linked data dataset of cuneiform artifacts for the documentation of cuneiform.
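A minimal sketch of the kind of topic extraction described above, using gensim's LDA over a toy tokenized corpus standing in for CDLI transliterations:

```python
# Hedged sketch: topic extraction over tokenized transliterations with
# gensim's LDA; the toy corpus below stands in for real CDLI texts.
from gensim import corpora, models

texts = [
    ["še", "gur", "lugal", "mu"],        # toy "administrative" text
    ["dingir", "lugal", "an", "ki"],     # toy "religious" text
    ["še", "gur", "dam", "gar"],         # toy "economic" text
]
dictionary = corpora.Dictionary(texts)
bow_corpus = [dictionary.doc2bow(text) for text in texts]

lda = models.LdaModel(bow_corpus, num_topics=2, id2word=dictionary,
                      random_state=42, passes=10)
for topic_id, words in lda.print_topics():
    print(topic_id, words)   # one weighted word list per extracted topic
```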
Open geospatial data sources like OpenStreetMap are created by a community of mappers with different levels of experience and different equipment available. It is therefore important to assess the quality of OpenStreetMap-like maps in order to give users recommendations about the situations in which a map is suitable for their needs. In this work we use already-defined measures for assessing the quality of geospatial data and apply them as features in various machine learning algorithms to classify which areas are likely to change in future revisions of the map. In a next step we intend to characterize the changes detected by the algorithm and try to find causes for the tracked changes.
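A minimal sketch of the proposed setup, with invented quality measures as features for a scikit-learn classifier:

```python
# Hedged sketch: predicting whether a map area will change, using invented
# quality measures as features for a scikit-learn classifier.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Each row: [tag completeness, feature density, days since last edit]
X = [[0.9, 120, 10], [0.2, 15, 900], [0.7, 80, 30],
     [0.1, 5, 1200], [0.8, 100, 20], [0.3, 20, 700]]
y = [1, 0, 1, 0, 1, 0]   # 1 = area changed in a later map revision

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on the held-out areas
```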
articles by Timo Homburg

GeoSPARQL is an important standard for the geospatial linked data community, given that it defines a vocabulary for representing geospatial data in RDF, defines an extension to SPARQL for processing geospatial data, and provides support for both qualitative and quantitative spatial reasoning. However, the community has been missing a comprehensive and objective way to measure the extent of GeoSPARQL support in GeoSPARQL-enabled RDF triplestores. To fill this gap, we developed the GeoSPARQL compliance benchmark. We propose a series of tests that check the compliance of RDF triplestores with the GeoSPARQL standard, in order to determine how many of the requirements outlined in the standard a tested system supports. This topic is of concern because the support of GeoSPARQL varies greatly between different triplestore implementations, and the extent of support is of great importance to different users. In order to showcase the benchmark and its applicability, we present a comparison of the benchmark results of several triplestores, providing insight into their current GeoSPARQL support and the overall state of GeoSPARQL support in the geospatial linked data domain.
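A sketch of the flavor of check such a compliance test might issue, here probing a GeoSPARQL topology function against a placeholder endpoint for the system under test:

```python
# Hedged sketch: does the triplestore under test evaluate a GeoSPARQL
# topology function correctly? The endpoint URL is a placeholder.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://localhost:3030/ds/sparql")  # placeholder endpoint
endpoint.setQuery("""
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT ?inside WHERE {
  BIND(geof:sfContains(
    "POLYGON((0 0, 0 2, 2 2, 2 0, 0 0))"^^geo:wktLiteral,
    "POINT(1 1)"^^geo:wktLiteral) AS ?inside)
}
""")
endpoint.setReturnFormat(JSON)
result = endpoint.query().convert()["results"]["bindings"][0]
print(result["inside"]["value"])  # a compliant store returns "true"
```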
Checking the compliance of geospatial triplestores with the GeoSPARQL standard represents a crucial step for many users when selecting an appropriate storage solution. This publication presents the software comprising the GeoSPARQL compliance benchmark, a benchmark which checks RDF triplestores for compliance with the requirements of the GeoSPARQL standard. Users can execute this benchmark within the HOBBIT benchmarking platform to quantify the extent to which the GeoSPARQL standard is implemented in a triplestore of interest. This enables users to make an informed decision when choosing an RDF storage solution and helps assess the general state of adoption of geospatial technologies on the Semantic Web.
In this publication we present results of a comparative study of Wikidata and OpenStreetMap (OSM) in the area of Germany, Austria, and Switzerland. We include metadata of OSM and Wikidata, and compare the two datasets on an object-by-object basis and on equivalent properties as defined by the respective communities. Our results give an indication of the tag coverage in the respective countries, which objects are typically associated with a wikidata tag, which mistakes are commonly made when annotating OSM objects with wikidata tags, and the equality and equivalence of the respective Wikidata and OSM objects.
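A minimal sketch of how OSM objects carrying a wikidata tag can be fetched for such a comparison, using the public Overpass API; the bounding box (around Mainz) is an arbitrary example:

```python
# Hedged sketch: fetch a few OSM nodes carrying a `wikidata` tag via the
# Overpass API, the starting point for an object-by-object comparison.
import requests

query = """
[out:json][timeout:25];
node["wikidata"](49.9,8.2,50.1,8.4);
out tags 5;
"""
resp = requests.post("https://overpass-api.de/api/interpreter",
                     data={"data": query})
for element in resp.json()["elements"]:
    tags = element["tags"]
    print(element["id"], tags.get("name"), tags["wikidata"])
```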
Classification of stones (family hierarchy, name description, etc.)
Visualization of relationships (kinship relations, tribal boundaries) on maps generated from Linked Data
Formal capture and machine-readable encoding of Ogham characters following the model of PaleoCodage (Homburg 2019)
As the data basis for our analyses we rely on a Wikidata retro-digitization of the CIIC corpus by Macalister (1945, 1949), EpiDoc data from the Ogham in 3D project, and the Celtic Inscribed Stones Project (CISP2) database, which was kindly made available to us by Dr. Kris Lockyear. Furthermore, we actively add missing and suitable elements to Wikidata in order to later provide the data to the research community in the spirit of the SPARQL Unicorn (Thiery and Trognitz 2019a, 2019b). The source code of our app is openly available on GitHub (Homburg & Thiery 2019).
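A minimal sketch, in the spirit of the SPARQL Unicorn, of retrieving Ogham stones from Wikidata; the class is matched by an assumed English label instead of a hard-coded QID:

```python
# Hedged sketch: query Wikidata for Ogham stones. Matching the class by
# label is an assumption to avoid hard-coding a QID here.
from SPARQLWrapper import SPARQLWrapper, JSON

wdqs = SPARQLWrapper("https://query.wikidata.org/sparql")
wdqs.setQuery("""
SELECT ?stone ?stoneLabel WHERE {
  ?stone wdt:P31 ?class .               # P31: instance of
  ?class rdfs:label "ogham stone"@en .  # assumed class label
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
""")
wdqs.setReturnFormat(JSON)
for row in wdqs.query().convert()["results"]["bindings"]:
    print(row["stoneLabel"]["value"])
```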
The conference will be held online on October 15–17, 2020. More information is available at https://2020.archeofoss.org