Many Web data sources and APIs make their data available in XML, JSON, or a domain-specific semi-structured format, with the goal of making the data easily accessible and usable by Web application developers. Although such data formats are more machine-processable than pure text documents, managing and analyzing such data at large scale is often nontrivial. This is mainly due to the lack of a well-defined (or understood) structure and clear semantics in such data formats, which can result in poor data quality. In the xCurator project, we add structure to such data with the goal of publishing it on the Web as Linked Data. We enhance the quality of such data by: extracting entities, their types, and their relationships to other entities; performing entity (and entity type) identification; merging duplicate entities (and entity types); linking related entities (internally and to external sources); and publishing the results on the Web as high-quality Linked Data. All of this is done in a lightweight, easy-to-use, and scalable framework that effectively incorporates user feedback in all phases. We describe the initial framework of our system and report the results of using it to manage large volumes of (user-generated) data on the Web in several real-world applications.
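As an illustration only (this is not xCurator's actual code), the following minimal sketch shows the kind of duplicate-entity merging step the abstract mentions; the record layout and the exact-match-on-normalized-name rule are assumptions made for the example.

```python
# Illustrative sketch of a duplicate-entity merging step (not xCurator's implementation).
# Entities are plain dicts; two entities are treated as duplicates when their
# normalized names match exactly, a deliberately simple stand-in for real
# entity identification logic.

def normalize(name: str) -> str:
    """Lower-case and collapse whitespace for crude matching."""
    return " ".join(name.lower().split())

def merge_duplicates(entities):
    """Group entities by normalized name and merge their types and links."""
    merged = {}
    for e in entities:
        key = normalize(e["name"])
        target = merged.setdefault(key, {"name": e["name"], "types": set(), "links": set()})
        target["types"].update(e.get("types", []))
        target["links"].update(e.get("links", []))
    return list(merged.values())

if __name__ == "__main__":
    sample = [
        {"name": "Royal Ontario Museum", "types": ["Museum"], "links": ["http://example.org/rom"]},
        {"name": "royal  ontario museum", "types": ["Organization"], "links": []},
    ]
    print(merge_duplicates(sample))  # the two records collapse into one entity
```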
2019
Linked Data (LD) is a set of best practices for publishing data in RDF format. Structured datasets can be transformed into RDF datasets by means of RDF mappings. Defining such mappings requires familiarity with LD practices and an intimate knowledge of the datasets concerned. An obstacle to the democratisation of LD is that few people satisfy both conditions. We believe that tools that simplify the LD integration process will foster the growth of LD. In this demonstration, we present a chatbot-like tool that can semi-automatically generate RDF mappings for existing structured datasets. The challenge is to automate the part of the integration process that requires familiarity with RDF.
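To make the idea of an RDF mapping concrete, here is a minimal sketch (using rdflib) of applying a column-to-property mapping to tabular rows; the column names, base URI, and choice of FOAF properties are invented for the example, and a tool like the one demonstrated would generate the mapping rather than hard-code it.

```python
# Minimal sketch: apply a column-to-property mapping to tabular rows and emit RDF.
# The columns, base URI and vocabulary are placeholders chosen for illustration.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/")
MAPPING = {"name": FOAF.name, "nick": FOAF.nick}   # column -> RDF property

rows = [{"id": "alice", "name": "Alice", "nick": "ally"}]

g = Graph()
for row in rows:
    subject = EX[row["id"]]                 # mint a subject URI from the row key
    g.add((subject, RDF.type, FOAF.Person))
    for column, prop in MAPPING.items():
        g.add((subject, prop, Literal(row[column])))

print(g.serialize(format="turtle"))
```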
Journal of Next …, 2010
Extracting information from Web data sources has become very important because of the massive and growing amount of diverse semi-structured information sources available to users on the Internet, and the variety of Web pages makes information extraction from the Web a challenging problem. This paper proposes a framework for extracting, classifying, analyzing, and presenting semi-structured Web data sources. The framework is able to extract relevant information from different Web data sources and classify the extracted information according to the standard classification scheme of Nokia products, which has been chosen as the case study.
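As a rough illustration of the extract-then-classify pattern described above (not the paper's actual rules), the sketch below pulls product names out of a semi-structured HTML listing and buckets them against a fixed classification scheme; the HTML layout and the category keywords are assumptions made for the example.

```python
# Illustrative sketch: extract product names from a semi-structured HTML list
# and classify them against a fixed scheme. Layout and keywords are invented.
from html.parser import HTMLParser

CATEGORIES = {"smartphone": ["n95", "n97"], "basic phone": ["1100", "3310"]}

class ProductExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_item = False
        self.products = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "product") in attrs:
            self.in_item = True

    def handle_data(self, data):
        if self.in_item and data.strip():
            self.products.append(data.strip())
            self.in_item = False

def classify(name):
    lowered = name.lower()
    for category, keywords in CATEGORIES.items():
        if any(k in lowered for k in keywords):
            return category
    return "unclassified"

html = '<ul><li class="product">Nokia N95</li><li class="product">Nokia 3310</li></ul>'
parser = ProductExtractor()
parser.feed(html)
print([(name, classify(name)) for name in parser.products])
```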
Advanced Information …, 2010
We present the Entity Name System (ENS), an enabling infrastructure that can host descriptions of named entities and provide unique identifiers at large scale. In this way, it opens new perspectives for realizing entity-oriented, rather than keyword-oriented, Web information systems. We describe the architecture and functionality of the ENS, along with the tools that together help realize the Web of entities.
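The core behaviour such a service provides is "look up or mint": given an entity description, return an existing identifier if one matches, otherwise create a new one. The toy sketch below illustrates only that contract; the exact-name matching rule and URN scheme are simplifications, not the ENS's actual design.

```python
# Toy sketch of lookup-or-mint behaviour for entity identifiers.
# Matching on normalized name and the urn:ens: scheme are illustrative only.
import uuid

class EntityNameService:
    def __init__(self):
        self.by_name = {}          # normalized name -> identifier

    def resolve(self, description: dict) -> str:
        key = description["name"].strip().lower()
        if key not in self.by_name:
            self.by_name[key] = f"urn:ens:{uuid.uuid4()}"
        return self.by_name[key]

ens = EntityNameService()
a = ens.resolve({"name": "Berlin"})
b = ens.resolve({"name": "berlin "})
assert a == b                      # the same entity receives the same identifier
print(a)
```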
Semantic Services, Interoperability and Web Applications
The term “Linked Data” refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions - the Web of Data. In this article, the authors present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. They describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
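The basic interaction these principles enable is dereferencing an HTTP URI and getting RDF back, then following links into other datasets. A small sketch of that interaction, assuming network access and that DBpedia is reachable:

```python
# Sketch of the basic Linked Data interaction: dereference an HTTP URI (content
# negotiation returns RDF) and follow owl:sameAs links into other datasets.
# Requires network access; DBpedia is used purely as a well-known example.
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

berlin = URIRef("http://dbpedia.org/resource/Berlin")
g = Graph()
g.parse(str(berlin))                      # rdflib negotiates an RDF representation

for _, _, other in g.triples((berlin, OWL.sameAs, None)):
    print(other)                          # descriptions of the same city elsewhere
```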
2004
SIMILE is a joint project between MIT Libraries, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), HP Labs and the World Wide Web Consortium (W3C). It is investigating the application of Semantic Web tools, such as the Resource Description Framework (RDF), to the problem of dealing with heterogeneous metadata. This report describes how XML and RDF tools are used to perform data conversion, extraction and record linkage on some sample datasets featuring visual images (ARTstor) and learning objects (OpenCourseWare) in the first SIMILE proof of concept demo.
2010
With respect to large-scale, static, Linked Data corpora, in this paper we discuss scalable and distributed methods for: (i) entity consolidation (identifying entities which signify the same referent, also known as smushing, entity resolution, or object consolidation) using explicit owl:sameAs relations; (ii) extended entity consolidation based on a subset of OWL 2 RL/RDF rules, particularly over inverse-functional properties, functional properties, and (max-)cardinality restrictions with value one; (iii) deriving weighted concurrence measures between entities in the corpus based on shared inlinks/outlinks and attribute values using statistical analyses; (iv) disambiguating (initially) consolidated entities based on inconsistency detection using OWL 2 RL/RDF rules. Our methods are based upon distributed sorts and scans of the corpus, where we purposefully avoid the requirement for indexing all data. Throughout, we offer evaluation over a diverse Linked Data corpus consisting of 1.118 billion quadruples derived from a domain-agnostic, open crawl of 3.985 million RDF/XML Web documents, demonstrating the feasibility of our methods at that scale and giving insights into the fecundity of the approach and the quality of the results.
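The first step, consolidation over explicit owl:sameAs statements, amounts to computing equivalence classes over the sameAs graph. The in-memory union-find sketch below illustrates only that idea; the paper's actual methods operate via distributed sorts and scans over billions of quads.

```python
# In-memory sketch of owl:sameAs consolidation: union-find over sameAs edges
# to produce equivalence classes of identifiers. Illustrative only; the paper's
# methods are distributed and avoid indexing all data.

parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]     # path halving
        x = parent[x]
    return x

def union(a, b):
    ra, rb = find(a), find(b)
    if ra != rb:
        parent[rb] = ra

# (subject, object) pairs from owl:sameAs statements, e.g. parsed from quads.
same_as = [
    ("http://dbpedia.org/resource/Berlin", "http://example.org/Berlin"),
    ("http://example.org/Berlin", "http://other.example.org/berlin"),
]
for s, o in same_as:
    union(s, o)

# Group each identifier under a canonical representative.
clusters = {}
for node in parent:
    clusters.setdefault(find(node), []).append(node)
print(clusters)
```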
The constantly growing number of Linked Open Data (LOD) datasets creates a need for rich metadata descriptions that enable users to discover, understand, and process the available data. This metadata is often created, maintained, and stored in diverse data repositories featuring disparate data models that are often unable to provide the metadata necessary to automatically process the datasets described. This paper proposes DataID, a best practice for LOD dataset descriptions that utilizes RDF files hosted together with the datasets, under the same domain. We describe the data model, which is based on the widely used DCAT and VoID vocabularies, as well as supporting tools for creating and publishing DataIDs, and use cases that show the benefits of providing semantically rich metadata for complex datasets. As a proof of concept, we generated a DataID for the DBpedia dataset, which we present in the paper.
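The sketch below shows the flavour of dataset description the abstract refers to, expressed with rdflib using the DCAT and VoID vocabularies it builds on; the dataset URI, title, and statistics are placeholders, not the real DBpedia DataID.

```python
# Sketch of a DCAT/VoID-style dataset description; all values are placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF, XSD

DCAT = Namespace("http://www.w3.org/ns/dcat#")
VOID = Namespace("http://rdfs.org/ns/void#")

dataset = URIRef("http://example.org/dataid/mydataset")
g = Graph()
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Example dataset")))
g.add((dataset, DCTERMS.license, URIRef("http://creativecommons.org/licenses/by/4.0/")))
g.add((dataset, VOID.triples, Literal(1000000, datatype=XSD.integer)))
g.add((dataset, VOID.sparqlEndpoint, URIRef("http://example.org/sparql")))

print(g.serialize(format="turtle"))
```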
International Journal of Web Engineering and Technology, 2008
Data warehousing and Online Analytical Processing (OLAP) technologies are now moving toward handling complex data that mostly originate from the Web. However, integrating such data into a decision-support process requires their representation in a form processable by OLAP and/or data mining techniques. In this paper we present a complex data warehousing methodology that exploits the eXtensible Markup Language (XML) as a pivot language. Our approach includes the integration of complex data into an ODS, in the form of XML ...
2003
The Resource Description Framework (RDF) is designed to support agent communication on the Web, but it is also suitable as a framework for modeling and storing personal information. Haystack is a personalized information repository that employs RDF in this manner. This flexible semistructured data model is appealing for several reasons. First, RDF supports ontologies created by the user and tailored to the user's needs. At the same time, system ontologies can be specified and evolved to support a variety of high-level functionalities such as flexible organization schemes, semantic querying, and collaboration. In addition, we show that RDF can be used to engineer a component architecture that gives rise to a semantically rich and uniform user interface. We demonstrate that by aggregating various types of users' data together in a homogeneous representation, we create opportunities for agents to make more informed deductions in automating tasks for users. Finally, we discuss the implementation of an RDF information store and a programming language specifically suited for manipulating RDF.
Internet Computing, IEEE, 2009
Editor: Munindar P. Singh • singh@ncsu.edu; Shengru Tu • shengru@cs.uno.edu
2014
Data Integration is the problem of combining data in various data sources and providing a user with a unified view over these sources. Building an automatic data integration system that can process large, semi-structured data sources has emerged as an important problem. An automated data integration system requires automatic population of an Entity Name System (ENS). An ENS is a thesaurus for entities and is used to serve instance matching needs across data sources. Resource Description Framework (RDF) is a graph-based data model used to publish data on the Web as linked data. To build and populate an ENS over linked data, the fundamental problem of data matching needs to be solved. Traditionally, data matching concerned identifying pairs of logically equivalent entities across one or more structurally homogeneous data sources, and required a human in the loop. Additionally, most systems run on serial architectures. These assumptions cannot be expected to hold for linked data. Given...
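The instance-matching task mentioned above is commonly approached by blocking records on a cheap key and comparing only within blocks using a string similarity. The sketch below illustrates that generic pattern; the blocking key, similarity measure, and threshold are assumptions for the example, not the thesis's actual pipeline.

```python
# Generic instance-matching sketch: block on a cheap key, compare within blocks,
# emit candidate matches above a threshold. All parameters are illustrative.
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

def block_key(record):
    return record["name"][:1].lower()          # crude blocking on first letter

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

records = [
    {"id": "src1:1", "name": "Berners-Lee, Tim"},
    {"id": "src2:9", "name": "Berners-Lee Tim"},
    {"id": "src1:2", "name": "Alan Turing"},
]

blocks = defaultdict(list)
for r in records:
    blocks[block_key(r)].append(r)

matches = []
for block in blocks.values():
    for a, b in combinations(block, 2):
        if similarity(a["name"], b["name"]) > 0.6:
            matches.append((a["id"], b["id"]))
print(matches)
```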
Data and Knowledge Engineering (DKE), 2007
XML has evolved as the new standard for the representation and exchange of both structured and semistructured data. XML's ability to succinctly describe complex information can also be used for specifying application metadata. XML's popularity is evident from its use in a wide spectrum of application domains: from document publication to computational chemistry, health care and life sciences, multimedia, and e-commerce. The increasing popularity of Web-based business and the emergence of Web services that use XML-based descriptions in WSDL and exchange XML messages with SOAP have led to further acceptance of XML. The purpose of the second XML Data and Schema Management Workshop was to provide a forum for the exchange of ideas and experiences among the theoreticians and practitioners involved in the design, management, and implementation of XML data management systems. It was held on 3 April 2005 in conjunction with the 21st IEEE International Conference on Data Engineering (ICDE 2005). The workshop featured a full-day program of 12 regular papers divided into three sessions, and many interesting discussions arose among the presenters and the audience on almost every XML data management topic. This special issue of Data and Knowledge Engineering features four papers selected from the XSDM 2005 workshop based on their merits and relevance. Each of these papers is an extended and revised version of the original workshop paper and has gone through a rigorous reviewing process before being accepted for inclusion in this special issue. In the first paper, entitled "QMatch - A Hybrid Match Algorithm for XML Schemas", Tansalarak and Claypool propose a new hybrid schema match algorithm, QMatch, that provides a unique path-based framework for harnessing traditional structural and semantic information, while exploiting the constraints inherent in XML documents, such as the order of XML elements, to provide improved levels of matching between two given XML schemata. QMatch is based on the measurement of a unique quality-of-match metric, QoM, and a set of classifiers. The authors experimentally demonstrate the benefits of QMatch over existing algorithms such as Cupid. Efficient processing of XML queries is a key area of research in XML data management, and the next two papers thus focus on efficient XML query evaluation. Chen et al., in their paper titled "Index Structures for Matching XML Twigs Using Relational Query Processors", address some of the limitations of existing XML path indices. They present a framework defining a family of index structures that includes most existing XML path indices, propose two novel index structures with different space-time tradeoffs that are effective for the evaluation of XML twig queries with value conditions, and show how this family of index structures can be tightly integrated with a relational query processor. The next paper, entitled "Optimization of Nested XQuery Expressions with Orderby Clauses" by Wang, Rundensteiner, and Mani, presents a technique for XQuery optimization. They propose an algebraic rewriting technique for nested XQuery expressions containing explicit ORDERBY clauses. Their technique is based on
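For readers unfamiliar with the terminology, a "twig query with a value condition" is a small branching path pattern over an XML tree, such as /catalog/book[author='Smith']/title. The snippet below only illustrates that query shape, evaluated naively with Python's ElementTree; the indexed evaluation strategies are what the cited papers are about.

```python
# Example of a twig pattern with a value condition, evaluated naively with
# ElementTree's limited XPath support; shown only to illustrate the query shape.
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<catalog>
  <book><title>Knowledge Graphs</title><author>Smith</author><price>30</price></book>
  <book><title>XML Indexing</title><author>Jones</author><price>45</price></book>
</catalog>
""")

# Twig: /catalog/book[author='Smith']/title
for title in doc.findall("./book[author='Smith']/title"):
    print(title.text)
```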
Fifth Computer Science and Engineering …
Synthesis Lectures on the Semantic Web: Theory and Technology, 2011
This book gives an overview of the principles of Linked Data as well as the Web of Data that has emerged through the application of these principles. The book discusses patterns for publishing Linked Data, describes deployed Linked Data applications and examines their architecture.
A ‘Semantic Web’ Using Linked Data for day-to-day data transfer, 2009
The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions - the Web of Data. In this article we present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. We describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
A fundamental prerequisite of the Semantic Web is the existence of large amounts of meaningfully interlinked RDF data on the Web. The W3C SWEO community project Linking Open Data has made various open datasets available on the Web as RDF and developed automated mechanisms to interlink them with RDF statements. Collectively, the datasets currently consist of over one billion triples. We believe that large-scale interlinking will demonstrate the value of the Semantic Web compared to more centralized approaches such as Google Base. This paper outlines the work to date and describes the accompanying demonstration. A functioning Semantic Web is predicated on the availability of large amounts of data as RDF, not in isolated islands but as a Web of interlinked datasets. To date this prerequisite has not been widely met, leading to criticism of the broader endeavour and hindering the progress of developers wishing to build Semantic Web applications. Thanks to the Open Data movement, a va...
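The interlinking step described above ultimately produces explicit RDF statements (typically owl:sameAs) connecting resources across datasets. A minimal sketch of emitting such links, with the matched pairs hard-coded and the target URIs invented for illustration:

```python
# Sketch of publishing interlinks: given matched resource pairs, emit explicit
# owl:sameAs statements. Pairs are hard-coded here; real tools derive them.
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

matched_pairs = [
    ("http://dbpedia.org/resource/Berlin", "http://example.org/geo/Berlin"),
]

g = Graph()
for local, remote in matched_pairs:
    g.add((URIRef(local), OWL.sameAs, URIRef(remote)))

print(g.serialize(format="nt"))
```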
The term linked data is entering common vocabulary and, as interests us most here, the specific terminology of library and information science. The concept is complex; we can summarize it as the set of best practices required for publishing and connecting structured data on the web for use by machines. It is an expression used to describe a method of exposing, sharing, and connecting data via Uniform Resource Identifiers (URIs) on the web. With linked data, in other words, we refer to data published on the web in a format that is readable, interpretable and, above all, usable by machines, whose meaning is explicitly defined by a string of words and markers. In this way we constitute a network of linked data (hence the name) belonging to a domain (which constitutes the initial context), connected in turn to other, external data sets (that is, those outside the domain), in a context of increasingly extended relationships. The article then presents the Linked Open Data (LOD) cloud, which collects the open data sets available on the web, and whose exponential growth over a very brief period of time demonstrates the level of interest that linked data has garnered among organizations and institutions of different types.
2001
XML is increasingly being adopted for information publishing on the World Wide Web. However, the underlying data is often stored in relational databases, so some mechanism is needed to convert the relational data into XML data. In this work, we employ a semantically rich semistructured data model, the Object-Relationship-Attribute model for semistructured data, as middleware to support the conversion from a semantically enriched relational schema to an XML Schema. This approach allows us to handle the translation of a set of related relations and to distinguish attributes of relationship types from attributes of object classes, multivalued attributes, and different types of relationships such as binary, n-ary, recursive, and ISA. The resulting XML structures are able to reflect the inherent semantics and implicit structure in the underlying relational database. We also show that the appropriate use of references can avoid unnecessary redundancy and the proliferation of disconnected XML elements.
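The basic direction of such a conversion is to turn foreign-key relationships into element nesting so the XML reflects the object-relationship structure. The sketch below illustrates only that idea with invented table layouts; it is not the ORA-SS based translation the paper describes.

```python
# Sketch: nest rows of a dependent relation under their parent object, turning
# a one-to-many foreign key into XML element nesting. Tables are invented.
import xml.etree.ElementTree as ET

departments = [{"dept_id": 1, "name": "Sales"}]
employees = [
    {"emp_id": 10, "dept_id": 1, "name": "Alice"},
    {"emp_id": 11, "dept_id": 1, "name": "Bob"},
]

root = ET.Element("company")
for d in departments:
    dept_el = ET.SubElement(root, "department", id=str(d["dept_id"]))
    ET.SubElement(dept_el, "name").text = d["name"]
    # the one-to-many relationship becomes nesting rather than a foreign key
    for e in employees:
        if e["dept_id"] == d["dept_id"]:
            emp_el = ET.SubElement(dept_el, "employee", id=str(e["emp_id"]))
            ET.SubElement(emp_el, "name").text = e["name"]

print(ET.tostring(root, encoding="unicode"))
```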
2008
Nowadays, many users rely on Web search engines to find and gather information, and they face an increasing amount of diverse semi-structured information sources. The issue of correlating, integrating, and presenting related information to users therefore becomes important. When a user employs a search engine such as Yahoo or Google to seek specific information, the results include not only pages containing the desired information but also other pages on which that information is merely mentioned, and the number of returned pages is enormous. Therefore, the performance capabilities, the overlap among results for the same queries, and the limitations of Web search engines are an important and large area of research. Extracting information from Web data sources also becomes very important because of the massive and growing amount of diverse semi-structured information sources available to users on the Internet, and the variety of Web pages makes information extraction from the Web a challenging problem. This paper proposes a framework for extracting, classifying, and browsing semi-structured Web data sources. The framework is able to extract relevant information from different Web data sources and classify the extracted information based on the standard classification of Nokia products.