Proceedings of the Workshop on Human-In-the-Loop Data Analytics
The importance of data cleaning systems has grown continuously in recent years. Especially for real-time streaming applications, it is crucial to identify and possibly remove anomalies in the data on the fly, before further processing. The main challenge, however, lies in the construction of an appropriate data cleaning pipeline, which is complicated by the dynamic nature of streaming applications. To simplify this process and help data scientists explore and understand the incoming data, we propose an interactive data cleaning system for streaming applications. In this paper, we list requirements for such a system and present our implementation to overcome the stated issues. Our demonstration shows how a data cleaning pipeline can be interactively created, executed, and monitored at runtime. We also present several different tools, such as the automated advisor and the adaptive visualizer, that engage the user in the data cleaning process and help them understand the behavior of the pipeline. CCS Concepts: • Information systems → Data cleaning.
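As a minimal illustration of the kind of on-the-fly anomaly handling this abstract describes (not the authors' system; the window size and threshold below are assumptions), a streaming cleaning step might flag outliers with a rolling z-score before passing data on:

```python
# Minimal sketch, not the demonstrated system: flag outliers on the fly with a
# rolling z-score so they can be dropped or routed for review before processing.
from collections import deque
import math

class StreamingOutlierFilter:
    def __init__(self, window=100, z_threshold=3.0):  # assumed parameters
        self.window = deque(maxlen=window)
        self.z_threshold = z_threshold

    def process(self, value):
        """Return (value, is_anomaly) for one incoming reading."""
        if len(self.window) >= 10:  # wait for some history before judging
            mean = sum(self.window) / len(self.window)
            var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = math.sqrt(var)
            is_anomaly = std > 0 and abs(value - mean) / std > self.z_threshold
        else:
            is_anomaly = False
        self.window.append(value)
        return value, is_anomaly

# Usage: feed readings one at a time, as a stream would deliver them.
f = StreamingOutlierFilter()
for reading in [10, 11, 9, 10, 12, 11, 10, 9, 11, 10, 10, 95, 10]:
    value, anomaly = f.process(reading)
    if anomaly:
        print("flagged for review:", value)
```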
2010
Data cleaning and Extract-Transform-Load processes are usually modeled as graphs of data transformations. These graphs typically involve a large number of data transformations, and must handle large amounts of data. The involvement of the users responsible for executing the corresponding programs over real data is important to tune data transformations and to manually correct data items that cannot be treated automatically.
Lecture Notes in Computer Science, 2011
Data cleaning and ETL processes are usually modeled as graphs of data transformations. The involvement of the users responsible for executing these graphs over real data is important to tune data transformations and to manually correct data items that cannot be treated automatically. In this paper, in order to better support user involvement in data cleaning processes, we equip a data cleaning graph with data quality constraints, which help users identify the points of the graph and the records that need their attention, and with manual data repairs, which represent the way users can provide the feedback required to manually clean some data items. We provide preliminary experimental results that show the significant gains obtained with the use of data cleaning graphs.
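A rough sketch of the constraint idea described above (a hypothetical API of my own, not the paper's data cleaning graphs): quality constraints are attached to points of the graph, and records that violate them are set aside for manual repair.

```python
# Illustrative only: attach quality constraints to nodes of a cleaning graph and
# collect the records that violate them so users can repair those items by hand.
def non_empty_name(record):
    return bool(record.get("name", "").strip())

def valid_year(record):
    return isinstance(record.get("year"), int) and 1900 <= record["year"] <= 2030

cleaning_graph = {
    "normalize": {"constraints": [non_empty_name]},
    "enrich": {"constraints": [valid_year]},
}

def run_node(node_name, records):
    """Apply the node's constraints; return (clean, needs_manual_repair)."""
    constraints = cleaning_graph[node_name]["constraints"]
    clean, for_repair = [], []
    for r in records:
        (clean if all(c(r) for c in constraints) else for_repair).append(r)
    return clean, for_repair

records = [{"name": "Ada", "year": 1998}, {"name": "", "year": 1998}]
clean, for_repair = run_node("normalize", records)
print("needs manual repair:", for_repair)
```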
2001
The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. However, for some applications, existing ETL (Extraction Transformation Loading) and data cleaning tools for writing data cleaning programs are insufficient. One important challenge with them is the design of a data flow graph that effectively generates clean data. A more general difficulty is the lack of explanations of cleaning results and of user interaction facilities for tuning a data cleaning program. This paper presents a solution to this problem that enables users to express user interactions declaratively and to tune data cleaning programs.
ArXiv, 2017
Data cleaning refers to the process of detecting and fixing errors in the data. Human involvement is instrumental at several stages of this process, e.g., to identify and repair errors or to validate computed repairs. There is currently a plethora of data cleaning algorithms addressing a wide range of data errors (e.g., detecting duplicates, violations of integrity constraints, and missing values). Many of these algorithms involve a human in the loop; however, the human is usually tightly coupled to the underlying cleaning algorithm. There is currently no end-to-end data cleaning framework that systematically involves humans in the cleaning pipeline regardless of the underlying cleaning algorithms. In this paper, we highlight key challenges that need to be addressed to realize such a framework. We present a design vision and discuss scenarios that motivate the need for such a framework to judiciously assist humans in the cleaning process. Finally, we present directions to implement ...
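One minimal way to picture such an algorithm-agnostic framework (my own sketch with assumed names, not the design proposed in the paper) is a shared queue into which any cleaning tool submits proposed repairs for a human to review:

```python
# Hypothetical sketch: human involvement decoupled from specific cleaning tools.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ProposedRepair:
    source_tool: str   # e.g. a deduplicator or a constraint repairer
    description: str
    approved: bool = False

@dataclass
class HumanReviewQueue:
    items: List[ProposedRepair] = field(default_factory=list)

    def submit(self, repair: ProposedRepair):
        self.items.append(repair)

    def review(self, decide):
        """`decide` stands in for a human judging each proposed repair."""
        for repair in self.items:
            repair.approved = decide(repair)

queue = HumanReviewQueue()
queue.submit(ProposedRepair("dedup-tool", "merge customer 17 and 42"))
queue.submit(ProposedRepair("fd-repair", "set city='Doha' where zip='00974'"))
queue.review(lambda r: r.source_tool == "dedup-tool")  # stand-in for real feedback
print([(r.description, r.approved) for r in queue.items])
```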
ITEGAM- Journal of Engineering and Technology for Industrial Applications (ITEGAM-JETIA), 2020
One of the great challenges in obtaining knowledge from data sources is to ensure consistency and non-duplication of the stored information. Many techniques have been proposed to minimize the cost of this work and to allow data to be analyzed and properly corrected. However, there are still other aspects essential to the success of the data cleaning process that span several technological areas: performance, semantics, and autonomy of the process. Against this backdrop, we developed an automated, configurable data cleaning environment based on training and physical-semantic data similarity, aiming to provide a more efficient and extensible tool for correcting information that covers problems not yet explored, such as the semantics and autonomy of the cleaning process. Among its objectives, the developed work aims to reduce user interaction in the process of analyzing and correcting data inconsistencies and duplications. With a properly calibrated environment, the efficiency is significant, covering approximately 90% of the inconsistencies in the database with no false-positive cases. We also demonstrate approaches that, besides detecting and treating information inconsistencies and duplication of positive cases, address detected false positives and the negative impacts they may have on the data cleaning process, whether manual or automated, which is not yet widely discussed in the literature. The most significant contribution of this work is the developed tool that, without user interaction, is automatically able to analyze and eliminate 90% of the inconsistencies and duplications of information contained in a database, with no occurrence of false positives. The results of the tests proved the effectiveness of all the developed features, relevant to each module of the proposed architecture. In several scenarios the experiments demonstrated the effectiveness of the tool.
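For context, a toy version of similarity-based duplicate detection (my illustration, not the paper's environment; the threshold is an assumed value, and tuning it is exactly where false positives and misses trade off):

```python
# Rough sketch: pairs whose string similarity exceeds a threshold are treated as
# possible duplicates; the threshold governs the false-positive rate.
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

records = ["Acme Corp.", "ACME Corporation", "Globex Ltd", "Acme Corp"]
THRESHOLD = 0.8  # assumed value; calibration is what an automated tool must do

for a, b in combinations(records, 2):
    score = similarity(a, b)
    if score >= THRESHOLD:
        print(f"possible duplicate: {a!r} ~ {b!r} ({score:.2f})")
```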
2021
In this paper, we study the explainability of automated data cleaning pipelines and propose CLeanEX, a solution that can generate explanations for the pipelines automatically selected by an automated cleaning system, given that it can provide its corresponding cleaning pipeline search space. We propose meaningful explanatory features that are used to describe the pipelines and to generate predicate-based explanation rules. We compute quality indicators for these explanations and propose a multi-objective optimization algorithm to select the optimal set of explanations for user-defined objectives. Preliminary experiments show the need for multi-objective optimization for the generation of high-quality explanations that can be either intrinsic to the single selected cleaning pipeline or relative to the other pipelines that were not selected by the automated cleaning system. We also show that CLeanEX is a promising step towards automatically generating insightful explanations, while catering t...
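To make the multi-objective selection step concrete (a toy sketch of the general idea, not CLeanEX itself; the quality indicators and values are invented), candidate explanations can be filtered down to the Pareto-optimal set rather than ranked by a single score:

```python
# Toy multi-objective selection: keep explanations not dominated on any indicator.
candidates = [
    {"rule": "drop rows with null zip",       "coverage": 0.9, "conciseness": 0.4},
    {"rule": "dedup runs before value repair","coverage": 0.6, "conciseness": 0.9},
    {"rule": "outliers removed at step 3",    "coverage": 0.5, "conciseness": 0.5},
]

def dominates(a, b, objectives=("coverage", "conciseness")):
    return (all(a[o] >= b[o] for o in objectives)
            and any(a[o] > b[o] for o in objectives))

pareto = [c for c in candidates
          if not any(dominates(other, c) for other in candidates if other is not c)]
print([c["rule"] for c in pareto])
```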
2010
It is very difficult to over-emphasize the benefits of accurate data. Errors in data are generally the most expensive aspect of data entry, costing users far more than the original data entry itself. Unfortunately, these costs are intangible or difficult to measure. If errors are detected at an early stage, removing them requires little cost. Incorrect and misleading data lead to all sorts of unpleasant and unnecessary expenses. Unfortunately, it is very expensive to correct errors after the data has been processed, particularly when the processed data has been converted into knowledge for decision making. No doubt a stitch in time saves nine, i.e., a timely effort prevents more work at a later stage. Moreover, the time spent processing errors can also have a significant cost. One of the major problems with automated data entry systems is errors. In this paper we discuss many well-known techniques to minimize errors, different cleansing approaches ...
Proceedings of the Workshop on Human-In-the-Loop Data Analytics - HILDA'19
Data cleaning refers to the process of detecting and fixing errors in the data. Human involvement is instrumental at several stages of this process, such as providing rules or validating computed repairs. There is a plethora of data cleaning algorithms addressing a wide range of data errors (e.g., detecting duplicates, violations of integrity constraints, and missing values). Many of these algorithms involve a human in the loop; however, the human is usually tightly coupled to the underlying cleaning algorithm. In a real data cleaning pipeline, several data cleaning operations are performed using different tools. High-level reasoning about these tools, when they are combined to repair the data, has the potential to unlock useful use cases for involving humans in the cleaning process. Additionally, we believe there is an opportunity to benefit from recent advances in active learning methods to minimize the effort humans have to spend verifying data items produced by tools or humans. There is currently no end-to-end data cleaning framework that systematically involves humans in the cleaning pipeline regardless of the underlying cleaning algorithms. In this paper, we present opportunities that such a framework could offer and highlight key challenges that need to be addressed to realize this vision. We present a design vision and discuss scenarios that motivate the need for this framework to judiciously assist humans in the cleaning process.
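The active-learning idea mentioned above can be pictured roughly as follows (an illustrative sketch with assumed confidence scores, not the paper's method): repairs proposed by cleaning tools are ranked by confidence, and humans are asked to verify only the least certain ones:

```python
# Hedged sketch: spend the limited human budget on the least-confident repairs.
proposed_repairs = [
    {"row": 12, "fix": "zip 0213 -> 02139", "confidence": 0.55},
    {"row": 47, "fix": "dedup with row 46", "confidence": 0.97},
    {"row": 88, "fix": "country UK -> GB",  "confidence": 0.72},
]

BUDGET = 2  # how many items a human will verify in this round (assumed)
to_verify = sorted(proposed_repairs, key=lambda r: r["confidence"])[:BUDGET]
auto_applied = [r for r in proposed_repairs if r not in to_verify]

print("ask human about:", [r["row"] for r in to_verify])
print("apply automatically:", [r["row"] for r in auto_applied])
```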
Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, 2014
Data cleaning techniques usually rely on some quality rules to identify violating tuples, and then fix these violations using some repair algorithms. Oftentimes, the rules, which are related to the business logic, can only be defined on some target report generated by transformations over multiple data sources. This creates a situation where the violations detected in the report are decoupled in space and time from the actual source of errors. In addition, applying the repair on the report would need to be repeated whenever the data sources change. Finally, even if repairing the report is possible and affordable, this would be of little help towards identifying and analyzing the actual sources of errors for future prevention of violations at the target. In this paper, we propose a system to address this decoupling. The system takes quality rules defined over the output of a transformation and computes explanations of the errors seen on the output. This is performed both at the target level to describe these errors and at the source level to prescribe actions to solve them. We present scalable techniques to detect, propagate, and explain errors. We also study the effectiveness and efficiency of our techniques using the TPC-H Benchmark for different scenarios and classes of quality rules.
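A much-simplified sketch of this target-to-source decoupling (my illustration, not the paper's system): a rule is checked on a report produced by a transformation, and violations are traced back through recorded lineage to the source rows that caused them:

```python
# Illustrative only: check a quality rule on the target report, then use lineage
# to point at the source rows that contributed to the violation.
orders = [{"id": 1, "cust": "A", "amount": 120},
          {"id": 2, "cust": "A", "amount": -300}]

# Transformation: total per customer, recording which source rows fed each row.
report = {}
for o in orders:
    entry = report.setdefault(o["cust"], {"total": 0, "lineage": []})
    entry["total"] += o["amount"]
    entry["lineage"].append(o["id"])

# Quality rule defined on the target report: per-customer totals must be >= 0.
for cust, entry in report.items():
    if entry["total"] < 0:
        suspects = [o for o in orders
                    if o["id"] in entry["lineage"] and o["amount"] < 0]
        print(f"violation for {cust}: total={entry['total']}, source suspects={suspects}")
```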
International Journal of Computer Applications, 2013
Data cleansing (or data scrubbing) is an activity involving a process of detecting and correcting the errors and inconsistencies in a data warehouse. Poor-quality data, i.e., dirty data present in a data mart, can thus be avoided using various data cleaning strategies, leading to more accurate and hence more reliable decision making. Quality data can only be produced by cleaning the data and pre-processing it prior to loading it into the data warehouse.
International Journal of Knowledge-Based Organizations, 2011
The quality of real-world data being fed into a data warehouse is a major concern today. As the data comes from a variety of sources, it must be checked for errors and anomalies before being loaded into the data warehouse. There may be exact duplicate records or approximate duplicate records in the source data. The presence of incorrect or inconsistent data can significantly distort the results of analyses, often negating the potential benefits of information-driven approaches. This paper addresses issues related to the detection and correction of such duplicate records. It also analyzes data quality and the various factors that degrade it. A brief analysis of existing work is given, pointing out its major limitations, and a new framework is proposed that improves on the existing technique.
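As a small illustration of the exact vs. approximate distinction drawn above (my example, not the proposed framework): exact duplicates collapse to the same normalized key, while approximate duplicates need a fuzzier comparison such as the similarity measure sketched earlier:

```python
# Exact duplicates are caught by a normalization key; approximate ones are not.
def key(record):
    return (record["name"].strip().lower(), record["city"].strip().lower())

rows = [
    {"name": "John Smith ", "city": "Pune"},
    {"name": "john smith",  "city": "pune"},   # exact duplicate after normalization
    {"name": "Jon Smith",   "city": "Pune"},   # approximate duplicate: needs similarity
]

seen, exact_dupes = {}, []
for r in rows:
    k = key(r)
    if k in seen:
        exact_dupes.append((seen[k], r))
    else:
        seen[k] = r
print("exact duplicates:", exact_dupes)
```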
2005
This chapter analyzes the problem of data cleansing and the identification of potential errors in data sets. The differing views of data cleansing are surveyed and reviewed and a brief overview of existing data cleansing tools is given. A general framework of the data cleansing process is presented as well as a set of general methods that can be used to address the problem. The applicable methods include statistical outlier detection, pattern matching, clustering, and Data Mining techniques.
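Of the general methods listed above, pattern matching is perhaps the simplest to picture (illustrative patterns only; real ones depend on the data set): records whose fields do not match an expected format are flagged as potential errors:

```python
# Pattern-matching sketch: flag fields that fail an expected-format check.
import re

PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "year":  re.compile(r"^(19|20)\d{2}$"),
}

def check(record):
    """Return the names of fields whose values look malformed."""
    return [f for f, pat in PATTERNS.items()
            if f in record and not pat.match(str(record[f]))]

print(check({"email": "alice@example.org", "year": "2005"}))  # []
print(check({"email": "not-an-email", "year": "20005"}))      # ['email', 'year']
```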
Data assessment and data cleaning tasks have traditionally been addressed through procedural solutions. Most of the time, those solutions have been applicable to specific problems and domains. In the last few years we have seen the emergence of more generic solutions, and also of declarative and rule-based specifications of the intended solutions of data cleaning processes. In this chapter we review some of those recent developments.
AI and Data Science, 2024
Data cleansing is an important prerequisite for reliable data analysis and machine learning applications, directly impacting the quality of insights and model performance. This paper provides a comprehensive examination of modern data cleaning methodologies, focusing on their practical applications and effectiveness across diverse datasets. We analyze six primary categories of data cleaning techniques: missing data handling, outlier detection, data standardization, duplicate removal, consistency validation, and data type transformations. Our systematic assessment shows that automated data cleaning pipelines, while efficient, require careful configuration based on domain context. Key findings indicate that hybrid approaches, combining statistical techniques with domain-specific rules, achieve superior results compared to standalone strategies, showing a 23% improvement in data quality metrics. We also find that early-stage data validation significantly reduces downstream processing errors by 45%. The implications of this research suggest that organizations should implement iterative data cleaning workflows, incorporating continuous validation and domain expert feedback. Furthermore, our findings emphasize the importance of documenting cleaning decisions to ensure reproducibility and maintain data lineage. This work offers a structured framework for practitioners to select and implement suitable data cleaning techniques based on their particular use cases and data characteristics.
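One way to picture the hybrid approach described above (a hedged sketch with assumed bounds and data, not the paper's pipeline): a statistical default such as median imputation, combined with a domain-specific rule that overrides implausible values:

```python
# Hybrid sketch: statistical imputation plus a domain rule for plausibility.
import statistics

rows = [{"age": 34}, {"age": None}, {"age": 29}, {"age": 431}]  # 431 is implausible

ages = [r["age"] for r in rows if r["age"] is not None]
median_age = statistics.median(ages)

for r in rows:
    if r["age"] is None:
        r["age"] = median_age          # statistical technique: median imputation
    if not (0 <= r["age"] <= 120):     # domain-specific rule (assumed bounds)
        r["age"] = None                # route back for review instead of trusting it
print(rows)
```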
In this information era there is a huge availability of data, but raw data alone is not enough to meet users' requirements. This creates an urgent need for data cleaning, and data cleaning solutions become highly important for data mining users. Normally, data cleaning deals with detecting and eliminating errors and inconsistencies in large data sets. For any real-world data set, doing this task manually is very cumbersome, as it involves a huge amount of human resources and time. As a result, many organizations spend millions of dollars per year to detect data errors. Due to the wide range of possible data inconsistencies and the sheer data volume, data cleaning is considered one of the biggest problems in data warehousing. Data cleaning is typically required when multiple data sources need to be integrated. In this research work, an Enhanced Common Data Cleaning (ECDC) framework has been developed and proposed.
We classify data quality problems that are addressed by data cleaning and provide an overview of the main solution approaches. Data cleaning is especially required when integrating heterogeneous data sources and should be addressed together with schema-related data transformations. In data warehouses, data cleaning is a major part of the so-called ETL process. We also discuss current tool support for data cleaning.
IJCA Proceedings on National Conference on Role of Engineers in National Building, 2014
A data warehouse contains a large volume of data. Data quality is an important issue in data warehousing projects, since many business decision processes are based on the data entered into the data warehouse. Hence, improving data quality is necessary for accurate data. Data may include text errors, quantitative errors, or even duplication. There are several ways to remove such errors and inconsistencies from the data. Data cleaning is a process of detecting and correcting inaccurate data. Different types of algorithms, such as the Improved PNRS algorithm, a Quantitative algorithm, and a Transitive algorithm, are used for the data cleaning process. In this paper an attempt has been made to clean the data in the data warehouse by combining different approaches to data cleaning: text data is cleaned with the Improved PNRS algorithm, quantitative data is cleaned with special rules (the enhanced technique), and, lastly, duplication in the data is removed with a transitive closure algorithm. By applying these algorithms one after another to the data sets, the accuracy level of the dataset is increased.
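To illustrate the transitive-closure step for duplicate removal (a generic union-find sketch, not the paper's Transitive algorithm; the pairwise matches are assumed to come from some matcher): if record A matches B and B matches C, all three are grouped as one entity:

```python
# Transitive closure of pairwise matches via union-find groups duplicate records.
def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

def union(parent, a, b):
    parent[find(parent, a)] = find(parent, b)

records = ["IBM", "I.B.M.", "International Business Machines", "Oracle"]
matches = [(0, 1), (1, 2)]  # assumed output of some pairwise matcher

parent = list(range(len(records)))
for a, b in matches:
    union(parent, a, b)

groups = {}
for i, r in enumerate(records):
    groups.setdefault(find(parent, i), []).append(r)
print(list(groups.values()))
```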
Vicious Print Media, Nairobi, Kenya, 2020
Over time and with technological development, the concept of data science, its processes, aims, and implementation use cases change. According to an article by 365 Data Science, about two decades ago data science mainly involved gathering and cleaning data sets and then applying computational methods to those results. Over time, however, data science has developed into an area that encompasses data processing, statistical modeling, data mining, artificial intelligence, deep learning, and more (365 Data Science, 2018). I.B.M. describes data science as the "technique and understanding of information mining, planning, visualization, interpretation, and maintenance. It is the art of extracting information and observations from organized and unstructured data utilizing algorithms, techniques, and structures" (I.B.M., 2019, n.p.). It then utilizes automation and machine learning to support consumers in forecasting, optimizing performance, and enhancing processes and decision-making. Data is a business requirement for every enterprise, and hence data science has a wide range of applications. Within this paper, we address some of the essential uses of data science and examine how it influences today's industries. Finally, we investigate the various methods of data processing, cleaning, and modeling used to make data meaningful.