2020, Proceedings of the VLDB Endowment
Why and why-not provenance have been studied extensively in recent years. However, why-not provenance and, to a lesser degree, why provenance can be very large, resulting in severe scalability and usability challenges. We introduce a novel approximate summarization technique for provenance to address these challenges. Our approach uses patterns to encode why and why-not provenance concisely. We develop techniques for efficiently computing provenance summaries that balance informativeness, conciseness, and completeness. To achieve scalability, we integrate sampling techniques into provenance capture and summarization. Our approach is the first to both scale to large datasets and generate comprehensive and meaningful summaries.
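To make the pattern idea concrete, here is a minimal sketch, not the paper's implementation: the wildcard-pattern encoding, the scoring heuristic, the sampling strategy, and the toy data are all assumptions, but they illustrate how patterns can summarize a set of why-provenance tuples with coverage estimated on a sample.

```python
# Minimal sketch (not the paper's algorithm): summarize why-provenance tuples
# with wildcard patterns, estimating pattern coverage on a uniform sample.
import random
from itertools import combinations

def matches(pattern, tup):
    """A pattern is a tuple in which None acts as a wildcard."""
    return all(p is None or p == v for p, v in zip(pattern, tup))

def candidate_patterns(tup):
    """All generalizations of a tuple obtained by wildcarding attribute subsets."""
    n = len(tup)
    for r in range(n + 1):
        for idx in combinations(range(n), r):
            yield tuple(None if i in idx else v for i, v in enumerate(tup))

def summarize(provenance, sample_size=1000, top_k=3):
    """Score patterns by approximate coverage minus a crude penalty for vagueness."""
    sample = random.sample(provenance, min(sample_size, len(provenance)))
    scores = {}
    for tup in sample:
        for pat in candidate_patterns(tup):
            cover = sum(matches(pat, t) for t in sample) / len(sample)
            vague = sum(p is None for p in pat) / len(pat)
            scores[pat] = cover - 0.5 * vague   # informativeness/conciseness trade-off
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy why-provenance: (author, venue, year) tuples that justify a query answer.
prov = [("alice", "VLDB", 2019), ("alice", "VLDB", 2020), ("bob", "VLDB", 2020)]
print(summarize(prov, top_k=2))
```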
2015
As users are confronted with a deluge of provenance data, dedicated techniques are required to make sense of this kind of information. We present Aggregation by Provenance Types, a provenance graph analysis that is capable of generating provenance graph summaries. It proceeds by converting provenance paths up to some length k to attributes, referred to as provenance types, and by grouping nodes that have the same provenance types. The summary also includes numeric values representing the frequency of nodes and edges in the original graph. A quantitative evaluation and a complexity analysis show that this technique is tractable; with small values of k, it can produce useful summaries and can help detect outliers. We illustrate how the generated summaries can further be used for conformance checking and visualization.
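A rough sketch of the underlying grouping step, assuming a simple dictionary-based provenance graph; the helper names and the toy graph are illustrative, not the authors' code.

```python
# Hedged sketch of the idea behind aggregation by provenance types: a node's
# type is built from the labelled paths of length <= k leading into it, and
# nodes sharing a type collapse into one summary node annotated with a count.
from collections import defaultdict

def provenance_type(node, edges, labels, k):
    """edges: node -> list of predecessor nodes; labels: node -> label string."""
    if k == 0:
        return (labels[node],)
    sub = tuple(sorted(provenance_type(p, edges, labels, k - 1)
                       for p in edges.get(node, [])))
    return (labels[node], sub)

def summarize(nodes, edges, labels, k=2):
    groups = defaultdict(list)
    for n in nodes:
        groups[provenance_type(n, edges, labels, k)].append(n)
    # each group becomes one summary node annotated with its frequency
    return {t: len(ns) for t, ns in groups.items()}

# toy provenance graph: two activities, each using the same kind of input entity
labels = {"a1": "activity", "a2": "activity", "e1": "entity", "e2": "entity"}
edges = {"a1": ["e1"], "a2": ["e2"]}          # "used" edges
print(summarize(labels.keys(), edges, labels, k=1))
# both activities share a provenance type, so they collapse into one summary node
```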
Lecture Notes in Computer Science, 2012
As Open Data becomes commonplace, methods are needed to integrate disparate data from a variety of sources. Although Linked Data design has promise for integrating worldwide data, integrators often struggle to provide appropriate transparency for their sources and transformations. Without this transparency, cautious consumers are unlikely to find enough information to allow them to trust third-party content. While capturing provenance in RPI's Linking Open Government Data project, we were faced with the common problem that only a portion of provenance that is captured is effectively used. Using our water quality portal's use case as an example, we argue that one key to enabling provenance use is a better treatment of provenance granularity. To address this challenge, we have designed an approach that supports deriving abstracted provenance from granular provenance in an open environment. We describe the approach, show how it addresses the naturally occurring unmet provenance needs in a family of applications, and describe how the approach addresses similar problems in open provenance and open data environments.
Proceedings of the 2019 International Conference on Management of Data, 2019
Provenance and intervention-based techniques have been used to explain surprisingly high or low outcomes of aggregation queries. However, such techniques may miss interesting explanations emerging from data that is not in the provenance. For instance, an unusually low number of publications of a prolific researcher in a certain venue and year can be explained by an increased number of publications in another venue in the same year. We present a novel approach for explaining outliers in aggregation queries through counterbalancing. That is, explanations are outliers in the opposite direction of the outlier of interest. Outliers are defined w.r.t. patterns that hold over the data in aggregate. We present efficient methods for mining such aggregate regression patterns (ARPs), discuss how to use ARPs to generate and rank explanations, and experimentally demonstrate the efficiency and effectiveness of our approach.
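The counterbalancing intuition can be illustrated with a toy sketch; it uses simple per-venue z-scores as a crude stand-in for the paper's aggregate regression patterns, and the data and threshold below are made up.

```python
# Illustrative sketch only (not the ARP mining algorithm from the paper): flag
# yearly counts that deviate strongly from a venue's mean, then look for a
# counterbalancing high outlier in another venue for the same year.
import statistics

def outliers(counts, z=1.5):
    """counts: {year: value}. Return years whose z-score magnitude is >= z."""
    mean = statistics.mean(counts.values())
    sd = statistics.pstdev(counts.values()) or 1.0
    return {y: (v - mean) / sd for y, v in counts.items() if abs(v - mean) / sd >= z}

# publications of one author, split by venue
venue_a = {2016: 5, 2017: 6, 2018: 1, 2019: 5}   # surprisingly low in 2018
venue_b = {2016: 2, 2017: 2, 2018: 7, 2019: 2}   # surprisingly high in 2018

low = {y: s for y, s in outliers(venue_a).items() if s < 0}
for year in low:
    counter = {y: s for y, s in outliers(venue_b).items() if s > 0 and y == year}
    if counter:
        print(f"{year}: low count in venue A counterbalanced by high count in venue B")
```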
2008
ZOOM*UserViews presents a model of provenance for scientific workflows that is simple, generic, and yet sufficiently expressive to answer questions of data and step provenance that have been encountered in a large variety of scientific case studies. In addition, ZOOM builds on the concept of composite step-classes, or sub-workflows, which is present in many scientific workflow systems, to develop a notion of user views. This paper discusses the design and implementation of ZOOM in the context of the queries posed by the provenance challenge, and shows how user views affect the level of granularity at which provenance information can be seen and reasoned about. Keywords: user views; multiple levels of granularity for provenance; scientific workflows.
Concurrency and Computation: Practice and Experience, 2008
Workflows and data pipelines are becoming increasingly valuable to computational and experimental sciences. These automated systems are capable of generating significantly more data within the same amount of time compared to their manual counterparts. Automatically capturing and recording data provenance and annotation as part of these workflows is critical for data management, verification, and dissemination. We have been prototyping a workflow provenance system, targeted at biological workflows, that extends our content management technologies and other open source tools. We applied this prototype to the provenance challenge to demonstrate an end-to-end system that supports dynamic provenance capture, persistent content management, and dynamic searches of both provenance and metadata. We describe our prototype, which extends the Kepler system for the execution environment, the Scientific Annotation Middleware (SAM) content management software for data services, and an existing HTTP-based query protocol. Our implementation offers several unique capabilities, and through the use of standards, is able to provide access to the provenance record with a variety of commonly available client tools.
Concurrency and Computation: Practice and Experience, 2008
VisTrails is a new workflow and provenance management system that provides support for scientific data exploration and visualization. Whereas workflows have been traditionally used to automate repetitive tasks, for applications that are exploratory in nature, change is the norm. VisTrails uses a new change-based provenance mechanism which was designed to handle rapidly-evolving workflows. It uniformly and automatically captures provenance information for data products and for the evolution of the workflows used to generate these products. In this paper, we describe how the VisTrails provenance data is organized in layers and present a first approach for querying this data that we developed to tackle the Provenance Challenge queries.
Lecture Notes in Computer Science, 2006
This paper presents Provenance Explorer, a secure provenance visualization tool, designed to dynamically generate customized views of scientific data provenance that depend on the viewer's requirements and/or access privileges. Using RDF and graph visualizations, it enables scientists to view the data, states and events associated with a scientific workflow in order to understand the scientific methodology and validate the results. Initially the Provenance Explorer presents a simple, coarse-grained view of the scientific process or experiment. However the GUI allows permitted users to expand links between nodes (input states, events and output states) to reveal more fine-grained information about particular sub-events and their inputs and outputs. Access control is implemented using Shibboleth to identify and authenticate users and XACML to define access control policies. The system also provides a platform for publishing scientific results. It enables users to select particular nodes within the visualized workflow and drag-and-drop them into an RDF package for publication or e-learning. The direct relationships between the individual components selected for such packages are inferred by the rule-inference engine.
The first Provenance Challenge was set up in order to provide a forum for the community to understand the capabilities of different provenance systems and the expressiveness of their provenance representations. To this end, a Functional Magnetic Resonance Imaging workflow was defined, which participants had to either simulate or run in order to produce some provenance representation, from which a set of identified queries had to be implemented and executed. Sixteen teams responded to the challenge, and submitted their inputs. In this paper, we present the challenge workflow and queries, and summarise the participants' contributions.
Concurrency and Computation: Practice and Experience, 2008
Provenance-aware storage systems (PASS) are a new class of storage system treating provenance as a first-class object, providing automatic collection, storage, and management of provenance as well as query capabilities. We developed the first PASS prototype between 2005 and 2006, targeting scientific end users. Prior to undertaking the provenance challenge, we had focused on provenance collection and storage, without much emphasis on a query model or language. The challenge forced us to (quickly) develop a query model and infrastructure implementing this model. We present a brief overview of the PASS prototype and a discussion of the evolution of the query model that we developed for the challenge.
Lecture Notes in Computer Science, 2012
As interest in provenance grows among the Semantic Web community, it is recognized as a useful tool across many domains. However, existing automatic provenance collection techniques are not universally applicable. Most existing methods either rely on (low-level) observed provenance, or require that the user discloses formal workflows. In this paper, we propose a new approach for automatic discovery of provenance, at multiple levels of granularity. To accomplish this, we detect entity derivations, relying on clustering algorithms, linked data and semantic similarity. The resulting derivations are structured in compliance with the Provenance Data Model (PROV-DM). While the proposed approach is purposely kept general, allowing adaptation in many use cases, we provide an implementation for one of these use cases, namely discovering the sources of news articles. With this implementation, we were able to detect 73% of the original sources of 410 news stories, at 68% precision. Lastly, we discuss possible improvements and future work.
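As a rough illustration of the general idea, not the authors' pipeline, one could link an article to its likely source via TF-IDF similarity and emit a PROV-style derivation statement for strong matches; the identifiers, threshold, and toy texts below are assumptions.

```python
# Hedged sketch: guess which earlier document an article was derived from using
# text similarity, and emit a PROV-style wasDerivedFrom statement for matches.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sources = {
    "wire:001": "Storm causes flooding across the coastal region, thousands evacuated.",
    "wire:002": "Parliament passes the new data protection bill after long debate.",
}
article = ("Thousands were evacuated after a storm caused severe flooding "
           "in the coastal region, officials said.")

vec = TfidfVectorizer().fit(list(sources.values()) + [article])
src_matrix = vec.transform(sources.values())
sims = cosine_similarity(vec.transform([article]), src_matrix)[0]

THRESHOLD = 0.3   # assumed cut-off; in practice it would be tuned empirically
for (src_id, _), score in zip(sources.items(), sims):
    if score >= THRESHOLD:
        print(f"prov:wasDerivedFrom(article:42, {src_id})  # similarity={score:.2f}")
```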
2013
Systems that gather fine-grained provenance metadata must process and store large amounts of information. Filtering this metadata as it is collected has a number of benefits, including reducing the amount of persistent storage required and simplifying subsequent provenance queries. However, writing these filters in a procedural language is verbose and error prone. We propose a simple declarative language for processing provenance metadata and evaluate it by translating filters implemented in SPADE [9], an open-source provenance collection platform.
2009
Most application provenance systems are hard-coded for a particular type of system or data, while current provenance file systems maintain in-memory provenance graphs and reside in kernel space, leading to complex and constrained implementations. Story Book resides in user space and treats provenance events as a generic event log, leading to a simple, flexible, and easily optimized system.
Lecture Notes in Computer Science, 2016
Provenance generated by different workflow systems is generally expressed using different formats. This is not an issue when scientists analyze provenance graphs in isolation, or when they use the same workflow system. However, analyzing heterogeneous provenance graphs from multiple systems poses a challenge. To address this problem we adopt ProvONE as an integration model, and show how different provenance databases can be converted to a global ProvONE schema. Scientists can then query this integrated database, exploring and linking provenance across several different workflows that may represent different implementations of the same experiment. To illustrate the feasibility of our approach, we developed conceptual mappings between the provenance databases of two workflow systems (e-Science Central and SciCumulus). We provide cartridges that implement these mappings and generate an integrated provenance database expressed as Prolog facts. To demonstrate its usage, we have developed Prolog rules that enable scientists to query the integrated database.
2013
Data provenance is a form of metadata that records the activities involved in data production. It can be used to help data consumers to form judgments regarding data reliability. The PROV data model, released by the W3C in 2013, defines a relational model and constraints which provide a structural and semantic foundation for provenance. This enables the exchange of provenance between data producers and consumers. When the provenance content is sensitive and subject to disclosure restrictions, however, a complementary model is needed to enable producers to partially obfuscate provenance in a principled way. In this paper we propose such a formal model. It is embodied by a grouping operator, whereby a set of nodes in a PROV-compliant provenance graph is replaced by a new abstract node, leading to a new valid PROV graph. We define graph editing rules which allow existing dependencies to be removed, but guarantee that no spurious dependencies are introduced in the abstracted graph. As ...
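A naive version of such a grouping operator is easy to sketch with networkx; note that this toy version only redirects edges that cross the group boundary and does not implement the paper's editing rules that rule out spurious dependencies.

```python
# Naive sketch of a grouping operator: replace a set of provenance nodes by one
# abstract node, redirecting boundary-crossing edges. The paper's editing rules
# additionally guarantee no spurious dependencies; this toy version does not.
import networkx as nx

def group_nodes(g, nodes, abstract_id):
    nodes = set(nodes)
    h = g.copy()
    h.add_node(abstract_id)
    for u, v in g.edges():
        if u in nodes and v not in nodes:
            h.add_edge(abstract_id, v)
        elif u not in nodes and v in nodes:
            h.add_edge(u, abstract_id)
    h.remove_nodes_from(nodes)
    return h

# toy PROV graph: e2 wasGeneratedBy a1, a1 used e1
g = nx.DiGraph([("e2", "a1"), ("a1", "e1")])
abstracted = group_nodes(g, {"a1"}, "abs1")
print(list(abstracted.edges()))   # [('e2', 'abs1'), ('abs1', 'e1')]
```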
2009
Provenance capture as applied to execution-oriented and interactive workflows is designed to record the minute detail needed to support a "modify and restart" paradigm as well as re-execution of past workflows. In our experience, provenance also plays an important role in human-centered verification, results tracking, and knowledge sharing. However, the amount of information recorded by provenance capture mechanisms generally obfuscates the conceptual view of events. There is a need for a flexible means to create and dynamically control user-oriented views over the detailed provenance record. In this paper, we present a design which leverages named graphs and extensions to the SPARQL query language to create and manage views as a server-side function, simplifying user presentation of provenance data.
2015
The problem of answering Why-Not questions consists in explaining why the result of a query does not contain some expected data, i.e., missing answers. To solve this problem, we resort to identifying where in the query the data relevant to the missing answer was lost. Existing algorithms producing such query-based explanations rely on a query tree representation, potentially leading to different or partial explanations. This significantly impairs the effectiveness of the computed explanations. Here we present an effective, query-tree-independent representation of query-based explanations, for a wide class of Why-Not questions, based on provenance polynomials. We further describe an algorithm that efficiently computes the complete set of these explanations. An experimental evaluation validates our claims.
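For readers unfamiliar with provenance polynomials (the semiring annotations, in the style of Green et al., that such explanations build on), here is a small worked example: each input tuple gets a variable, joint use multiplies, and alternative derivations add. The relations and query below are illustrative only, not taken from the paper.

```python
# Background example of provenance polynomials: annotate each input tuple with
# a variable; a join multiplies annotations, alternative derivations add them.
from sympy import symbols

r1, r2, s1 = symbols("r1 r2 s1")
R = {("a", "b"): r1, ("a", "c"): r2}      # annotated relation R(x, y)
S = {("b", "d"): s1}                      # annotated relation S(y, z)

# Q(x, z) :- R(x, y), S(y, z)   -- join on y, then project onto (x, z)
Q = {}
for (x, y), ann_r in R.items():
    for (y2, z), ann_s in S.items():
        if y == y2:
            Q[(x, z)] = Q.get((x, z), 0) + ann_r * ann_s

print(Q)   # {('a', 'd'): r1*s1} -- the polynomial records which inputs were used
```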
International Journal of Digital Curation, 2012
Experimental science can be thought of as the exploration of a large research space, in search of a few valuable results. While it is this "Golden Data" that gets published, the history of the exploration is often as valuable to the scientists as some of its outcomes. We envision an e-research infrastructure that is capable of systematically and automatically recording such history, an assumption that holds today for a number of workflow management systems routinely used in e-science. In keeping with our gold rush metaphor, the provenance of a valuable result is a "Golden Trail". Logically, this represents a detailed account of how the Golden Data was arrived at, and technically it is a sub-graph in the much larger graph of provenance traces that collectively tell the story of the entire research (or of some of it).
ArXiv, 2017
Explaining why an answer is in the result of a query or why it is missing from the result is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. Both types of questions, i.e., why and why-not provenance, have been studied extensively. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Our approach is based on a rewriting of Datalog rules (called firing rules) that captures successful rule derivations within the context of a Datalog query. We extend this rewriting to support negation and to capture failed derivations that explain missing answers. Given a (why or why-not) provenance question, we compute an explanation, i.e., the part of the provenance that is relevant to answer the question. We introduce optimizations that prune parts of a provenance graph early on if we can determine that they will not be part of the explanation...
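The why versus why-not distinction can be illustrated with a toy sketch that enumerates successful and failed derivations of a single rule with negation; this conveys only the intuition, not the paper's firing-rule rewriting, and the rule and facts are made up.

```python
# Toy illustration: for the rule q(X) :- hop(X, Y), not blocked(Y), enumerate
# successful derivations (why-provenance) and failed ones (why-not provenance).
hop = {("a", "b"), ("c", "d")}
blocked = {("d",)}

def explain(x):
    successes, failures = [], []
    for (u, y) in hop:
        if u != x:
            continue
        if (y,) not in blocked:
            successes.append(f"hop({x},{y}) AND not blocked({y})")
        else:
            failures.append(f"hop({x},{y}) holds but blocked({y}) also holds")
    if not any(u == x for (u, _) in hop):
        failures.append(f"no hop({x}, _) fact exists")
    return successes, failures

print(explain("a"))   # q(a) holds: one successful derivation explains why
print(explain("c"))   # q(c) is missing: the only candidate derivation fails
```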