Data Quality Dimensions

Internet of Things—Applications and Future, 2020

Abstract

A data quality dimension is a term for a quality measure that applies to data elements at many levels, including attribute, record, table, and system, or to more abstract groupings such as business unit, company, or product range. This paper presents a thorough analysis of three data quality dimensions: completeness, relevance, and duplication. It also covers the commonly used techniques for each dimension. For completeness, predictive value imputation, distribution-based imputation, KNN, and other methods are investigated. The relevance dimension is explored via filter and wrapper approaches, rough set theory, hybrid feature selection, and other techniques. Duplication is investigated through techniques such as K-medoids, the standard duplicate elimination algorithm, online record matching, and sorted blocks.

Menna Ibrahim Gabr
