DI/gEPL – Languages Specification and Processing Group

2008

Abstract

In today’s society, exploring one or more databases to extract information or knowledge that supports management is a critical success factor for an organization. However, it is well known that several problems can affect data quality. These problems have a negative effect on the results extracted from data, compromising their correctness and validity. In this context, it is important to understand these data problems both theoretically and in practice. This paper presents a taxonomy of data quality problems, derived from real-world databases. The taxonomy organizes the problems at different levels of abstraction. For each abstraction level, methods to detect data quality problems are also proposed, represented as binary trees. The paper also compares this taxonomy with others already proposed in the literature.