Data Quality (Computer Science)
Recent papers in Data Quality (Computer Science)
Data quality is crucial in all information systems. As a key step in obtaining clean data, record linkage or entity resolution (ER) groups database records by the underlying real-world entities. In this paper we give practical... more
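The snippet above frames entity resolution as grouping database records by the real-world entity they refer to. A minimal, purely illustrative sketch of that idea follows; the normalized-name grouping key is an assumption for the example, not the matching approach the paper proposes.

```python
from collections import defaultdict

def normalize(name: str) -> str:
    """Crude grouping key (illustrative assumption): lowercase and keep
    only alphanumeric characters, so 'ACME Corp.' == 'acme corp'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def resolve(records):
    """Group records that share the same normalized name, i.e. are
    assumed to describe the same real-world entity."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[normalize(rec["name"])].append(rec)
    return list(clusters.values())

records = [
    {"id": 1, "name": "ACME Corp."},
    {"id": 2, "name": "acme corp"},
    {"id": 3, "name": "Widget Ltd"},
]
print(resolve(records))  # -> two clusters: records 1+2, and record 3
```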
OpenStreetMap (OSM) is a free map of the world which can be edited by volunteers worldwide. Existing studies have shown that the completeness of OSM road data in some developing countries (e.g. China) is much lower, resulting in concern in... more
Victims of road traffic accidents face severe health problems on-site or after the event when they arrive at hospital late in their emergency cycle. Road traffic accidents have negative effects on the physical, social and emotional... more
International Conference on Data Science and Applications (DSA 2020) will act as a major forum for the presentation of innovative ideas, approaches, developments, and research projects in the areas of Data Science and Applications. It... more
Companies' desire to make productive discoveries from big data has motivated academic institutions to offer a variety of data science (DS) programs in order to increase their graduates' ability to be data scientists who are... more
Training Report on Machine Learning
This paper is an Editorial for the Special Issue titled “OpenStreetMap as a multidisciplinary nexus: perspectives, practices and procedures”. The Special Issue is largely based on the talks presented in the 2019 and 2020 editions of the... more
In a swiftly transforming global business environment, organizations are expected to make accurate and appropriate decisions by taking advantage of business intelligence analytics. The ability to recognize challenges, spot opportunities,... more
Data quality (DQ) assessment and improvement in larger information systems would often not be feasible without using suitable "DQ methods", which are algorithms that can be automatically executed by computer systems to detect and/or... more
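Since this snippet defines "DQ methods" as algorithms a system can execute automatically to detect (and possibly correct) defects, here is a minimal sketch of one such rule-based check. The email-completeness rule and field names are assumptions for illustration, not methods taken from the paper.

```python
import re

# Hypothetical DQ method: flag rows whose email field is missing or malformed.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def check_email_completeness(rows, field="email"):
    """Return (defective_rows, completeness_ratio) for one illustrative DQ rule."""
    defects = [r for r in rows if not r.get(field) or not EMAIL_RE.match(r[field])]
    ratio = 1 - len(defects) / len(rows) if rows else 1.0
    return defects, ratio

rows = [{"email": "a@b.org"}, {"email": "broken"}, {"email": None}]
print(check_email_completeness(rows))  # two defects, completeness 1/3
```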
Data is the most valuable asset companies are proud of. When its quality degrades, the consequences are unpredictable and can lead to completely wrong insights. In a Big Data context, evaluating data quality is challenging and must be... more
Today's largest and fastest-growing companies' assets are no longer physical, but rather digital (software, algorithms...). This is all the more true in manufacturing, and particularly in the maintenance sector, where the quality of... more
As the world catches up with various technological developments, blockchain technology is slowly catching Malaysians' attention. In Malaysia, this technology is still in its infancy and has yet to gain popularity. This research is... more
The question of data reliability is of primary importance in assessing the quality of manually annotated corpora. Although Cohen's κ is the prevailing reliability measure used in NLP, alternative statistics have been proposed. This paper... more
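Because the snippet names Cohen's κ as the prevailing inter-annotator reliability measure, a short worked computation may help. The two annotator label lists below are invented for illustration; only the standard formula κ = (p_o − p_e) / (1 − p_e) is assumed.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators: (p_o - p_e) / (1 - p_e)."""
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n      # observed agreement
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)  # chance agreement
    return (p_o - p_e) / (1 - p_e)

a = ["pos", "pos", "neg", "neg", "pos", "neg"]
b = ["pos", "neg", "neg", "neg", "pos", "pos"]
print(round(cohens_kappa(a, b), 3))  # observed 0.667, chance 0.5 -> kappa 0.333
```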
[Table fragment preserved from the original entry: data integration architectures (DIS, DW & MIS, VMS, CIS, RS, P2P) classified by autonomy (no / semi / totally) and heterogeneity (no / yes).]
OpenStreetMap (OSM) is currently an important source for building data, despite the existence of potential quality issues. Previous studies have assessed OSM data quality by comparing it with reference building data, which may not... more
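The snippet describes assessing OSM building data quality by comparison with reference building data. Below is a very rough, count-based completeness sketch under that framing; the per-grid-cell counting approach and the cell names are assumptions for illustration, not the paper's method.

```python
def completeness(osm_counts, ref_counts):
    """Per-cell completeness: OSM building count divided by the reference
    count, capped at 1.0. Grid cells and counts are illustrative."""
    return {
        cell: min(osm_counts.get(cell, 0) / ref, 1.0)
        for cell, ref in ref_counts.items() if ref > 0
    }

osm = {"cell_a": 120, "cell_b": 40}
ref = {"cell_a": 150, "cell_b": 90, "cell_c": 30}
print(completeness(osm, ref))  # cell_a 0.8, cell_b ~0.44, cell_c 0.0
```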
Data quality draws major concern when dealing with data, especially when insightful outputs are needed. Research in data quality has emerged across various topics, and diversification in known knowledge and the approaches used is... more
In this paper we have discussed the problems of data quality which are addressed during the data cleaning phase. Data cleaning is one of the important processes in ETL. Data cleaning is especially required when integrating heterogeneous... more
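As this snippet frames data cleaning as an ETL step needed when integrating heterogeneous sources, a minimal sketch of the kind of normalization and deduplication meant is shown below. The field names and cleaning rules are illustrative assumptions, not the paper's procedure.

```python
def clean(rows):
    """Illustrative cleaning pass: trim whitespace, unify date separators,
    and drop exact duplicates after normalization."""
    seen, cleaned = set(), []
    for row in rows:
        row = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        row["date"] = row["date"].replace("/", "-")   # unify date format
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned

rows = [
    {"name": " Alice ", "date": "2021/01/05"},
    {"name": "Alice", "date": "2021-01-05"},
]
print(clean(rows))  # one row remains after normalization + dedup
```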
Laser scanning is a promising geometric data collection tool for construction, facility, and infrastructure management due to its fast sampling rate (tens of thousands of measurements per second) and millimeter-level accuracy. However,... more
Ontologies are important knowledge representation and sharing tools in the workflow of biology research. The high cost of creating and maintaining ontologies encourages their sharing and reuse, and an increasing number of ontologies have... more
Erhebung der aktuellen Situation zur Stammdatenqualität in Unternehmen und daraus abgeleitete Trends [Survey of the current situation of master data quality in companies and the trends derived from it]. Thomas Schäffer, Helmut Beckmann, 2014 | Paperback, colour | 82 pp., German. ISBN 978-3-95663-008-8 (Print), ISBN 978-3-95663-009-5 (E-Book)... more
This paper reflects on six years developing semantic data quality tools and curation systems for both large-scale social sciences data collection and a major web of data hub. This experience has led the author to believe in using... more
In many large digital collections, the only way a user can access a digital object is via its metadata. If the metadata is imprecise or contains inappropriate or insufficient information, users miss the object and data creators have wasted the energy... more
Mobile Crowd-Sensing (MCS) has emerged as a prospective solution for large-scale data collection, leveraging built-in sensors and social applications in mobile devices that enable a variety of Internet of Things (IoT) services. However,... more
In the last few years there have been many changes and much evolution in the classification of data. As the application areas of technology increase, the size of data also increases. Classification of data becomes difficult because of... more
"With this PhD dissertation, I intend to contribute to a better understanding of the Wiki phenomenon as a knowledge management system which aggregates private knowledge. I also wish to check to what extent information generated through... more
Without robust quality control, insights gained by exploiting linked data will be incomplete, unreliable and misleading. Much of what passes for quality control, however, is not designed to verify complete end-to-end real world processes... more
Background: The ubiquity of health wearables and the consequent production of patient-generated health data (PGHD) are rapidly escalating. However, the utilization of PGHD in routine clinical practices is still low because of data quality... more
Tackling sample bias, non-response bias, cognitive bias, and disparate impact associated with protected categories in three parts/papers: data, algorithm construction, and output impact. This paper covers the data part.
Open data are a basic infrastructure for the creation of business, products, and services. To analyze their usefulness, the differences between data access, dissemination, and reuse must be taken into account. The objective of this... more
Environmental policy involving citizen science (CS) is of growing interest. In support of this open data stream, validation or quality assessment of the CS data and their appropriate usage for evidence-based policy making needs a... more
Executive Summary/Abstract: In an effort to improve the data quality levels of applications developed by McKesson Specialty Patient Services, the Quality Assurance team has implemented a data quality approach to system testing. The QA... more
Within the context of the smart city, data are an integral part of the digital economy and are used as input for decision making, policy formation, and to inform citizens, city managers and commercial organisations. Reflecting on our... more
This chapter gives an overview of data quality management. It lists several approaches to the topic and its fundamental definitions. Data quality always depends on the context and purpose of the data, which is why... more
The buildup of huge data within business intelligence is essential because such data includes the complete conceptual and technological stack in addition to raw and processed data, data management, and analytics. Evaluating Data Quality Model... more
Incomplete data is a crucial challenge for data exploration, analytics, and visualization recommendation. Incomplete data distorts analysis and reduces the benefits of any data-driven approach, leading to poor and misleading... more
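Since this snippet centers on how incomplete data distorts analysis, a small sketch of a column-wise missingness profile that could be computed before any exploration or visualization is shown below. It is purely illustrative; the field names and the None/empty-string convention are assumptions, not the paper's algorithm.

```python
def missingness_profile(rows, columns):
    """Fraction of missing (None or empty-string) values per column."""
    n = len(rows)
    return {
        col: sum(1 for r in rows if r.get(col) in (None, "")) / n
        for col in columns
    }

rows = [{"age": 34, "city": "Oslo"}, {"age": None, "city": ""}, {"age": 51}]
print(missingness_profile(rows, ["age", "city"]))  # {'age': 0.33..., 'city': 0.66...}
```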
Open access (OA) has mostly been studied by relying on publication data from selective international databases, notably Web of Science (WoS) and Scopus. The aim of our study is to show that it is possible to achieve a national estimate of... more