Analysis of Data Quality Issues in Real-world Industrial Data

Thomas Hubauer, Steffen Lamparter, Mikhail Roshchin, Nina Solomakhina, and Stuart Watson
Submission Type: 
Full Paper
phmc_13_067.pdf (686.56 KB), September 24, 2013 - 4:16am

In large industries, the use of advanced technological methods and modern equipment brings the problem of storing, interpreting, and analyzing huge amounts of information. Handling information becomes both more complicated and more important. Data quality is therefore one of the major challenges, given the rapid growth of information, the fragmentation of information systems, incorrect data formatting, and other issues. The aim of this paper is to describe industrial data processing and analytics using a real-world use case. The most crucial data quality issues are described, examined, and classified in terms of Data Quality Dimensions. Factual industrial information supports and illustrates each encountered data deficiency. We also describe methods for eliminating data quality issues and the data analysis techniques that are applied after the data cleaning procedure. In addition, an approach to addressing data quality problems in large-scale industrial datasets is proposed. This approach comprises several well-known techniques from both mathematical logic and statistics, improving the data quality procedure and the cleaning results.

Submission Keywords: 
Data quality in industry; Data Quality Dimensions
Submission Topic Areas: 
Data-driven methods for fault detection, diagnosis, and prognosis