
Data Quality Assessment And Improvement In Data Acquisition

Posted on: 2018-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: S H Zeng
GTID: 2428330623950815
Subject: Management Science and Engineering
Abstract/Summary:
Data collection inevitably introduces quality problems such as erroneous and missing data. Data exists because of its value: some data serves as evidence, and some is used for analysis and prediction. Whatever its purpose, data that is wrong or otherwise defective loses value and can sometimes cause serious harm. However, the severity of the data quality problems in a data set is often unknown. This thesis reviews the literature on data quality, summarizes the research results in this area, focuses on data quality evaluation methods, proposes corresponding methods for improving data quality, and finally establishes a complete data quality evaluation and control model. The main work includes the following aspects:

(1) A quantitative data quality assessment model is proposed. Based on the observed data problems, the model selects integrity (completeness), consistency, accuracy, timeliness, and normality (standardization) as the dimensions of data quality and defines a specific assessment algorithm for each. To obtain an overall evaluation of an object that is influenced by many factors, the thesis introduces the fuzzy comprehensive evaluation method into data quality evaluation. The method comprises an evaluation target, evaluation object, influencing factors, evaluation method, evaluation levels, index weights, and evaluation result. Fuzzy comprehensive evaluation converts qualitative evaluation into quantitative evaluation, yields a clear result, and can handle problems that are fuzzy and hard to quantify.

(2) To address the quality problems that arise in data collection, the thesis proposes a data quality improvement algorithm based on dependencies and rules, which improves the consistency, accuracy, completeness, timeliness, and standardization of the data. In addition, a data quality improvement method based on entity recognition is proposed. It separates out the decisive fields according to the importance of each field in a record and improves the attribute similarity algorithm used in the improvement method. On this basis, a complete data quality evaluation and control system has been established.
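
The abstract does not give the thesis's actual formulas, so the following is only a minimal sketch of how a fuzzy comprehensive evaluation over the five quality dimensions is commonly computed. It assumes the weighted-average composition B = W · R and the maximum-membership principle, which may differ from the operator the author used. The level names, weights, and membership degrees are hypothetical illustration values, not results from the thesis.

```python
# Minimal sketch of fuzzy comprehensive evaluation (weighted-average operator).
# All names and numbers below are hypothetical; none come from the thesis.

DIMENSIONS = ["completeness", "consistency", "accuracy", "timeliness", "standardization"]
LEVELS = ["excellent", "good", "fair", "poor"]


def fuzzy_evaluate(weights, membership):
    """Return the result vector B = W . R.

    weights    -- one weight per dimension, summing to 1
    membership -- one row per dimension; each row holds the degree to which
                  that dimension belongs to each evaluation level
    """
    if len(weights) != len(membership):
        raise ValueError("need exactly one membership row per weight")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_levels = len(membership[0])
    return [
        sum(w * row[j] for w, row in zip(weights, membership))
        for j in range(n_levels)
    ]


def best_level(result, levels):
    """Pick the level with the largest membership (maximum-membership principle)."""
    index = max(range(len(result)), key=result.__getitem__)
    return levels[index]


# Hypothetical index weights, in the order of DIMENSIONS.
weights = [0.3, 0.2, 0.3, 0.1, 0.1]

# Hypothetical membership matrix R: rows follow DIMENSIONS, columns follow LEVELS.
membership = [
    [0.6, 0.3, 0.1, 0.0],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.4, 0.4, 0.1],
    [0.5, 0.3, 0.2, 0.0],
    [0.3, 0.4, 0.2, 0.1],
]

result = fuzzy_evaluate(weights, membership)
overall = best_level(result, LEVELS)
print([round(b, 2) for b in result], overall)
```

With these illustrative inputs the result vector is approximately [0.33, 0.38, 0.23, 0.06], so the data set would be rated "good". In practice the membership rows would come from the per-dimension assessment algorithms and the weights from expert judgment or a weighting method chosen by the evaluator.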
Keywords/Search Tags: data quality, data quality assessment, data acquisition, data-quality improvement, Total Data Quality Management (TDQM)