Research on Data Quality Evaluation Model and Evaluation Tool

With the accumulation of data in the era of the Internet and big data, more and more fields are paying attention to data quality. Academic research and industrial practice have explored many aspects of the problem, including data integration, data cleaning, similar-record detection, data quality control and management, and data quality evaluation. However, a relatively systematic evaluation index system is still lacking, as is an evaluation model built from the user's perspective. Moreover, data quality assessment mostly remains at the qualitative stage, and accurate quantitative evaluation is difficult to achieve. Existing data quality evaluation systems therefore show clear deficiencies in both feasibility and effectiveness.

This paper reviews the progress of data quality research at domestic and foreign institutions and the state of data quality assessment in the academic literature, and summarizes the standards for data quality management and the data quality evaluation process. Based on the ISO/IEC 25024 data quality model and the characteristics of the data sets, we use the grey correlation degree and correlation coefficients to build an index ranking model and to analyze the correlation among indicators, which identifies the seven indicators that most influence data quality. A factor analysis of the correlations among these seven indicators then yields a data quality evaluation index system consisting of three factor sets. From the user's perspective, two evaluation models are established: a single-factor-set accurate assessment model and a comprehensive evaluation model. The single-factor-set accurate assessment model evaluates one factor set across multiple evaluation objects. The comprehensive evaluation model targets multiple series of rules and comprehensively evaluates the data quality of a single assessment object. A data quality evaluation and management tool is developed on the basis of the two models, realizing automatic quantitative evaluation of data quality.

Finally, the proposed models and the tool are applied to evaluate the data quality of the business management system of a research institute. The single-factor-set model is used to evaluate a single factor set for each evaluation object, the results are displayed by the tool's visualization module, and the single-factor data quality of the evaluation objects is analyzed and compared. The comprehensive evaluation model is then used to assess the data quality of the business system as a whole. Tests and experiments validate the feasibility and effectiveness of the evaluation models, as well as the accuracy and efficiency of the management tool in automated testing.
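
The abstract names the grey correlation degree as the basis of the index ranking model but does not give the formula. The following is a minimal sketch of how such a ranking could be computed, assuming the common Deng formulation of grey relational analysis with a distinguishing coefficient `rho` of 0.5 and input series already scaled to a comparable range. The function names, the reference series, and the indicator names in the usage lines are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List


def grey_relational_grades(
    reference: List[float],
    candidates: Dict[str, List[float]],
    rho: float = 0.5,
) -> Dict[str, float]:
    """Grey relational grade of each candidate series against the reference.

    Assumes Deng's formulation; all series must have the same length and be
    scaled to a comparable range beforehand.
    """
    if not candidates:
        return {}
    n = len(reference)
    if n == 0 or any(len(series) != n for series in candidates.values()):
        raise ValueError("all series must be non-empty and of equal length")

    # Point-by-point absolute difference between reference and each candidate.
    deltas = {
        name: [abs(r - v) for r, v in zip(reference, series)]
        for name, series in candidates.items()
    }
    all_deltas = [d for row in deltas.values() for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)

    if d_max == 0:
        # Every candidate coincides with the reference.
        return {name: 1.0 for name in candidates}

    grades = {}
    for name, row in deltas.items():
        # Grey relational coefficient at each point, then averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades[name] = sum(coeffs) / n
    return grades


def rank_indicators(grades: Dict[str, float]) -> List[str]:
    """Indicator names ordered from highest to lowest grade."""
    return sorted(grades, key=lambda name: grades[name], reverse=True)


# Hypothetical example: an overall quality score per data set as the reference,
# and two hypothetical indicator series measured on the same data sets.
overall = [1.0, 1.0, 1.0]
indicators = {
    "completeness": [1.0, 1.0, 1.0],
    "accuracy": [0.5, 0.5, 0.5],
}
grades = grey_relational_grades(overall, indicators)
ranking = rank_indicators(grades)
```

Under this sketch, an indicator whose series tracks the reference more closely receives a higher grade, so taking the top entries of `ranking` would correspond to selecting the most influential indicators.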
Keywords/Search Tags: data quality assessment, evaluation index system, evaluation models, automated assessment, data visualization
