The Study Of Data Quality Assessment Methods

Posted on: 2016-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ling
Full Text: PDF
GTID: 2180330461986604
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Statistical data, regarded as the lifeline of statistical work, are the basis of statistical study. Their quality directly influences the scientific soundness of statistical analysis and the correctness of statistical decisions. It is therefore necessary to study the quality assessment of statistical data systematically and in depth, both to meet the demand for statistical data created by social development and to enhance the authenticity and comparability of those data.

Building on the theory of existing data quality assessment methods, this thesis studies two aspects.

On the one hand, it analyzes and improves assessment methods from the angle of outlier diagnosis. First, based on the theory of the Grubbs test and the t test, we use Pitman asymptotic relative efficiency to compare the advantages and disadvantages of these two outlier tests, and conclude that the Grubbs test is better than the t test. Second, we analyze the limitations of cross-validation theory and present a new data quality assessment method based on an improved cross-validation; domestic water consumption data from Chengdu are used for an empirical analysis. Finally, extending classical BP neural network theory, we present an assessment method in which the degraded ceiling algorithm is used to improve the BP neural network, and we demonstrate it on 2013 stock data for the Shanghai index.

On the other hand, we study quality assessment methods for statistical data from the standpoint of the reliability of assessment results. The property of the research objects, as reflected comprehensively by the statistical data, is taken as the object of assessment. The thesis presents an assessment method based on principal component analysis-rank sum ratio. Statistical data for 21 cities of Sichuan province in 2010 are used to evaluate the new urbanization level of these cities objectively and rationally. Then, according to the reliability of the evaluation results, we assess the reliability of the data quality.
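
The outlier-diagnosis step named in the abstract can be illustrated with the Grubbs statistic, G = max|x_i - mean| / s. The following is a minimal sketch, not the thesis's implementation: the function names `grubbs_statistic` and `is_outlier` are hypothetical, and the critical value is assumed to be supplied by the caller from a published Grubbs table for the chosen sample size and significance level.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index): the Grubbs statistic and the index of the most extreme value."""
    if len(values) < 3:
        raise ValueError("Grubbs' test needs at least three observations")
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("all values are identical; the statistic is undefined")
    # The suspect point is the one farthest from the sample mean.
    index = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / s, index


def is_outlier(values, critical_value):
    """Flag the most extreme value if G exceeds a caller-supplied critical value (assumption)."""
    g, index = grubbs_statistic(values)
    return g > critical_value, index, g


if __name__ == "__main__":
    sample = [10.0, 10.0, 10.0, 10.0, 20.0]  # made-up illustrative data
    print(grubbs_statistic(sample))
```

The critical value is left as a parameter so the sketch stays within the standard library; computing it directly would require a t-distribution quantile.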
Keywords/Search Tags: Data Quality Assessment, Pitman Asymptotic Relative Efficiency, Cross-Validation, Back Propagation Neural Network, Degraded Ceiling Algorithm, Principal Component Analysis-Rank Sum Ratio