Data Mining And Graphics Mode On Influential Point

Posted on: 2003-11-04  Degree: Master  Type: Thesis
Country: China  Candidate: S Zhang  Full Text: PDF
GTID: 2120360092965821  Subject: Applied Mathematics
Abstract/Summary:
With the wide application of data mining in modern business, research on mining outliers and influential points has attracted close attention in economic and statistical circles. Although data mining and statistical diagnostics each have a history of only about fifty years, many results have been achieved. However, many problems remain unsolved.

Based on an analysis of domestic and international research on influential points and exploratory data analysis, this thesis presents two new approaches to the data mining of influential points: relationship-based warp-departure analysis and contribution-score dimension reduction analysis.

The main work and conclusions of this thesis are listed below:

· Relationship-based warp-departure analysis: First, the warp coefficient and the departure coefficient are computed according to the method of relationship analysis. Then the warp-departure degree, the product of the two coefficients, is used to decide which points are influential. The method is applied to several typical examples, and the analytical and numerical results show that: (1) its conclusions about influential points agree with those of the classical diagnostic methods; (2) the approach is applicable to small samples, namely any sample size larger than 3; (3) the method has a low computational cost, with computational complexity O().

· Contribution-score dimension reduction analysis: The contribution score, obtained from principal component analysis, is used to reduce the dimension of the data. The influential distance is then used to identify influential points by removing sample data. The computational results from several typical examples show that: (1) when the influential distance and the Cook distance are compared before and after dimension reduction, the points with the largest distance are unchanged, which indicates that the dimension reduction method is acceptable; (2) the influential distance method and the classical Cook distance method reach the same conclusions about influential points, which indicates that the influential distance method is acceptable; (3) a graphics mode for influential points becomes available through dimension reduction.

· A data mining application system is developed to diagnose influential points.
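The abstract uses the classical Cook distance as the baseline against which both new methods are checked. As a point of reference only, the following is a minimal sketch of that baseline for simple linear regression (one predictor plus an intercept). It does not reproduce the warp-departure degree or the influential distance, which the abstract does not define. The function name cooks_distance and the sample data are hypothetical and chosen purely for illustration.

```python
def cooks_distance(x, y):
    """Cook's distance of each observation for the fit y = a + b * x.

    Illustrative sketch of the classical diagnostic; not the thesis's method.
    """
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    if n != len(y) or n <= p:
        raise ValueError("need equal-length inputs with more than 2 points")

    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    # Ordinary least squares estimates.
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - p)
    if mse == 0:
        raise ValueError("perfect fit: Cook's distance is undefined")

    # Leverage of each point (diagonal of the hat matrix) in closed form.
    # A leverage of exactly 1 would make the formula below divide by zero;
    # this sketch assumes that does not occur in the input data.
    leverages = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    # D_i = e_i^2 * h_i / (p * mse * (1 - h_i)^2)
    return [
        e * e * h / (p * mse * (1 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical data: a near-linear trend with one deviating last point.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 30.0]
distances = cooks_distance(x, y)
most_influential = distances.index(max(distances))
print("index of most influential point:", most_influential)
```

Ranking points by this distance and examining the largest values first mirrors the comparison described in the abstract, where the points with the largest distance are checked for agreement across methods.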
Keywords/Search Tags: influential point, data mining, diagnosis, graphics mode, warp-departure degree, dimension reduction, influential distance, Cook-distance