The Research On Discriminant Analysis Of Interval Symbolic Data

Posted on: 2011-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: X Wei
Full Text: PDF
GTID: 2120330338481488
Subject: Management Science and Engineering
Abstract/Summary:
Traditional discriminant analysis is designed for point data. When processing massive data sets, it is hard for such methods to capture the attributes of the data as a whole. By grouping observations into "data packages", symbolic data analysis can retain the internal relationships within massive data. After summarizing the existing theory of interval data, this thesis studies three discriminant analysis methods for interval symbolic data, taking interval symbolic data with a general distribution as the object of study. First, the problem of standardizing interval symbolic data is addressed. Based on the Hausdorff distance, the distance between an interval observation and the center point of each group is defined. The distance discriminant analysis for interval symbolic data is then put forward, and the specific steps of the method are presented. Second, drawing on the existing literature, the thesis reviews the method of linear combination of intervals. The variation of the interval data is then decomposed into two parts, representing the differences between groups and the differences within groups. On this basis, the thesis extends the traditional linear discriminant method and puts forward a Fisher criterion for symbolic data with a general distribution. Third, on the basis of Chinese and international literature, the thesis discusses kernel density estimation for symbolic data. The nonparametric discriminant method is then extended, and a maximum likelihood method and a Bayes criterion method for symbolic data are put forward. Last, in an empirical analysis of rain forecasting, 29 representative cities in northeast China are selected as symbolic objects, and the temperature, cloud cover, and wind speed of May 4th, 2010 are taken as indicator variables to obtain interval symbolic data. Distance discriminant analysis and the Fisher criterion are then used to forecast whether it will rain the next day.
Comparing the three discriminant methods, we found the following. The distance discriminant method can be used whatever the distribution of the interval symbolic data is, but some information in the data is lost when the method is used. The result of Fisher discriminant analysis is easy to interpret, but the method requires the distribution of the interval symbolic data to be known. When that distribution is unknown, the nonparametric discriminant method based on kernel estimation can be applied, but a large sample is required to obtain an accurate density estimate. In this thesis, traditional discriminant analysis is extended so that it can be applied to interval data. The empirical study also shows that these methods are feasible and practical.
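The distance discriminant idea summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's own procedure: the abstract does not give formulas, so the following are assumptions made for illustration only. The center of a group is taken as the interval formed by the mean lower bounds and mean upper bounds of its members, the distances of the individual variables are simply summed, and no standardization is applied. The function names and the toy intervals are hypothetical. The only fixed fact used is that the Hausdorff distance between two closed intervals [a1, b1] and [a2, b2] equals max(|a1 - a2|, |b1 - b2|).

```python
import numpy as np

def hausdorff_interval(x, y):
    """Per-variable Hausdorff distance between two interval observations.

    x, y: array-like of shape (p, 2), each row being [lower, upper].
    For closed intervals the distance is max(|a1 - a2|, |b1 - b2|).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.maximum(np.abs(x[:, 0] - y[:, 0]), np.abs(x[:, 1] - y[:, 1]))

def group_centers(X, labels):
    """Assumed group center: mean lower and mean upper bound per variable.

    X: array-like of shape (n, p, 2); labels: sequence of n group labels.
    """
    X = np.asarray(X, dtype=float)
    labels = list(labels)
    centers = {}
    for g in sorted(set(labels)):
        mask = np.array([lab == g for lab in labels])
        centers[g] = X[mask].mean(axis=0)
    return centers

def classify(x, centers):
    """Assign x to the group whose center is nearest.

    Assumption: the overall distance is the sum of per-variable distances.
    """
    totals = {g: hausdorff_interval(x, c).sum() for g, c in centers.items()}
    return min(totals, key=totals.get)

# Toy, made-up intervals for two variables (not data from the thesis).
X_train = [
    [[10, 15], [70, 90]],
    [[11, 16], [75, 95]],
    [[18, 25], [10, 30]],
    [[20, 27], [15, 35]],
]
y_train = ["rain", "rain", "dry", "dry"]

centers = group_centers(X_train, y_train)
prediction = classify([[12, 17], [60, 85]], centers)
print(prediction)
```

Summing the per-variable distances keeps the sketch short. A weighted or Euclidean combination would be an equally plausible choice, and variables on different scales would need to be standardized first.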
Keywords/Search Tags:Interval symbolic data, distance discriminant analysis, Fisher discriminant analysis, kernel discriminant analysis