
The Research Of Dimensionality Reduction In Pattern Recognition Classification

Posted on: 2010-07-20
Degree: Master
Type: Thesis
Country: China
Candidate: K Ren
Full Text: PDF
GTID: 2178360278974914
Subject: Computer application technology

Abstract/Summary:
Classification, based on the features of patterns, is the principal task of pattern recognition. Pattern recognition must often process large amounts of high-dimensional data. On the one hand, higher-dimensional data can carry more information. On the other hand, high-dimensional data is not only hard to process but also lowers classification accuracy because of correlation and redundancy among the features, a problem known as the curse of dimensionality. Dimensionality reduction is therefore a principal task of pattern recognition.

Feature extraction and feature selection are the principal methods of dimensionality reduction. Feature ranking is a branch of feature selection: its computational complexity is low, it is easy to apply, it ranks all features in a single pass, and it is widely used. This paper proposes a feature ranking method to address the problem of dimensionality reduction. The paper first reviews related work on dimensionality reduction, introduces probability density estimation, and elaborates the principles of non-parametric estimation and Parzen window probability density estimation. A Gaussian-kernel Parzen window probability density estimator is then introduced and applied to both unsupervised and supervised data. For unsupervised data, we first compute the original dataset's probability density estimation score. We then weight one of the dataset's features, which changes the score; the difference between the two scores is the method's result for that feature. Based on these differences, the features can be ranked, and dimensionality reduction is achieved by choosing a suitable number of top-ranked features.
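The unsupervised procedure above can be sketched in Python. The abstract does not specify the density score, the kernel bandwidth, or the weight applied to each feature, so the mean leave-one-out Gaussian-kernel log-density, the bandwidth `h`, and the down-weighting factor `w` below are all assumptions for illustration, not the thesis's exact formulation.

```python
import numpy as np

def parzen_log_density(X, h=1.0):
    """Mean leave-one-out log-density of the sample under a
    Gaussian-kernel Parzen window estimator (assumed score)."""
    n, d = X.shape
    # pairwise squared Euclidean distances, shape (n, n)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * h * h)) / ((2 * np.pi * h * h) ** (d / 2))
    np.fill_diagonal(K, 0.0)          # leave each point out of its own estimate
    p = K.sum(1) / (n - 1)
    return np.log(p + 1e-300).mean()  # small floor avoids log(0)

def unsupervised_rank(X, h=1.0, w=0.5):
    """Rank features by how much down-weighting each one
    changes the Parzen density score of the dataset."""
    base = parzen_log_density(X, h)
    change = []
    for j in range(X.shape[1]):
        Xw = X.copy()
        Xw[:, j] *= w                 # weight (shrink) feature j
        change.append(abs(parzen_log_density(Xw, h) - base))
    return np.argsort(change)[::-1]   # largest score change ranked first
```

Given the ranking, dimensionality reduction amounts to keeping the top `k` indices returned by `unsupervised_rank` and discarding the rest.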
For supervised data, one of the dataset's features is weighted first, and the probability density interval between classes is then calculated. The most important feature yields the largest distance between classes, so we use the inter-class probability density interval to rank the features. This paper elaborates the derivation and the algorithmic steps of the method. The proposed algorithm is implemented in MATLAB and evaluated on several UCI machine learning datasets. The experimental results demonstrate that the method is effective and outperforms comparable approaches.
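A minimal sketch of the supervised variant follows, for the two-class case. The abstract does not define the "probability density interval", so the mean absolute gap between the two class-conditional Parzen densities, evaluated at the data points themselves, is an assumed stand-in; the bandwidth `h` and weight `w` are likewise illustrative choices, not values from the thesis.

```python
import numpy as np

def parzen_density_at(X, grid, h=1.0):
    """Gaussian-kernel Parzen estimate of the density of sample X,
    evaluated at each row of `grid`."""
    n, d = X.shape
    sq = ((grid[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * h * h)) / ((2 * np.pi * h * h) ** (d / 2))
    return K.mean(1)

def interclass_distance(X, y, h=1.0):
    """Assumed 'probability density interval': mean absolute gap
    between the two class-conditional densities over all points."""
    a, b = np.unique(y)
    pa = parzen_density_at(X[y == a], X, h)
    pb = parzen_density_at(X[y == b], X, h)
    return np.abs(pa - pb).mean()

def supervised_rank(X, y, h=1.0, w=0.5):
    """Rank features by how much weighting each one changes
    the inter-class density distance."""
    base = interclass_distance(X, y, h)
    change = []
    for j in range(X.shape[1]):
        Xw = X.copy()
        Xw[:, j] *= w                  # weight (shrink) feature j
        change.append(abs(interclass_distance(Xw, y, h) - base))
    return np.argsort(change)[::-1]    # largest change ranked first
```

As in the unsupervised case, the ranking induces a feature subset: keep the top-ranked indices and drop the rest before training a classifier.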
Keywords/Search Tags: dimensionality reduction, feature selection, feature ranking, probability density interval, Parzen window probability density estimation