Feature selection plays an important role in data analysis and pre-processing. It eliminates both irrelevant and redundant information, reduces the dimensionality of the training samples and the complexity of the learning algorithm, and suppresses noise disturbance. As a result, the generalization performance and classification accuracy of a model can be effectively improved. In principle, feature selection can be viewed as a combinatorial optimization problem: selecting a subset of features that optimizes a given evaluation criterion.

Firstly, the feature selection framework comprises four steps: a generation procedure that produces the next candidate subset, an evaluation function that scores the subset under examination, a stopping criterion that decides when to stop, and a validation procedure that checks whether the selected subset is valid. Search strategies and evaluation functions are summarized in the thesis on the basis of this framework.

Secondly, several search strategies are studied, including the branch-and-bound algorithm, sequential selection, the plus-l take-away-r algorithm, and the floating search algorithm. Using the inter-intra class distance as the evaluation criterion, the performance of these search algorithms is compared on the same dataset, and the results verify the theoretical analysis.

Thirdly, mutual information for feature selection is introduced in detail, and its computation based on non-parametric density estimation is described. A mutual-information-based feature selection algorithm is implemented on several artificial and real datasets, and the experimental results are analysed. The mutual information criterion is also compared with other criteria.

Finally, the thesis studies feature relevance and redundancy.
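The combination of a search strategy with an evaluation function can be illustrated with a minimal sketch of sequential forward selection driven by an inter-intra class distance criterion. The function names and the particular scatter-ratio form below are illustrative assumptions, not the thesis's exact formulation:

```python
import numpy as np

def inter_intra_ratio(X, y):
    """Evaluation criterion: ratio of between-class scatter to
    within-class scatter for the feature subset X (higher is better).
    This is one common form of the inter-intra class distance."""
    overall_mean = X.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        between += len(Xc) * np.sum((mc - overall_mean) ** 2)
        within += np.sum((Xc - mc) ** 2)
    return between / within

def sequential_forward_selection(X, y, k):
    """Greedy generation procedure: at each step, add the feature
    that most improves the criterion, until k features are chosen."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        best_f, best_score = None, -np.inf
        for f in remaining:
            score = inter_intra_ratio(X[:, selected + [f]], y)
            if score > best_score:
                best_f, best_score = f, score
        selected.append(best_f)
        remaining.remove(best_f)
    return selected
```

The same criterion could equally drive backward elimination or the floating variants; only the generation procedure changes, which is the point of separating the two components in the framework.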
Based on the correlations between features and class labels, and among the features themselves, a feature selection method founded on correlation analysis is proposed; it greatly reduces the dimensionality of the feature space and thereby the computational complexity.
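A correlation-based relevance-redundancy trade-off can be sketched as follows. This is a hypothetical illustration of the general idea (score each candidate by its correlation with the label minus its average correlation with already-selected features), not the specific method proposed in the thesis:

```python
import numpy as np

def corr_based_selection(X, y, k, redundancy_weight=1.0):
    """Illustrative relevance-redundancy criterion:
    score(f) = |corr(f, y)| - redundancy_weight * mean_s |corr(f, s)|,
    where s ranges over already-selected features."""
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    # Start from the single most label-correlated feature.
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_f, best_score = None, -np.inf
        for f in range(n_features):
            if f in selected:
                continue
            redundancy = np.mean(
                [abs(np.corrcoef(X[:, f], X[:, s])[0, 1]) for s in selected]
            )
            score = relevance[f] - redundancy_weight * redundancy
            if score > best_score:
                best_f, best_score = f, score
        selected.append(best_f)
    return selected
```

Given two near-duplicate relevant features and one independent relevant feature, the redundancy penalty steers the second pick toward the independent feature, which is exactly the behaviour that lets correlation analysis shrink the feature space without discarding complementary information.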