
Feature selection and discriminant analysis in data mining

Posted on: 2005-06-10
Degree: Ph.D.
Type: Dissertation
University: University of Florida
Candidate: Youn, Eun Seog
Full Text: PDF
GTID: 1458390011953006
Subject: Computer Science
Abstract/Summary:
The problem of feature selection is to find a subset of features that is optimal for classification. A critical part of feature selection is ranking features by their importance for classification, and this dissertation deals with feature ranking. The first method is based on support vectors (SVs), the sample vectors that lie near the boundary between two classes. We show how feature ranking can be done using only the support vectors. While SV-based feature ranking can be applied to any discriminant analysis, here we consider two linear discriminants, the support vector machine (SVM) and Fisher's linear discriminant (FLD). Features are ranked either by the weight associated with each feature or by recursive feature elimination (RFE). Both schemes are shown to be effective in a variety of domains. The second method of feature ranking is based on naive Bayes. The naive Bayes classifier has been used extensively in text categorization, but despite its success on that problem it has not previously been used for feature selection. We develop a new feature scaling method, called class-dependent term weighting (CDTW), and a new feature selection method, CDTW-NB-RFE, which combines CDTW with RFE. In our experiments, CDTW-NB-RFE outperformed all five popular feature ranking schemes tested on text datasets. The method has also been extended to continuous data domains.
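The weight-based and RFE ranking schemes described above can be illustrated concretely. The following is a minimal sketch, assuming a scikit-learn-style toolkit; the toy dataset and parameter values are placeholders, not anything from the dissertation. In a linear discriminant f(x) = w·x + b, the magnitude |w_j| reflects how much feature j moves the decision, so features can be ranked by |w_j| directly, while RFE instead retrains the classifier repeatedly and discards the lowest-weight feature each round.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC
    from sklearn.feature_selection import RFE

    # Toy two-class data standing in for a real domain.
    X, y = make_classification(n_samples=200, n_features=30,
                               n_informative=5, random_state=0)

    # Direct weight-based ranking: sort features by |w_j| from a
    # trained linear SVM.
    svm = LinearSVC(C=1.0, max_iter=5000).fit(X, y)
    weight_rank = np.argsort(-np.abs(svm.coef_.ravel()))

    # RFE: retrain and drop the lowest-weight feature each round;
    # the surviving features are the highest ranked.
    rfe = RFE(LinearSVC(C=1.0, max_iter=5000),
              n_features_to_select=5).fit(X, y)
    print("top features by |w|:", weight_rank[:5])
    print("features kept by RFE:", np.flatnonzero(rfe.support_))

The naive-Bayes variant can be sketched the same way. The abstract does not define CDTW, so the elimination criterion below is a generic class-conditional log-odds score used purely as a stand-in for it; nb_rfe_ranking is a hypothetical helper, not the dissertation's CDTW-NB-RFE.

    import numpy as np
    from sklearn.naive_bayes import MultinomialNB

    def nb_rfe_ranking(X, y, step=1):
        """Return feature indices in order of elimination (least to
        most important) using RFE with a naive Bayes classifier."""
        remaining = list(range(X.shape[1]))
        eliminated = []
        while remaining:
            nb = MultinomialNB().fit(X[:, remaining], y)
            # Score each surviving feature by |log P(w|c1) - log P(w|c0)|:
            # near-zero scores mean the feature barely separates classes.
            scores = np.abs(nb.feature_log_prob_[1] - nb.feature_log_prob_[0])
            worst = np.argsort(scores)[:step]
            # Pop in reverse index order so positions stay valid.
            for i in sorted(worst, reverse=True):
                eliminated.append(remaining.pop(i))
        return eliminated

    # Usage on toy term-count data (naive Bayes expects nonnegative counts):
    rng = np.random.default_rng(0)
    Xc = rng.poisson(1.0, size=(100, 20))
    yc = rng.integers(0, 2, size=100)
    order = nb_rfe_ranking(Xc, yc)
    print("most important feature:", order[-1])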
Keywords/Search Tags: Feature, Method