
Iterative feature weighting for identification of relevant features in machine learning: With multilayer perceptron, radial basis function and support vector architectures

Posted on: 2006-04-03
Degree: Ph.D.
Type: Dissertation
University: Case Western Reserve University
Candidate: Duan, Baofu
Full Text: PDF
GTID: 1458390008951167
Subject: Engineering
Abstract/Summary:
In multivariate data analysis, samples may be described by many features, yet for a specific task some of those features are redundant or irrelevant, serving mainly as sources of noise and confusion. Irrelevant and redundant features not only increase the cost of data collection; they may also explain why machine learning is often hampered by an inadequate number of samples.

Feature selection addresses this issue by identifying and retaining only those features that are relevant to the task at hand. An alternative approach is feature weighting, which assigns a continuous-valued weight to every feature used to describe the data samples. Feature weighting reduces the influence of irrelevant features by assigning them small weights while giving relevant features large weights.

In this dissertation, we study the effect of irrelevant features on neural network design and propose a framework for iterative feature weighting with neural networks. The framework iteratively improves the trained networks until the optimal network model is reached; at the same time, the feature weights, which are evaluated through the trained networks, converge to their optimal values as well. We present a convergence theorem to guide the design of the framework and then implement it for three typical neural network architectures: the multilayer perceptron, the radial basis function network, and the support vector machine. These iterative feature weighting methods are applied to locally synthesized data and to benchmark datasets, with good results. Results for the MONK's problems show that the methods are very effective at identifying relevant features that stand in complex logical relationships within the data. Results for the Boston housing data show that the performance of regression models can be improved through iterative feature weighting. Results for the Leukemia gene expression data show that the methods can be used not only to improve the accuracy of pattern classification but also to identify features that may have a subtle nonlinear correlation with the task in question.
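The iterate-train-reweight loop described above can be sketched in miniature. The following is a hedged illustration, not the dissertation's actual algorithm: a linear model trained by gradient descent stands in for the neural network, and the feature weights are re-estimated from the magnitudes of the trained model's effective per-feature coefficients (a crude sensitivity measure). The data, loop count, and learning rate are all invented for the example.

```python
import random

random.seed(0)

# Synthetic data: the target depends only on feature 0; feature 1 is pure noise.
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
y = [2.0 * x0 for x0, _ in X]

def train_linear(X, y, fw, epochs=200, lr=0.05):
    """Least-squares fit by stochastic gradient descent on feature-weighted inputs."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for xs, t in zip(X, y):
            z = [fw[i] * xs[i] for i in range(2)]          # apply feature weights
            err = sum(w[i] * z[i] for i in range((2))) - t
            for i in range(2):
                w[i] -= lr * err * z[i]
    return w

fw = [1.0, 1.0]                                  # start with equal feature weights
for _ in range(5):                               # iterative feature weighting loop
    w = train_linear(X, y, fw)
    sens = [abs(w[i] * fw[i]) for i in range(2)] # effective sensitivity per feature
    total = sum(sens)
    fw = [s / total for s in sens]               # renormalized feature weights

print(fw)  # the irrelevant feature's weight shrinks toward zero
```

Each pass trains the model under the current feature weights, then redistributes the weights in proportion to how strongly each feature influences the trained model's output; the irrelevant feature's weight decays across iterations while the relevant feature's weight approaches 1.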
Keywords/Search Tags: Features, Data