A novel approach to data mining: Genetic algorithm for feature selection

Posted on: 2008-03-24
Degree: Ph.D.
Type: Dissertation
University: Clarkson University
Candidate: Vora, Mehul N
Full Text: PDF
GTID: 1448390005451980
Subject: Mathematics
Abstract/Summary:
This dissertation presents the Genetic Algorithm (GA) as a data microscope for sorting, probing, and uncovering relationships in multivariate data. Identifying a relationship or a pattern in a multivariate dataset is a challenging problem. Some relationships cannot be expressed in quantitative terms; they are better expressed in terms of similarity and dissimilarity among groups of multivariate data. Feature selection, the process of identifying the most informative features, is a crucial step in any data mining and pattern recognition study. The selection of an appropriate feature subset can simplify the problem and lead to improved results. Feature selection, however, is itself non-trivial.

The pattern recognition GA identifies a subset of features whose variance or information is primarily about differences between the groups in a data set. The properties of a genetic search strategy for selecting the best feature subset can potentially overcome the difficulties inherent in feature selection. Application of the pattern recognition GA to a wide range of problems from the fields of chemometrics and bioinformatics demonstrates the utility of the method.
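To make the idea of a genetic search over feature subsets concrete, here is a minimal Python sketch. It is not the dissertation's pattern recognition GA, whose fitness function and operators are not described in this abstract. Everything in it is an assumption chosen for illustration: the two-group toy data (`toy_data`), the Fisher-style separation score with a per-feature penalty used as the fitness, truncation selection, one-point crossover, bit-flip mutation, and all parameter values.

```python
import random
from statistics import mean, pvariance


def toy_data(n_per_group=40, n_features=8, informative=(0, 3), rng=None):
    """Hypothetical two-group data: only the `informative` columns differ between groups."""
    rng = rng or random.Random(1)
    X, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in informative:
                row[j] += 4.0 * label  # shift group 1 on the informative features
            X.append(row)
            labels.append(label)
    return X, labels


def fisher_score(column, labels):
    """Squared difference of group means divided by the summed within-group variance."""
    a = [x for x, y in zip(column, labels) if y == 0]
    b = [x for x, y in zip(column, labels) if y == 1]
    within = pvariance(a) + pvariance(b)
    return (mean(a) - mean(b)) ** 2 / (within + 1e-12)


def make_fitness(X, labels, penalty=1.0):
    """Stand-in fitness: reward group separation, charge `penalty` per selected feature."""
    n_features = len(X[0])
    scores = [fisher_score([row[j] for row in X], labels) for j in range(n_features)]

    def fitness(mask):
        return sum(s - penalty for s, bit in zip(scores, mask) if bit)

    return fitness


def ga_select(fitness, n_features, pop_size=30, generations=40,
              mutation_rate=0.05, rng=None):
    """Evolve bit masks (1 = feature kept) and return the fittest mask found."""
    rng = rng or random.Random(0)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; survivors are kept unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation, applied independently to each position
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


X, labels = toy_data()
fitness = make_fitness(X, labels)
best = ga_select(fitness, n_features=len(X[0]))
print("selected feature indices:", [j for j, bit in enumerate(best) if bit])
```

Each candidate subset is encoded as a bit mask, so crossover and mutation operate directly on the choice of features. Swapping in a different fitness function, for example one based on a classifier's cross-validated accuracy, changes what "informative" means without touching the search loop.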
Keywords/Search Tags: Data, Feature selection, Genetic