
Genetic algorithms for data mining and multivariate data analysis

Posted on: 2004-12-08
Degree: Ph.D.
Type: Dissertation
University: Clarkson University
Candidate: Davidson, Charles Earl
Full Text: PDF
GTID: 1468390011976574
Subject: Chemistry
Abstract/Summary:
Pattern recognition is a challenging problem in the analysis of multivariate data. Feature selection, the process of identifying the most informative variables, is a crucial step in any pattern recognition study. Selecting an appropriate feature subset can simplify the classification and lead to improved results. Feature selection, however, is itself non-trivial.

This dissertation presents the adaptation of a genetic algorithm (GA) to feature selection. Genetic algorithms are a class of search methods that use evolutionary principles to optimize an observable quantity. The attributes of a genetic search strategy can potentially overcome the difficulties inherent in selecting the best feature subset.

Application of the feature selection GA to a wide range of chemical problems (involving chromatography, spectroscopy, genomics, proteomics and structure-activity relationships) demonstrates the utility of the method. Analyses with the GA consistently outperform traditional methods, emphasizing the importance of multivariate feature selection in pattern recognition problems.
Keywords/Search Tags: Feature selection, Multivariate, Data, Genetic, Recognition
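
As an illustration of the general idea in the abstract, a genetic search over feature subsets can be sketched as below. This is a minimal sketch under stated assumptions, not the algorithm developed in the dissertation: the synthetic data, the fitness function (nearest-centroid training accuracy minus a small penalty per selected feature), the operators (tournament selection, one-point crossover, bit-flip mutation, elitism) and all parameter values are hypothetical choices made only for this example.

```python
import random


def make_data(rng, n_per_class=20, n_features=8, informative=(0, 1)):
    """Synthetic two-class data (assumed for illustration); only the
    `informative` columns differ between the classes."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in informative:
                row[j] += 3.0 * label  # shift class 1 along informative features
            X.append(row)
            y.append(label)
    return X, y


def fitness(mask, X, y, penalty=0.01):
    """Hypothetical fitness: nearest-centroid accuracy on the training data,
    using only the selected columns, minus a penalty per selected feature."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        dist = {
            lab: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
            for lab, cen in centroids.items()
        }
        if min(dist, key=dist.get) == t:
            correct += 1
    return correct / len(y) - penalty * len(cols)


def run_ga(X, y, rng, pop_size=20, generations=30, p_mut=0.05):
    """Evolve 0/1 masks over the features; returns (best_mask, best_fitness_history)."""
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        history.append(fitness(scored[0], X, y))
        next_pop = [scored[0][:]]  # elitism: the best mask survives unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: `scored` is sorted best-first, so the
            # smallest of three random indices is the tournament winner.
            a = scored[min(rng.sample(range(pop_size), 3))]
            b = scored[min(rng.sample(range(pop_size), 3))]
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per position.
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=lambda m: fitness(m, X, y))
    history.append(fitness(best, X, y))
    return best, history


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(rng)
    best, history = run_ga(X, y, rng)
    print("selected features:", [j for j, bit in enumerate(best) if bit])
    print("best fitness:", history[-1])
```

Because each mask is scored on the classification result obtained from the whole subset rather than on each variable separately, the search evaluates features jointly, which is the sense in which the selection is multivariate. A real study would replace the training-set accuracy used here with a validated measure of classification quality.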