
Pattern recognition assisted infrared library searching

Posted on: 2009-07-07
Degree: M.S.
Type: Thesis
University: Oklahoma State University
Candidate: Nuguru, Kadambari
Full Text: PDF
GTID: 2448390002992981
Subject: Chemistry
Abstract/Summary:
Scope and Method of Study. A genetic algorithm (GA) for pattern recognition analysis of infrared spectral data is proposed. The GA selects spectral features that optimize the separation of the different functional groups in a plot of the two or three largest principal components of the data. Because the largest principal components capture the bulk of the variance in the data, the features chosen by the GA primarily convey information about differences between classes. Hence, the principal component analysis routine embedded in the fitness function of the GA acts as an information filter that significantly reduces the size of the search space, since it restricts the search to feature sets whose principal component plots show clustering of the spectra on the basis of chemical structure. In addition, as it trains, the algorithm focuses on those classes and/or samples that are difficult to classify, using a form of boosting to modify class and sample weights. Samples that consistently classify correctly are weighted less heavily than samples that are more difficult to classify. Over time, the algorithm learns its optimal parameters in a manner similar to a neural network. The proposed GA integrates aspects of artificial intelligence and evolutionary computation to yield a "smart" one-pass procedure for feature selection and pattern recognition.
Findings and Conclusions. Using the pattern recognition GA to select spectral features, a search prefilter has been developed that allows for the specific detection of carboxylic acids from IR spectra. The prefilter is based on the response function for a simple binary classification problem: carboxylic acids versus other compounds, including carbonyl compounds. Carboxylic acids have highly characteristic spectral features, but there are also complications that confound the interpretation of their spectra.
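As a rough illustration of the fitness idea described under Scope and Method of Study, the sketch below scores one candidate feature subset by projecting the selected features onto their two largest principal components and checking whether each sample's nearest neighbor in that plot belongs to the same class. It is not the thesis's fitness function: the abstract does not state the scoring rule, so the nearest-neighbor score, the boosting rate, the function names (`top_components`, `fitness`, `boost`) and the toy data are all assumptions made for illustration. Power iteration stands in for a library PCA routine so the example needs only the standard library.

```python
import math
import random


def top_components(rows, k=2, iters=200):
    """Center the rows and return (centered rows, k leading principal axes)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(x[a] * x[b] for x in X) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    axes = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        start = math.sqrt(sum(c * c for c in v))
        v = [c / start for c in v]
        for _ in range(iters):  # power iteration toward the dominant axis
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm < 1e-12:  # no variance left to explain
                break
            v = [c / norm for c in w]
        lam = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
        axes.append(v)
        # Deflate: remove the variance already explained by this axis.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return X, axes


def fitness(spectra, labels, mask, weights):
    """Weighted share of samples whose nearest neighbor in the PC plot
    has the same class label (assumed scoring rule, see lead-in)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0, [False] * len(spectra)
    rows = [[s[j] for j in cols] for s in spectra]
    X, axes = top_components(rows, k=min(2, len(cols)))
    scores = [[sum(x[j] * ax[j] for j in range(len(cols))) for ax in axes]
              for x in X]
    hits = []
    for i, p in enumerate(scores):
        others = (j for j in range(len(scores)) if j != i)
        nearest = min(others, key=lambda j: math.dist(p, scores[j]))
        hits.append(labels[nearest] == labels[i])
    score = sum(w for w, h in zip(weights, hits) if h) / sum(weights)
    return score, hits


def boost(weights, hits, rate=0.5):
    """Raise the weight of misclassified samples, then renormalize."""
    raised = [w * (1.0 if h else 1.0 + rate) for w, h in zip(weights, hits)]
    total = sum(raised)
    return [w / total for w in raised]


# Hypothetical toy "spectra": features 0-1 track the class,
# features 2-3 carry large variance unrelated to the class.
SPECTRA = [
    [0.00, 0.10, 5.0, 1.0],
    [0.10, 0.00, -5.0, 1.2],
    [0.05, 0.05, 0.0, 0.8],
    [1.00, 0.90, 5.1, 1.1],
    [0.90, 1.00, -4.9, 0.9],
    [0.95, 0.95, 0.1, 1.0],
]
LABELS = [0, 0, 0, 1, 1, 1]
UNIFORM = [1.0 / len(SPECTRA)] * len(SPECTRA)

good_score, good_hits = fitness(SPECTRA, LABELS, [1, 1, 0, 0], UNIFORM)
bad_score, bad_hits = fitness(SPECTRA, LABELS, [0, 0, 1, 1], UNIFORM)
print("informative features:", good_score)
print("uninformative features:", bad_score)
```

In a full GA, each chromosome would be a mask like the ones above, and the weights returned by `boost` would feed the next generation's fitness evaluation so that hard-to-classify samples count for more.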
The wavelet packet transform has been used to denoise and deconvolve the spectra by decomposing each spectrum into wavelet coefficients that represent both the high- and low-frequency components of the signal. This decomposition is iterated through successive wavelet packets until the required level of signal decomposition is achieved. Using a Symmlet 6 mother wavelet at the tenth level of decomposition to deconvolve spectral features, the genetic algorithm for pattern recognition analysis was able to identify wavelet coefficients characteristic of the carboxylic acid functional group. Classifiers developed from these wavelet coefficients have been successfully validated.
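The iterated decomposition described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: it uses the Haar wavelet in place of the Symmlet 6 wavelet named in the abstract (Haar filters are short enough to write out by hand), it stops at a shallow level instead of the tenth, and the function names `haar_step` and `wavelet_packet` are hypothetical. What it does show is the defining feature of a wavelet packet transform: both the low-frequency and the high-frequency branch are split again at every level.

```python
import math


def haar_step(signal):
    """Split an even-length signal into low-pass and high-pass halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return low, high


def wavelet_packet(signal, levels):
    """Return all packets at the given level.

    Unlike the ordinary wavelet transform, which only re-splits the
    low-pass branch, every packet is split again at each level.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    packets = [list(signal)]
    for _ in range(levels):
        next_packets = []
        for packet in packets:
            low, high = haar_step(packet)
            next_packets.extend([low, high])
        packets = next_packets
    return packets


# Hypothetical 8-point "spectrum" decomposed to level 2 (4 packets of 2).
toy_spectrum = [4.0, 4.0, 2.0, 2.0, 1.0, 3.0, 5.0, 7.0]
toy_packets = wavelet_packet(toy_spectrum, levels=2)
for index, packet in enumerate(toy_packets):
    print(index, packet)
```

A level-10 decomposition, as in the abstract, would require a spectrum whose length is a multiple of 1024 under this simple scheme. The coefficients in the resulting packets are the kind of candidate features a feature-selection step could then choose among.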
Keywords/Search Tags: Pattern recognition, Wavelet coefficients, Search, Algorithm, Carboxylic, Spectral