Classification methods and applications to mass spectral data

Posted on: 2006-07-08
Degree: Ph.D.
Type: Thesis
University: Hong Kong Baptist University (People's Republic of China)
Candidate: He, Ping
Full Text: PDF
GTID: 2458390008955976
Subject: Mathematics
Abstract/Summary:
An important goal of data mining in chemistry is to extract useful information from databases and then to classify and recognize compounds or medicines by their molecular structures, topological indices, or chemical fingerprints. With the growth of chemical measurement and modern information technology, increasingly large databases containing information on chemical compounds have been established, such as spectral databases, chromatographic databases, and databases of molecular structures and their substance properties. Discovering the knowledge hidden in these huge collections is a major challenge. Our work concerns the methodology and application of classification methods for very large data sets. The classification methods introduced and proposed in this thesis can, in general, be applied to a variety of classification problems; here, we focus on their application to the analysis of mass spectra. Mass spectrometry, an instrumental technique used to characterize and identify chemical compounds, produces large amounts of valuable data for chemical structure elucidation. Identifying compounds, or automatically recognizing structural properties, from mass spectral (MS) data is an important task in chemometrics. In this thesis, we first introduce different classification methods based on classical multivariate data analysis, artificial intelligence, or modern data mining techniques. These methods have been applied with some success to the automatic recognition of substructures or other structural properties from MS data. However, there are still many substructures that cannot be recognized efficiently by existing classifiers, so the search for better techniques for mass spectral pattern recognition remains an ongoing goal in chemometrics. In this thesis, I propose a new approach that combines the classification tree (CT) with sliced inverse regression (SIR) and apply it to the classification of mass spectra.
Keywords/Search Tags: Classification, Mass spectra, Data