
New Chemometric Methods For Analysis Of Complex Systems In Metabonomics And Chinese Herb Medicine Studies

Posted on: 2011-07-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D L Yuan
GTID: 1224330335988973
Subject: Applied Chemistry
Abstract/Summary:
The development of chemometric methods and modern hyphenated apparatus gives us more powerful tools than traditional analytical methods for exploring complex chemical systems. At the same time, analytical chemistry has been promoted to a provider of information, and even of knowledge, rather than a mere supplier of data. The primary goal of this study is to develop novel chemometric methods for the problems encountered in investigating complex multi-component systems and to extract the information hidden in the resulting analytical data. The thesis consists of the following five parts.

1. A novel pattern recognition method, uncorrelated linear discriminant analysis (ULDA), was introduced and applied to concentration data obtained from a metabonomics study. ULDA seeks the discriminant vectors with the best classification ability by maximizing the separation between classes. In addition, the discriminant vectors are mutually uncorrelated, which eliminates the overlap of information among them. Results on a simulated dataset and a real plasma fatty acid dataset showed that ULDA provided a better discriminant model than PCA and PLS. Furthermore, ULDA successfully screened feature variables as potential biomarkers.

2. Based on fuzzy theory, a fuzzy system analysis (FSA) method was developed for metabonomics data analysis. This novel unsupervised method integrates the advantages of fuzzy clustering analysis, the fuzzy cluster loading model and the target projection-selectivity ratio method. It provides a complete data analysis strategy for metabonomics research: exploring the data by clustering, interpreting the clustering results and identifying biomarkers. The NMR spectra of 613 diabetes patients were investigated with the proposed method. A cluster with a high risk of death was found, and the relationship between this cluster and the peaks of the NMR spectra was revealed. Several compounds in the serum of the diabetes patients were identified as potential biomarkers.

3. A new variable selection method for discriminant analysis was proposed to address variable selection and biomarker screening in pattern recognition. The method combines Monte Carlo cross-validation (MCCV), uninformative variable elimination (UVE) and partial least squares-linear discriminant analysis (PLS-LDA). It selects the informative variables needed to build a good discriminant model and ranks their importance to the model with the help of a C value plot. Two datasets, one of simulated mixture samples and one of real Chinese herbal medicine samples, were investigated with this method and with conventional UVE-PLSR. The results showed that the proposed method selected informative variables and established a better discriminant model for the antioxidant bioactivity levels of the samples. Furthermore, compounds with potential antioxidant bioactivity were identified, and their bioactive strength was ranked with the help of the C value plot obtained during data analysis.

4. The quantitative structure-retention relationship (QSRR) model is a useful aid in the qualitative analysis of complex multi-component systems, since the retention index is an important means of identification in chromatographic analysis. A novel modeling method based on subspace orthogonal projection was developed. In this approach, molecular descriptors from the same family are grouped as a "block variable". With the help of subspace orthogonal projection, the block variables are orthogonalized to eliminate the redundancy among them. Each orthogonalized block variable yields a regression direction, and the final QSRR model is built from all the directions obtained. The results showed that the proposed modeling method may give a QSRR model with better performance than PCR and PLSR. In addition, the Mahalanobis distance was proposed to define the predictive domain of the established model, which clarifies the range over which the model can be applied.

5. Alternative moving window factor analysis (AMWFA) was applied to the problem of embedded peaks that occur in metabonomics research. AMWFA extracts selective information from two analytical systems by alternately scanning and comparing them. On the basis of the selective information obtained from the chromatograms and/or spectra of the two systems, AMWFA can resolve embedded peaks in two-dimensional GC-MS data into pure chromatograms and spectra without any assumption about peak shape. The resolution results for one simulated dataset and two real metabonomics datasets demonstrated the analytical procedure and the performance of the approach, and indicated that it is a promising way to analyze complex data from metabonomics research.
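
The predictive domain described in part 4 can be illustrated with a small sketch. The abstract only states that the Mahalanobis distance defines the domain; the 95th-percentile cutoff, the function names and the simulated two-descriptor training set below are assumptions made for illustration and are not taken from the thesis.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, mean):
    """Sample covariance matrix (divisor n - 1)."""
    n, p = len(rows), len(mean)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j]
    return [[v / (n - 1) for v in r] for r in cov]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[pivot] = m[pivot], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(m[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (m[k][n] - s) / m[k][k]
    return x


def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^-1 (x - mean)), using a linear solve."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return math.sqrt(sum(di * zi for di, zi in zip(d, solve(cov, d))))


# Hypothetical training descriptors (two correlated columns), simulated
# for illustration only; real QSRR descriptors would replace them.
rng = random.Random(0)
train = []
for _ in range(200):
    a = rng.gauss(0.0, 1.0)
    train.append([a, 0.8 * a + rng.gauss(0.0, 0.3)])

mu = mean_vector(train)
cov = covariance_matrix(train, mu)
train_d = sorted(mahalanobis(r, mu, cov) for r in train)
# Assumed cutoff: the 95th percentile of the training distances.
threshold = train_d[int(0.95 * len(train_d)) - 1]


def in_domain(x):
    """True if x lies within the assumed predictive domain."""
    return mahalanobis(x, mu, cov) <= threshold
```

With this sketch, a new compound whose descriptors break the correlation seen in the training set, for example `in_domain([5.0, -5.0])`, is flagged as outside the domain even though each descriptor taken alone might look only moderately extreme. That sensitivity to correlated descriptors is the usual reason for preferring the Mahalanobis distance over the Euclidean distance here.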
Keywords/Search Tags: Chemometrics, Complex system, Hyphenated apparatus, Metabonomics, Chinese herbal medicine