New Methods Of Systematic Factors Correction For Metabolomic Data Analysis

Posted on: 2018-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: W L Wang
GTID: 2370330515952482
Subject: Physical Electronics
Abstract/Summary:
Metabolomics quantifies the multi-parameter metabolic response of a living system to exogenous stimuli or genetic modification. An organism in a complex environment is affected by many factors, which cause fluctuations in its biochemical composition. When the effects of uninteresting systematic factors are too large, they can mask or confound the effect of the factors under study and mislead the subsequent analysis of biological information. Correcting for systematic factors is therefore an important step in the preprocessing of metabolomics data. To this end, this thesis makes two contributions.

1. A new correction method is proposed, "a clustering-based preprocessing method for the elimination of uninteresting residuals in metabolomics data", abbreviated CURE. In CURE, the variation caused by uninteresting systematic factors is removed by K-means clustering of the residuals and subtraction of each cluster's mean from them. A series of simulated datasets and one real metabolomics dataset were used to evaluate how well the method corrects systematic factors. Compared with the ANOVA-based method, OSC and CPF, CURE proved more valid and robust. The results show that CURE corrects systematic factors effectively while keeping the risk of over-fitting low.

2. Biochemical data are incorporated into the correction of systematic factors. This new OSC-based method first divides the metabolomics data matrix into two parts, one parallel to the response (Y) and one orthogonal to it, and then projects the orthogonal part onto the biochemical data to retain useful biological information. Through this data fusion, more biological information is preserved, which helps reduce the risk of over-fitting. Results on a real liver disease dataset show that the method effectively corrects the interference of systematic factors, enhances the significance of the disease factor, and improves the predictive and explanatory ability of the multivariate statistical model. Permutation tests further show that the correction method is stable.

This thesis thus provides two novel methods for systematic factor correction of metabolomics data. Both effectively reduce the effects of systematic factors, improve the accuracy of subsequent statistical analysis, and highlight the biological differences in the data.
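
The clustering step that the abstract describes for CURE can be illustrated with a minimal sketch. It is not the thesis implementation. The abstract does not say how the residuals are obtained, so the sketch assumes they are what remains after subtracting the per-group mean of the factor under study. The function names (cure_sketch, kmeans_labels), the parameters (groups, k) and the farthest-first initialization are all hypothetical choices made for illustration.

```python
import numpy as np


def kmeans_labels(R, k, n_iter=100):
    """Lloyd's K-means on the rows of R; returns one cluster label per row."""
    # Deterministic farthest-first start: row 0, then the row farthest
    # from all centers chosen so far (an illustrative assumption).
    centers = [R[0]]
    for _ in range(1, k):
        d = np.min([((R - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(R[d.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dist = ((R[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old center
                new_centers[j] = R[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cure_sketch(X, groups, k):
    """Cluster the residuals and subtract each cluster's mean from them.

    X      : samples x variables data matrix
    groups : label of the factor under study for each sample
    k      : number of K-means clusters (chosen by the user here)
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)

    # Assumed step: model the factor of interest by per-group means.
    effect = np.empty_like(X)
    for g in np.unique(groups):
        effect[groups == g] = X[groups == g].mean(axis=0)
    residuals = X - effect

    # K-means clustering of the residuals.
    labels = kmeans_labels(residuals, k)

    # Remove each cluster's mean from the residuals in that cluster.
    cleaned = residuals.copy()
    for j in np.unique(labels):
        cleaned[labels == j] -= residuals[labels == j].mean(axis=0)

    # Corrected data: effect of interest plus cleaned residuals.
    return effect + cleaned, labels
```

Calling cure_sketch(X, groups, k=2) on a matrix with two study groups and one unrecorded two-level systematic factor returns the corrected matrix together with the cluster label of each sample. Note that the systematic factor itself is never passed in; the sketch relies on the clusters in the residuals to stand in for it, which is the idea the abstract attributes to CURE.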
Keywords: Metabolomics, Data Processing, Systematic Factors