
Random Forests Algorithm Study And Its Application In The Metabolic Fingerprints

Posted on: 2014-07-15  Degree: Master  Type: Thesis
Country: China  Candidate: Q H Wu  Full Text: PDF
GTID: 2254330425973703  Subject: Analytical Chemistry
Abstract: Metabolomics is a newly developed branch of systems biology. It studies the metabolite profiles of the groups under investigation and the dynamic changes of metabolites in tissues, cell systems, or whole living organisms. With the development of high-throughput analytical technologies in metabolomics, large amounts of highly complex data have been generated, and sophisticated computational approaches are required to extract and interpret the information hidden in complex 'omics' data. This thesis focuses on random forests and their application to metabolic fingerprints. Its main contents are as follows:

(1) Based on qualitative and quantitative information on small-molecule endogenous metabolites in urine samples from C57BL/6J mice and diabetic KK-ay mice over different periods of treatment with repaglinide/rosiglitazone, principal component analysis (PCA) and random forests (RF) were used to construct metabolic trajectory diagrams of the diabetic mice during treatment. The results show that, compared with PCA, RF gives better clustering information and a clear, visual metabolic trajectory over the course of treatment. By analyzing five major metabolites screened by RF, the efficacy and mechanism of the hypoglycemic agent repaglinide in treating diabetes were explored. These results demonstrate that RF is a versatile classification algorithm that is suitable for analyzing complex metabolomics data and can complement, or serve as an alternative in, pathogenesis and pharmacodynamics research.

(2) Based on qualitative and quantitative information on small-molecule endogenous metabolites in urine samples from C57BL/6J mice (male and female) and AMPK gene knockout mice (male and female), random forests were used to obtain clear clustering information. At the same time, some informative metabolites were discovered through variable importance ranking in the RF program. The experiment was designed in two steps: first, normal male and female mice were compared with male and female C57-AMPK gene knockout mice, respectively; then the differences between male and female C57-AMPK gene knockout mice were examined. In the end, both the differences between normal C57 mice and C57-AMPK gene knockout mice and the gender-related metabolite differences among the C57-AMPK gene knockout mice were clearly visualized. The results demonstrate that combining GC/MS profiling with random forests is a useful approach for analyzing metabolites and for screening potential biomarkers to explore the relationship between AMPK and diabetes mellitus.

Together, these studies show that analyzing metabolic fingerprints with the random forests algorithm yields good clustering information and helps identify potential biomarkers, and that it can provide a sound basis for further comprehensive analysis of drug efficacy and of the impact of genes on disease.
Keywords/Search Tags: Random Forests (RF), Metabolic trajectory, metabolic profiles, multi-dimensional scaling (MDS)
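
The abstract mentions two RF outputs, a variable importance ranking and clustering information, and the keywords list multi-dimensional scaling. The following is a minimal Python sketch of how such outputs are commonly obtained. It is not the thesis's own code. It assumes that the clustering view comes from classical MDS applied to RF proximities, which the abstract does not state. The data are synthetic, and the sample counts, feature counts, group shift, and forest settings are all hypothetical choices made for illustration. The sketch uses scikit-learn's `RandomForestClassifier` and NumPy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a metabolic fingerprint table (NOT the thesis data):
# 40 samples x 30 "metabolite" intensities, two groups of 20 samples each.
rng = np.random.default_rng(0)
n_per_group, n_features = 20, 30
X = rng.normal(size=(2 * n_per_group, n_features))
y = np.repeat([0, 1], n_per_group)
X[y == 1, :3] += 3.0  # assumption: only features 0-2 differ between groups

# Fit the forest; these hyperparameters are illustrative, not from the thesis.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Variable importance ranking (mean decrease in impurity), highest first.
ranking = np.argsort(rf.feature_importances_)[::-1]

# Proximity: fraction of trees in which two samples land in the same leaf.
leaves = rf.apply(X)  # shape (n_samples, n_trees)
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# Classical MDS on the dissimilarity 1 - proximity:
# double-center the squared dissimilarities, then take the top eigenvectors.
d2 = (1.0 - prox) ** 2
n = d2.shape[0]
centering = np.eye(n) - np.ones((n, n)) / n
b = -0.5 * centering @ d2 @ centering
eigvals, eigvecs = np.linalg.eigh(b)  # eigenvalues in ascending order
top = np.argsort(eigvals)[::-1][:2]
coords = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

print("top-ranked feature indices:", ranking[:5])
print("2-D coordinates of first sample:", coords[0])
```

Plotting `coords` colored by group, or by treatment time point, would give a scatter of the kind the abstract calls a metabolic trajectory. The top entries of `ranking` play the role of the screened candidate metabolites.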