Analysis And Prediction Of Complex Traits Related MicroRNA And Genetic Loci

Posted on: 2016-04-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Li
GTID: 1224330467998477
Subject: Bio-IT

Abstract/Summary:
Most human diseases and agronomic traits are complex traits, shaped by a large number of genetic factors as well as environmental factors. Dissecting the genetic basis of complex traits is important for understanding disease pathology, supports disease diagnosis, prevention and drug design, and opens opportunities to improve livestock and crop breeding. It is therefore of great scientific and practical value.

Many complex traits have been mapped on the genome, leading to significant findings. However, the loci detected by current mapping technologies show low reproducibility across multiple environments, and many variants with small effects are neglected. This makes it difficult to understand the molecular basis of complex traits comprehensively and hinders accurate prediction of phenotypic values. Moreover, non-coding regulatory elements, including miRNAs, are reported to be closely related to complex traits. This study therefore focused on the prediction of complex disease-related miRNAs and the genome-wide analysis of agricultural traits of Brassica napus, including:

(1) We established associations between miRNAs and cancers and, on that basis, predicted potential cancer-related miRNAs. First, we extracted miRNA-cancer associations from MEDLINE abstracts and built a database named miCancerna. Next, we constructed a miRNA-cancer bipartite network and applied the random walk with restart algorithm to detect miRNAs related to twenty common cancers. Cross-validation showed that this prediction model achieved an AUC of 0.798, higher than existing methods.
Furthermore, potential cancer-related miRNAs were predicted, and 71% of the top 5 candidates for these 20 cancers were supported by experimental evidence.

(2) We proposed a new miRNA functional similarity measurement that infers functional similarity from text similarity, and used it to prioritize miRNA-disease associations. We first established the similarities among a large group of miRNAs (1,007 miRNAs in total) based on MEDLINE abstracts. Reliability assessment indicated that this set of similarity scores accords with biological hypotheses: it explains miRNA expression similarities and is able to distinguish disease miRNA pairs from random ones. Based on the miRNA functional similarity network, we used the random walk with restart method to predict disease-related miRNAs. Cross-validation demonstrated that this method is capable of detecting genuine disease-miRNA associations. We also applied the method to uncover potential miRNAs underlying colonic, endometrial and lung neoplasms and cardiomyopathies, and the majority of the predicted candidates were supported by the literature. Compared with other miRNA functional similarity measurements, our text-based measurement has a broader application scope, is more reliable, and has greater potential to uncover new disease-related miRNAs.

(3) Beyond human disease-related miRNA prediction, we also studied complex economic traits of plants. We made a comprehensive assessment of genomic selection models for the flowering time (FT) trait of B. napus, and carried out genome-wide mapping and functional analysis. We first built a high-density, high-throughput genome-wide SNP set. We then performed genomic prediction of FT in B. napus using SNPs across the genome under ten environments in three geographic regions with existing genomic prediction models (including linear, semi-parametric and machine learning models).
The results showed that all the models achieved comparably high accuracies, verifying the feasibility of genomic prediction in B. napus. Next, based on genomic breeding values that combine flowering time traits from the different environments of a geographic site, we performed large-scale mapping of FT-related loci with a random forest model, and epistasis analysis with multivariate adaptive regression spline models. In total, 437 SNPs were detected, some of which represented known FT genes, and significant interactions were found between known FT-related genes. Functional analysis indicated that the detected SNPs participate in the flower development process.

(4) Integrating a protein-protein interaction network, a gene regulatory network and a high-density genome-wide SNP set, we applied a network-based analysis to detect oil content loci of B. napus. Compared with single-variant tests, this method has higher reproducibility and reliability, and is capable of uncovering genetic factors common to multiple environments. Moreover, the candidate genes detected by this method are enriched in oil accumulation pathways and temperature stimulus response functions, implying that the oil content of B. napus is strongly affected by the environment.

Taken together, in this study we used a variety of methods to analyze complex traits, including human diseases and agronomic traits, and developed several methods for disease-miRNA prediction and for genome-wide mapping of flowering time and oil content in B. napus, which should be helpful for the study of disease mechanisms and for improving the yield and quality of plants.
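
For readers unfamiliar with random walk with restart on a bipartite network, the technique named in parts (1) and (2), the following is a minimal sketch in plain Python. It is not the dissertation's implementation: the abstract does not give the restart probability, the seeding scheme or the network weights, so the value `restart=0.5`, the choice to seed the walk at the query cancer node, the unweighted edges, and all node names (`miR-a`, `cancer_X`, and so on) are assumptions made purely for illustration.

```python
def random_walk_with_restart(edges, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    edges: iterable of undirected (u, v) pairs, e.g. (miRNA, cancer).
    seeds: nodes the walker jumps back to with probability `restart`.
    All parameter defaults are illustrative assumptions.
    """
    # Build an undirected, unweighted adjacency map.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    for s in seeds:
        if s not in neighbors:
            raise ValueError(f"seed {s!r} is not in the network")

    nodes = sorted(neighbors)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: 0.0 for n in nodes}
    for s in seeds:
        p0[s] += 1.0 / len(seeds)

    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` jump to a seed; otherwise move to a
        # uniformly chosen neighbor of the current node.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(neighbors[u])
            for v in neighbors[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: miRNA-cancer associations (not real data).
toy_edges = [
    ("miR-a", "cancer_X"), ("miR-b", "cancer_X"),
    ("miR-b", "cancer_Y"), ("miR-c", "cancer_Y"),
    ("miR-c", "cancer_Z"), ("miR-d", "cancer_Z"),
]
scores = random_walk_with_restart(toy_edges, seeds=["cancer_X"])

# Rank miRNAs not yet linked to cancer_X as candidate associations.
known = {u for u, v in toy_edges if v == "cancer_X"}
candidates = sorted(
    (n for n in scores if n.startswith("miR-") and n not in known),
    key=lambda n: scores[n],
    reverse=True,
)
```

In this toy network `candidates` ranks `miR-c` ahead of `miR-d`, because `miR-c` is fewer steps from `cancer_X`. A cross-validation like the one reported above would hide known associations one at a time and check how highly the hidden miRNA is ranked.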
Keywords/Search Tags: Complex traits, disease-related miRNA prediction, text-mining, miRNA functional similarity, Brassica napus, genomic selection, genome-wide mapping