The Research On Molecular Network Of Complex Disease Based On Multiple-order Information

Posted on: 2016-05-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X T Yu
GTID: 1220330482964118
Subject: Operational Research and Cybernetics
Abstract/Summary:
With the development of bioinformatics and the emergence of multi-level high-throughput omics data, life science research has entered the big-data era. Biological research is extending from single molecules to multiple molecules and their interactions. Computational systems biology provides the theoretical basis and technologies for mining biological big data. In particular, networks are increasingly used as a powerful tool to characterize and analyze high-throughput data. Complex diseases are major threats to human health, and their pathogenesis, early diagnosis and treatment attract the most attention in biomedical fields. How to use molecular networks and omics data to understand pathogenesis and identify disease genes is an urgent question at present.

In this dissertation, we construct new molecular network models to analyze the pathogenesis, early diagnosis and enriched biological pathways of complex diseases by combining the mathematical characteristics of complex networks with high-throughput omics data. Current disease molecular networks cannot completely reflect the random fluctuations and noise of real networks. Firstly, we introduce the edge-network model and algorithm. Secondly, because clinical samples are scarce, we use multiple-order statistics to construct a molecular network from a single sample, i.e. the differential expression network, and further use this new network model to improve the enrichment analysis of biological pathways involved in diseases. Our main research work and findings include:

1. To establish a new molecular network model with the first- and second-order statistical information from biological omics data (i.e. gene expression data). Using multiple-order information allows the dynamical process of the real biomolecular network to be reconstructed as far as possible.
Based on theoretical and applied research, we found that the network combined with second-order statistics (i.e. the edge network) can narrow down the candidate disease-causing genes and thus identify disease genes more accurately. In an analysis of temporal gene expression data from H3N2 influenza infection, we found that the disease genes identified by the edge network not only predict the virus infection effectively but also do so at an early stage. That is to say, these genes are effective biomarkers for the early prediction of H3N2 flu infection.

2. To construct the single-sample molecular network with multiple-order statistics from single-sample data. In contrast to basic theoretical research, samples are very limited in clinical diagnosis, so the single-sample molecular network is more practical for the study of complex diseases. By designing reasonable quantitative indicators with additivity, we found that the differential expression network (based on the integration of gene expression and gene expression correlation) can extract more differential information between normal and disease networks, and thereby improve the precision and robustness of disease prediction. In the analysis of gene expression data on prostate cancer and diabetes, we obtained similar results: comprehensively assessing the contributions of various kinds of differential information to disease prediction; mining significantly differential modules with biological significance; identifying functional modules related to disease heterogeneity, e.g. alternative splicing; and screening module biomarkers with high precision and robustness.

3. To develop integrative enrichment analysis based on the differential expression network, in order to compare multiple-order information networks with the traditional first-order information network.
Traditional biological pathway enrichment analysis considers only differences in first-order information; by contrast, integrative enrichment analysis also considers differences in second-order information, which provides a new perspective for studying disease molecular networks. By designing a new hypergeometric test for double-level differences, we found that integrative enrichment analysis is well suited to heterogeneous samples because it combines differences in both expression mean and variance. In the model assessment, we compared it with many state-of-the-art enrichment analysis methods on a variety of disease datasets, and the results strongly support the advantage and scalability of integrative enrichment analysis. In the study of a typical heterogeneous disease, i.e. diabetes, integrative enrichment analysis effectively identified the dysregulated pathways and the potential diabetes subtypes marked by these pathways.

In conclusion, our study of disease molecular networks contributes to understanding disease pathogenesis and to early prediction. The multiple-order information molecular network (i.e. the edge network) contains more comprehensive information from omics data, which allows pathogenic genes to be recognized more precisely and provides a more sophisticated analytical tool for disease prediction. The single-sample molecular network (i.e. the differential network model) overcomes the lack of sample data in practical applications, and provides a theoretical and technical basis for personalized medicine.
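
The distinction between first-order and second-order information can be illustrated with a small sketch. It is not the dissertation's algorithm, which the abstract does not specify. It assumes that the first-order feature is a gene's mean expression and that the second-order "edge" feature for a gene pair is the per-sample product of z-scores, whose average over samples is the Pearson correlation. The function names and the toy expression values are hypothetical.

```python
from statistics import fmean, pstdev


def zscores(values):
    """Standardize one gene's expression across samples (population std)."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("constant expression has no z-score")
    return [(v - mu) / sigma for v in values]


def edge_values(gene_x, gene_y):
    """Assumed second-order feature: per-sample product of z-scores.

    Averaging these products over samples gives the Pearson correlation
    of the two genes, so each product is one sample's contribution to
    the edge between them.
    """
    zx, zy = zscores(gene_x), zscores(gene_y)
    return [a * b for a, b in zip(zx, zy)]


def pearson(gene_x, gene_y):
    """Pearson correlation as the mean of the per-sample edge values."""
    return fmean(edge_values(gene_x, gene_y))


# Hypothetical toy data: two genes measured in five samples per group.
normal_x = [1.0, 2.0, 3.0, 4.0, 5.0]
normal_y = [2.1, 3.9, 6.2, 8.0, 9.8]
disease_x = [1.0, 2.0, 3.0, 4.0, 5.0]
disease_y = [9.9, 8.1, 6.0, 4.2, 1.8]

# First-order comparison: difference in mean expression of each gene.
mean_diff_x = fmean(disease_x) - fmean(normal_x)
mean_diff_y = fmean(disease_y) - fmean(normal_y)

# Second-order comparison: difference in correlation of the gene pair.
corr_diff = pearson(disease_x, disease_y) - pearson(normal_x, normal_y)

if __name__ == "__main__":
    print("mean difference, gene x:", mean_diff_x)
    print("mean difference, gene y:", mean_diff_y)
    print("correlation difference, edge x-y:", corr_diff)
```

In this toy example both genes have essentially the same mean expression in the two groups, so a purely first-order comparison sees no change, while the correlation between them flips sign. This is the kind of difference that a network using only expression means would miss.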
Keywords/Search Tags: Complex diseases, Multiple-order information molecular network, Single-sample molecular network, Enrichment analysis, Personalized medicine