
Construction, Optimization and Application of a Gene Network Information Search Engine

Posted on: 2016-07-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M M Liang
GTID: 1220330470451748
Subject: Crop Science
Abstract/Summary:
With the rapid development of high-throughput technologies, biological research has generated massive amounts of data, and bioinformatics and computational biology have developed to mine biological information from these data efficiently. However, understanding and explaining the complex phenotypes of organisms remains a problem. The biological processes and activities of an organism form a complicated network system, so the study of biological networks is key to understanding complex life activities. At present, more and more candidate genes and biomarkers associated with complex traits are being identified through Genome-Wide Association Studies (GWAS) and other biostatistical methods. However, GWAS results can only indicate candidate SNPs and genes, so further research is needed to prioritize and validate them. We integrate existing biological network information and combine various types of data to establish a unified, efficient, convenient, reliable, and scalable gene network search engine with visualization. We have built a gene network database service platform that integrates various kinds of biological network information and can store, search, and visualize it. We take biological concepts, including but not limited to genes, proteins, and phenotypes, as nodes in the network, and relations such as protein-protein interactions, gene regulation, gene-phenotype associations, and pathway relations as edges (lines) in the network. We have collected almost all kinds of biological concept and relation data, standardized their scores and formats, and developed methods for integrating various biological data, classifying relationship types, and scoring relations.
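As an illustration only, the node-and-relation model described above might be sketched in Python as follows. This is not the platform's actual implementation: the class names, the placeholder node names (GENE_A, PROTEIN_B, PHENOTYPE_C), and the assumption that standardized scores lie between 0 and 1 are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    kind: str     # relation type, e.g. "gene regulation"
    score: float  # standardized score; a 0..1 scale is assumed here


class BioNetwork:
    """Toy in-memory network: typed nodes joined by typed, scored relations."""

    def __init__(self):
        self.node_types = {}                # node name -> concept type
        self.adjacency = defaultdict(list)  # node name -> relations touching it

    def add_node(self, name, concept_type):
        self.node_types[name] = concept_type

    def add_relation(self, source, target, kind, score):
        if source not in self.node_types or target not in self.node_types:
            raise KeyError("both endpoints must be added as nodes first")
        relation = Relation(source, target, kind, score)
        # Store the relation under both endpoints so lookups work either way.
        self.adjacency[source].append(relation)
        self.adjacency[target].append(relation)

    def neighbors(self, name, min_score=0.0):
        """Return nodes linked to `name` by a relation scoring >= min_score."""
        found = set()
        for relation in self.adjacency.get(name, []):
            if relation.score >= min_score:
                other = relation.target if relation.source == name else relation.source
                found.add(other)
        return found


# Hypothetical placeholder data, not taken from the dissertation.
network = BioNetwork()
network.add_node("GENE_A", "gene")
network.add_node("PROTEIN_B", "protein")
network.add_node("PHENOTYPE_C", "phenotype")
network.add_relation("GENE_A", "PROTEIN_B", "gene regulation", 0.9)
network.add_relation("GENE_A", "PHENOTYPE_C", "gene-phenotype association", 0.4)
```

Keeping the relation type and score on each edge is what would let a search filter by relation class or by a minimum score, for example `network.neighbors("GENE_A", min_score=0.5)`.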
Complex biological networks pose a great challenge to the data storage, computing performance, and stability of a search engine platform. We carried out a series of adjustments and optimizations to the hardware platform, operating system, search engine, and user interface framework to establish a stable, fast-responding, user-friendly system. With this platform, we can efficiently use multi-dimensional networks to support gene prioritization and validation in genome-wide association analysis and to provide in-depth bioinformatics knowledge mining. We performed genome-wide association analyses on type Ⅱ diabetes and nicotine dependence data separately and used the BiopubInfo platform for the subsequent analyses. Type Ⅱ diabetes is a typical complex disease with a great impact on human health and life. Studying the gene regulation and biological processes of this disease can play a significant role in its prevention and treatment. We used the GMDR-GPU program to analyze the WTCCC type Ⅱ diabetes data and obtained significant one- to five-dimensional SNPs associated with six genes for type Ⅱ diabetes. Through biological network analysis of these six genes, we found that three have already been reported to be associated with type Ⅱ diabetes and related traits, and that the other three, newly discovered genes have many biological connections with the former three. This result provides supporting evidence for the statistical analysis and also offers one explanation of why these three genes are found only in multidimensional calculations with other SNPs. Nicotine is a highly addictive drug; quitting smoking is very difficult because of nicotine dependence, and smoking is harmful to human health. We used QTXNetwork to perform genome-wide association analysis on nicotine dependence data obtained from dbGaP.
To avoid interference from other addictive behaviors, we conducted a conditional analysis of nicotine dependence using data on addiction to four other drugs. We analyzed three groups of genes obtained from the conditional and non-conditional analyses using our platform and observed three clearly different network patterns. To some extent, this illustrates the effectiveness and necessity of the conditional analysis. We also found evidence in the network diagrams to explain why some genes showed up only when interaction effects were considered. Through these two examples, we illustrate that this gene network search platform can help in the subsequent analysis and verification of GWAS results.
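The follow-up step used in both examples, checking how newly found candidate genes connect to genes already reported for a trait, could look roughly like the Python sketch below. It is a simplified illustration under stated assumptions, not the platform's query logic: the function name `cross_links`, the plain edge list, and all gene labels (KNOWN_1, NOVEL_1, and so on) are hypothetical placeholders rather than results from the dissertation.

```python
def cross_links(edges, known, novel):
    """For each novel gene, collect the known genes it is directly linked to."""
    links = {gene: set() for gene in novel}
    for a, b in edges:
        # Edges are treated as undirected, so check both orientations.
        if a in links and b in known:
            links[a].add(b)
        if b in links and a in known:
            links[b].add(a)
    return links


# Hypothetical placeholder network and candidate lists.
edges = [
    ("KNOWN_1", "NOVEL_1"),
    ("NOVEL_1", "KNOWN_2"),
    ("KNOWN_3", "NOVEL_2"),
    ("NOVEL_3", "OTHER_1"),
]
known = {"KNOWN_1", "KNOWN_2", "KNOWN_3"}
novel = {"NOVEL_1", "NOVEL_2", "NOVEL_3"}

links = cross_links(edges, known, novel)
for gene in sorted(links):
    print(gene, "->", sorted(links[gene]))
```

A novel gene with several links to previously reported genes would be a natural candidate for prioritization, while one with none (NOVEL_3 in this toy data) would need other supporting evidence.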
Keywords/Search Tags: Biological Networks, Gene Networks, Database, GWAS