Database Construction And Molecular Network Analysis For Coronary Artery Disease Genes

Posted on: 2012-11-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Liu
Full Text: PDF
GTID: 1114330335455125
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Coronary artery disease (CAD) is a complex, multifactorial disease and a leading cause of mortality worldwide. Over the past decades, great efforts have been made to elucidate the genetic basis of CAD, and a massive amount of data has accumulated. Three main types of methods are used to study the genetics of CAD: linkage analysis and positional cloning, candidate gene association studies, and genome-wide association studies (GWAS). Using these methods, many genes involved in different molecular processes and pathways, such as lipid and lipoprotein metabolism, thrombogenesis, the renin-angiotensin system, immunity and inflammation, glucose metabolism, degradation and genesis of the extracellular matrix, vascular smooth muscle cell abnormalities, and others, have been found to be associated with CAD. These findings provide important clues to the mechanisms underlying the pathogenesis of CAD. However, owing to the complexity of CAD, its genetic basis is not yet fully explained by these genes, and many more genes remain to be identified.

Because it has become increasingly difficult to find CAD genes using the methods mentioned above, systems approaches have attracted attention as a powerful strategy for tackling this complex problem. For this thesis project, this study therefore set out to develop a systematic method, based on bioinformatics analysis of the accumulated data, to study the genetics of CAD. The goal is to find new genes and to uncover the complex relations among the known CAD genes and between them and the new ones.

First, to integrate the accumulated data for further data mining and to provide a useful resource for the research community, this study developed a comprehensive database of CAD-related genes (CADgene, http://www.bioguo.org/CADgene/).
The CAD-related evidence for 318 CAD candidate genes was manually extracted from over 2,000 publications of genetic studies, and these genes were classified into 12 functional categories based on their roles in the pathogenesis of CAD. For each gene, detailed information from related studies (e.g. case-control sample size, population, SNP, odds ratio, and P-value) was extracted and useful annotations were added. These annotations include general gene information, Gene Ontology annotations, KEGG pathways, protein-protein interactions and others. In addition, CADgene provides cumulative data from 10 publications of CAD-related genome-wide association studies (GWAS). CADgene has a user-friendly web interface with multiple browse and search functions. It is freely available at http://www.bioguo.org/CADgene/. The CADgene database serves as a valuable tool for cardiovascular researchers exploring the genes and mechanisms of CAD.

Next, this study developed a systematic method to analyze the 318 collected genes step by step.

First, a set of programs was developed to screen the whole human genome and identify all genes closely related to the 318 genes in CADgene. For each gene, several genome-wide analyses based on protein-protein interaction (PPI) and pathway data were performed to reveal its functionally related genes. In total, 752 genes were obtained after multiple iterative analyses.

Second, a "P-relation" was defined: two proteins have a P-relation if they are located in the same pathway and interact with each other. Using matrix analysis and the defined P-relation, 360 of the 752 proteins were selected, each having a P-relation with at least one protein collected in CADgene.

Third, a molecular network analysis of these P-related genes and the original CADgene genes was performed on the basis of P-relations.
The topology of the network formed by these genes, including its three layers, four parts and seven network modules, was analyzed, which uncovered the network relations among these genes. Based on this, 179 genes from two of the four parts, such as CAV1, LDLRAP1, NOX1 and CYBB, could be considered functionally related CAD genes with high probability.

This study attempts to study the genetic basis of CAD mainly through bioinformatics and, to our knowledge, is the first such attempt. The findings above give a preliminary demonstration of the power of the systematic method developed in this study and provide confidence, experience and data for later research.
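
As a rough illustration of the P-relation selection step described above, the following Python sketch marks two proteins as P-related when they both share a pathway and interact, then keeps the candidates that are P-related to at least one seed (CADgene) protein. It is not the thesis's implementation: the abstract mentions matrix analysis without giving details, so plain set intersection is used here instead, and every function name, protein identifier and pathway name below is a hypothetical toy value.

```python
from itertools import combinations


def co_pathway_pairs(pathways):
    """Return all unordered protein pairs that share at least one pathway."""
    pairs = set()
    for members in pathways.values():
        # Sorting gives each pair a canonical (smaller, larger) order.
        for a, b in combinations(sorted(set(members)), 2):
            pairs.add((a, b))
    return pairs


def p_relations(pathways, interactions):
    """Pairs that interact AND share a pathway (the P-relation in the text)."""
    # Normalize interaction pairs to the same canonical order; drop self-loops.
    ppi = {tuple(sorted(pair)) for pair in interactions if pair[0] != pair[1]}
    return ppi & co_pathway_pairs(pathways)


def select_p_related(candidates, seeds, relations):
    """Candidates that have a P-relation with at least one seed protein."""
    candidates, seeds = set(candidates), set(seeds)
    selected = set()
    for a, b in relations:
        if a in seeds and b in candidates:
            selected.add(b)
        if b in seeds and a in candidates:
            selected.add(a)
    return selected


# Hypothetical toy data, not taken from CADgene.
toy_pathways = {
    "pathway_1": ["SEED_A", "GENE_X", "GENE_Y"],
    "pathway_2": ["SEED_B", "GENE_Z"],
}
toy_interactions = [
    ("SEED_A", "GENE_X"),  # same pathway and interacting: P-related
    ("SEED_B", "GENE_Y"),  # interacting but no shared pathway: not P-related
    ("GENE_Z", "SEED_B"),  # same pathway and interacting: P-related
    ("GENE_X", "GENE_Y"),  # P-related, but neither protein is a seed
]

relations = p_relations(toy_pathways, toy_interactions)
selected = select_p_related(
    candidates={"GENE_X", "GENE_Y", "GENE_Z"},
    seeds={"SEED_A", "SEED_B"},
    relations=relations,
)
print(sorted(selected))
```

In this toy case GENE_Y is excluded: it shares a pathway with one seed and interacts with another, but never both with the same seed, which is the distinction the P-relation is meant to capture.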
Keywords/Search Tags: Coronary artery disease, Complex disease, Gene database, Molecular network analysis