Building The Cluster Of Bacterial Essential Gene Model And Hence Constructing Its Minimal Gene Set

Posted on: 2016-10-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y N Ye
Full Text: PDF
GTID: 1220330482981361
Subject: Biomedical engineering
Abstract/Summary:
Essential genes are indispensable for the survival of living organisms. The study of essential genes has recently become a hot topic in microbiology, medicine, genomics, and bioinformatics. Because essential genes are so important to organisms, they are cornerstones of synthetic biology and potential targets for the design of antimicrobial drugs and vaccines. In this thesis, we propose a model of clusters of essential genes built from the essential genes of organisms, and construct CEG, the first database of clusters of essential genes. Based on the CEG database, we developed a tool named CEG_Match for predicting essential genes, and built a bacterial minimal gene set and minimal metabolic network. Using the species in the CEG database as reference species, we simulated the fitness of 2186 bacteria and constructed the first database of integrated fitness information for microbial genes (IFIM). Details of each part are as follows:

(1) We proposed a model of the cluster of essential genes and constructed the Cluster of Essential Genes (CEG) database, which contains clusters of orthologous essential genes. Starting from the experimentally determined essential genes of 16 strains (15 species), we regarded the genes in these species with the same function as one cluster. As a result, we obtained 932 clusters with two or more essential genes and 1929 pseudo-clusters with only one essential gene. The CEG database differs from existing databases in that it stores essential genes as orthologous groups rather than as single genes, which makes it easier for researchers to use. For example, from the size of a cluster, users can easily decide whether an essential gene is conserved across multiple bacterial species or is species-specific. Moreover, the database contains the similarity value of every essential gene cluster against human proteins or genes.
Properties contained in the CEG database, such as cluster size and the similarity of essential gene clusters to human proteins or genes, are very important for evolutionary research and drug design.

(2) Based on the CEG database, the CEG_Match tool was developed to predict essential genes according to function rather than sequence. Thus, CEG_Match can predict whether a gene of known function is essential without the gene needing to be sequenced. An advantage of CEG is that it clusters essential genes by function, and it therefore has a lower false positive rate in predicting essential genes than the similarity alignment method. In addition, the running time of CEG_Match is much shorter than that of BLAST.

(3) Knowledge of an organism’s fitness for survival is important for a complete understanding of microbial genetics and for effective drug design. Current essential gene databases provide only binary essentiality data from genome-wide experiments. In the third part, we integrated the experimentally determined bacterial essential gene data from CEG, combined them with theoretical predictions, and developed a novel database of integrated quantitative fitness information for microbial genes (IFIM). The IFIM database currently contains data from 11 bacterial single-gene knockout and transposon mutagenesis experiments in CEG, a yeast experiment, and 2186 theoretical predictions. The highly significant correlation between the experiment-derived fitness data and our computational simulations demonstrated that the computer-generated predictions were often as reliable as the experimental data. The data in IFIM can be accessed easily, and the interface allows users to browse the gene fitness information it contains. IFIM is the first resource that allows easy access to fitness data for microbial genes.
We believe this database will contribute to a better understanding of microbial genetics and will be useful in designing drugs against microbial pathogens, especially when experimental data are unavailable.

(4) Finally, we constructed a bacterial minimal gene set and minimal metabolic network based on the CEG database. A minimal gene set is critical for assembling a minimal artificial cell. Although a few bacterial minimal gene sets have been reported, they preserve only the self-reproduction system and have limited metabolic capabilities. To construct a reliable and complete bacterial minimal gene set, we systematically revised the conventional process by taking experimentally determined essential genes as the starting point, introducing a half-retaining strategy, and constructing a viable metabolic network anew. A minimal gene set comprising 315 genes and 431 reactions was obtained, of which 157 genes are involved in the minimal metabolic network. To the best of our knowledge, this is the first reported minimal gene set that preserves both self-reproduction and self-maintenance systems. Besides confirming 20 known hub metabolites, we found five novel hub metabolites in the minimal metabolic network. Furthermore, we found that highly essential genes tend to distribute their connecting metabolites across more reactions, possibly indicating a mechanism by which bacteria reduce lethal risk: more connecting metabolites are retained when one specific reaction is destroyed. Finally, we discussed the possible implications of the minimal gene set: it may expand the pool of targets for designing broad-spectrum antibacterial drugs that reduce bacterial pathogen resistance, and it may help design a synthetic platform cell for wide biotechnological applications.

In summary, we made a comprehensive exploration of the integration and application of bacterial essential genes and the minimal gene set. This work will contribute to the prediction of essential genes and to synthetic biology research.
However, some problems remain that require further in-depth study.
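
As an illustration of parts (1) and (2), the idea of grouping essential genes by shared function and then looking a gene up by its function can be sketched as follows. This is a minimal sketch and not the CEG or CEG_Match implementation: the abstract does not describe how functions are compared, so exact matching of normalized annotation strings is an assumption, and the function names (`normalize`, `build_clusters`, `split_by_size`, `match_by_function`) and the placeholder records are hypothetical.

```python
from collections import defaultdict


def normalize(function_label):
    # Assumption: functions are compared as case-insensitive,
    # whitespace-normalized annotation strings.
    return " ".join(function_label.lower().split())


def build_clusters(records):
    """Group (species, gene_id, function) essential-gene records by function."""
    clusters = defaultdict(list)
    for species, gene_id, function in records:
        clusters[normalize(function)].append((species, gene_id))
    return dict(clusters)


def split_by_size(clusters):
    """Separate clusters (two or more genes) from pseudo-clusters (one gene)."""
    multi = {f: genes for f, genes in clusters.items() if len(genes) >= 2}
    pseudo = {f: genes for f, genes in clusters.items() if len(genes) == 1}
    return multi, pseudo


def match_by_function(clusters, function_label):
    """Return the cluster members for a function, or an empty list if none exists."""
    return clusters.get(normalize(function_label), [])


# Hypothetical placeholder records, not data from the CEG database.
records = [
    ("species_A", "gene_1", "Function X"),
    ("species_B", "gene_7", "function x"),
    ("species_A", "gene_2", "Function Y"),
]
clusters = build_clusters(records)
multi, pseudo = split_by_size(clusters)
hits = match_by_function(clusters, "FUNCTION X")
print(len(multi), len(pseudo), len(hits))
```

In this sketch the length of the returned list plays the role of cluster size: a function matched by genes from several species suggests a conserved essential gene, while a single member corresponds to a pseudo-cluster.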
Keywords/Search Tags: Experimentally-determined essential gene, Cluster of essential gene, Fitness of gene, Minimal gene set, Minimal metabolic network