
Bioinformatics Analysis Of Microbial Identification, Evolution And Drug Resistance Based On High-throughput Sequencing

Posted on: 2017-01-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z W Li
GTID: 1220330488455784
Subject: Drug Analysis
Abstract/Summary:
In recent years, high-throughput sequencing (HTS) technology has developed rapidly and achieved a series of breakthroughs, gradually leading us into the post-genome era. Compared with traditional chips and low-throughput sequencing, HTS obtains massive genome, transcriptome, or epigenome data more efficiently and quickly. HTS can also explore unknown sequences at whole-genome scale, which makes it possible to analyze biology comprehensively and to address biological problems from a new perspective. At the same time, research on microbial genomes can be applied in medicine, industry, agriculture, and other fields, and there are many international projects such as the Human Microbiome Project (HMP) and the Earth Microbiome Project (EMP). The combination of HTS technology and microbial genomics therefore plays an important role in microbiology. For example, in unknown pathogen detection and infectious disease prevention, HTS can be used to identify a pathogen quickly, and even to analyze its host and drug resistance, inferring the possible sources of the pathogen and host, and the route of transmission, through evolutionary traceability analysis. In metagenomics, microbial diversity and metabolic function enrichment can be analyzed from large volumes of sequencing data in order to study the intrinsic relationship between a specific phenotype and the metagenome.

As HTS is widely adopted in microbial genomics, it also brings a new challenge: the analysis of large volumes of data. The diverse and massive data produced by HTS, together with the highly individualized biological questions to be answered, make it urgent to develop comprehensive and effective bioinformatics analysis procedures.
This research adopts a variety of bioinformatics analyses for microbial pathogen identification, phylogenetic analysis, and drug resistance analysis. We first study the theory and algorithms for sequencing data processing, producing both a suitable sequencing data analysis method and an optimized data analysis strategy. Building on pathogen identification, bioinformatics approaches to microbial phylogenetic analysis are then discussed, including phylogenetic relationships, time of divergence, the rate of genetic evolution, protein family analysis, and recombination in microbial communities. In addition, pathogenicity analysis of microbial sequencing data is carried out, with a focus on drug resistance. In the last part of the dissertation, a series of bioinformatics approaches to microbial sequencing data analysis are summarized, including a universal analysis pipeline for microbial data.

Accurate and efficient bioinformatics approaches to microbial sequence identification are of great importance for pathogen identification, the prevention and control of infectious diseases, and food safety. The second chapter therefore studies bioinformatics analysis methods for microbial sequence data. There are two main kinds of methods for such data, alignment and assembly, each of which has finer subdivisions. To obtain reliable identification results, the methods should be adjusted to the actual situation. Starting from mainstream methods, different algorithms and pipelines are selected by comparing their accuracy and efficiency. We propose not only a suitable method for microbial sequence identification but also practical guidelines for different situations. At the end of the second chapter, sequencing data of typical DNA and RNA viral pathogens are used as practical examples.
Detailed analysis and verification of the pathogen identification methods are based on massive sequencing data from simulations, public databases, and in-house sequencing. Finally, we obtain comprehensive solutions for processing different data types, covering various sequencing lengths, sequencing depths, mixed infection, and genome fusion.

In the third chapter, we study analysis methods for microbial evolution based on HTS data. Related background and theory are introduced first, and then a case of Helicobacter pylori infection is chosen as an example of microevolution. The rapid development of HTS technology has produced molecular-level data on an unprecedented scale, which has brought revolutionary changes to traditional phylogenetics. Based on "omics" data, more comprehensive knowledge can be acquired and the bias of traditional methods can be reduced. We give a detailed introduction to the theory and methods of HTS-based analysis, including data pre-processing, model selection, and phylogenetic tree construction. The second half of the third chapter presents a microevolution analysis of a Helicobacter pylori infection, in which as many as 18 H. pylori strains were isolated from the patient. Unlike the general method, the subsequent analysis takes recombination events into account. This approach reconstructs the microevolutionary history of the infection more accurately, and the time of divergence at different stages of infection is calculated. In our analysis, the H. pylori strains fall into two different clades. Recombination events are the main driving factor of evolution, and the evolution rates of the two clades are clearly different.
Further analysis of the recombination pattern and the restriction-modification system implies that deficiency of the restriction-modification system in the clade with the high evolution rate may cause more recombination events. Moreover, gene exchange among different strains occurs during microevolution, and a trend of progressive genomic convergence in recent years is found. Overall, we discuss microevolution analysis from theory to practice, covering both general and personalized methods, which can serve as a methodological guideline.

Drug resistance analysis of microbial genomes is of great significance in the diagnosis and treatment of disease. As an example, this dissertation presents a study of drug resistance in Staphylococcus aureus. We find resistance to quinupristin/dalfopristin (collectively termed QDA, which is not yet marketed in China) in three porcine S. aureus ST9 isolates. Whole-genome sequencing is used to investigate the genetic features of the resistance, and we identify the lsa(E) gene, which encodes an ABC transporter, as the likely cause of QDA resistance. In further comparative genomic analysis, all three isolates are found to be deficient in the type III restriction-modification system. The analysis method described in this part can serve as a reference for similar research.

At the end of the dissertation, all of our studies are summarized, including pathogen identification, phylogenetic analysis, and drug resistance analysis based on HTS data, and a plan for future research is proposed.
Keywords/Search Tags: high-throughput sequencing, microbiome, bioinformatics, algorithm and pipeline