
Assembly of the Klebsiella pneumoniae Genome Based on High-Throughput Sequencing Technology

Posted on: 2014-11-16   Degree: Master   Type: Thesis
Country: China   Candidate: S C Zhang   Full Text: PDF
GTID: 2250330398999472   Subject: Microbiology
Abstract/Summary:
In recent years, the rapid development of a new generation of high-throughput sequencing technology has made it an important method for studying genomes. However, its short reads, low precision, and large data volumes also pose challenges for bioinformatics. How to extract valuable information from high-throughput sequencing data has therefore become one of the hot topics in bioinformatics and computational biology research. In this work, the whole genome of Klebsiella pneumoniae was sequenced with the widely used Illumina/Solexa sequencing technology, in order to establish a working platform for the assembly and analysis of the sequencing data and to explore general rules for processing high-throughput sequencing data. The experiments comprise the following main components:

1. To address plasmid contamination, this paper proposes eliminating the plasmid first and then sequencing. High temperature combined with SDS treatment was tried for plasmid elimination. The results show that the best elimination conditions for Klebsiella pneumoniae are a temperature of 45 °C and an SDS concentration of 0.3%. Under these combined conditions, the plasmid elimination rate reached 41.7%. A plasmid-free strain was eventually obtained, providing suitable material for the next step.

2. Three assembly programs, Velvet, ABySS, and SOAPdenovo, were used to assemble the raw sequencing data, each with different assembly parameters, to obtain a range of assembly results. The goal was to find the best assembly software and its optimal parameters, as a reference for future research. For the Klebsiella pneumoniae genome, the best assembler was Velvet, and the optimal parameter was a k-mer size of 27.

3. Most of the software is based on de novo assembly, and how to further assemble the resulting contigs is still a problem. This study used a reference-genome approach: based on the reference sequence, the contigs were rearranged to form scaffolds, which were then assembled according to their overlapping sequences. The best scaffold set obtained had number = 104 and N50 = 98160 bp. This method provides a new approach to sequence assembly.

4. To find the most similar reference genome, and thereby improve the accuracy of contig rearrangement, the experiment used online NCBI BLAST, a 16S rDNA phylogenetic tree, and the mapping rate. The best reference genome was Klebsiella pneumoniae KCTC2242.

This experiment obtained most of the Klebsiella pneumoniae KG2 genome sequence, laying a foundation for follow-up studies of metabolic pathways and molecular modification. In the absence of a breakthrough in DNA sequencing methods, genome sequence assembly remains an important research topic in bioinformatics. This paper analyzes whole-genome sequence assembly software in detail and proposes concrete solutions to the problems of reference genome screening and plasmid gene insertion, providing a reference for researchers engaged in sequence assembly.
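
The abstract reports scaffold quality as an N50 value. For readers unfamiliar with the metric, the following is a minimal Python sketch of how N50 is commonly computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the thesis.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Sequences are visited from longest to shortest; N50 is the length
    of the sequence at which the running total first reaches half of
    the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths, not data from the thesis.
toy_lengths = [100, 80, 50, 30, 20, 10]
print(n50(toy_lengths))  # 80: 100 + 80 = 180 covers half of the 290 bp total
```

A higher N50 at a similar total length generally indicates a more contiguous assembly, which is why it is a natural figure for comparing runs of different assemblers or k-mer sizes.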
Keywords/Search Tags: plasmid elimination, high-throughput DNA sequencing, short sequence assembly, the reference genome