
Research On Rapid Detection Algorithm Of Metagenomic Pathogen Based On Specific Region

Posted on: 2022-01-26 | Degree: Master | Type: Thesis
Country: China | Candidate: B Hong
GTID: 2544306323972079 | Subject: Computer Science and Technology
Abstract/Summary:
Continuous innovation in sequencing technology has gradually reduced the cost of sequencing to a clinically acceptable range. A sequencer can obtain hundreds of millions of nucleotide sequences in a single experiment, and the large volume of sequence information produced allows researchers to analyze clinical pathogens quickly and to characterize the composition of pathogen samples conveniently. Metagenomic sequencing sequences all microorganisms in the environment of the target pathogen, not only the pathogen itself. This overcomes the limitations of traditional methods in specificity and efficiency, and it also brings new insight into how various pathogens harm human health.

First, this paper selects two main sources of pathogen infection, foodborne pathogens and common human viruses, as its research objects. Using Panseq and the BLAST online comparison tool, specific regions are generated for ten common foodborne pathogens and ten common human viruses. Where real metagenomic sample data are insufficient, the wgsim tool is used to generate simulated data at different abundances and sequencing depths for the pathogenic bacteria and viruses.

Next, this paper proposes the Snipe algorithm for the rapid detection of foodborne pathogens. The algorithm uses Bowtie2 to align sample reads to the specific regions and then, based on the alignment results, uses a Bayesian method to compute posterior probabilities that correct the initial abundance estimates. To verify the algorithm's performance comprehensively, low-abundance experiments, simulated-data experiments and real-data experiments are carried out. The Snipe algorithm is then adapted to the characteristics of viral metagenomes, yielding the SnipePlus algorithm. SnipePlus first uses SNAP to filter human-related sequences out of the sample and then refines the conditions used to judge alignments, improving the algorithm's results. It is evaluated with low-abundance experiments and simulated-data experiments.

Efficiency and accuracy have always been the main challenges for metagenomic classification and analysis algorithms. Finally, therefore, the Snipe algorithm is optimized for efficiency: this paper proposes the Snipe2 algorithm, which keeps the loss in strain-level identification accuracy within 10% while increasing speed by about 5 times.

In summary, based on the concept of specific regions, this paper proposes a Bayesian framework that can correct the results of current metagenomic classification and analysis algorithms, reducing false positives and improving accuracy. Within this framework it proposes Snipe for foodborne pathogens, SnipePlus for viral metagenomes, and Snipe2 as an efficiency-optimized version of Snipe. In experiments, all three algorithms achieved the expected results.
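The abstract does not give the details of the Bayesian correction, so the following is only a minimal sketch of the general idea, not the Snipe implementation. It assumes each read comes with a hypothetical per-genome alignment likelihood (for example, derived from Bowtie2 alignment scores against the specific regions) and that the initial abundance estimate serves as the prior. The function name `posterior_abundance`, the input layout, and the iterative update are all illustrative assumptions.

```python
from typing import Dict, List


def posterior_abundance(
    read_likelihoods: List[Dict[str, float]],
    prior: Dict[str, float],
    iterations: int = 20,
) -> Dict[str, float]:
    """Refine abundance estimates with repeated Bayesian updates.

    read_likelihoods: one dict per read, mapping genome name to an
        assumed likelihood of the read given that genome (hypothetical
        input; genomes the read did not align to can be omitted).
    prior: initial abundance estimate per genome, summing to 1.
    """
    abundance = dict(prior)
    for _ in range(iterations):
        totals = {genome: 0.0 for genome in abundance}
        used_reads = 0
        for likelihood in read_likelihoods:
            # Unnormalized posterior: P(genome) * P(read | genome)
            weights = {
                genome: abundance[genome] * likelihood.get(genome, 0.0)
                for genome in abundance
            }
            norm = sum(weights.values())
            if norm == 0.0:
                continue  # read supports none of the candidate genomes
            used_reads += 1
            for genome, weight in weights.items():
                totals[genome] += weight / norm
        if used_reads == 0:
            break
        # New abundance = average posterior assignment over usable reads
        abundance = {g: t / used_reads for g, t in totals.items()}
    return abundance


if __name__ == "__main__":
    # Eight reads align only to genome A; two reads are ambiguous.
    reads = [{"A": 1.0}] * 8 + [{"A": 0.5, "B": 0.5}] * 2
    print(posterior_abundance(reads, {"A": 0.5, "B": 0.5}))
```

In this toy input, genome B is supported only by ambiguous reads, so its estimated abundance shrinks toward zero over the iterations. This is one way a posterior-based correction can suppress false positives.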
Keywords/Search Tags:Pathogen Detection, Metagenomic Sequencing, Specific Region, Sequence Alignment, Bayesian Algorithm