Detection Based On Signature Of Large-scale Pathogen

Posted on: 2016-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: S J Wang
Full Text: PDF
GTID: 2284330479984908
Subject: Computer technology
Abstract/Summary:
As living standards improve, human health and safety receive increasing attention. In recent years, the emergence of new epidemic diseases has posed an unprecedented threat to human health. These diseases are commonly caused by bacteria, viruses, and other pathogens that persist in the human body and change over long periods through mutation, recombination, and evolution. The pathogens are hard to isolate early, and an epidemic is difficult to control once it breaks out, which poses a great threat to human life. If such epidemics can be detected early by technical means, the range of suspect pathogens can be narrowed, and the result can guide subsequent pathogen isolation and identification, serological testing, clinical diagnosis, and symptomatic treatment. At present, high-throughput detection and surveillance technology is not yet mature enough for widespread use, and the country still lacks an independent capability for rapid high-throughput pathogen detection. The key to building that capability is efficient identification of pathogen DNA sequences, that is, rapidly locating the target species in environmental samples. This thesis proposes a fast method for computing pathogen signatures. DNA signatures are nucleotide sequences that can be used to detect the presence of an organism and to distinguish that organism from all other species; in other words, a signature uniquely represents the target species. The thesis collected and organized a list of common microbial pathogens and their taxonomic affiliations, and downloaded the original sequence data from a well-known database using NCBI identifiers and the relationships between species.
The sequences are then described and annotated according to the experimental requirements to build a whole-genome sequence database of pathogens. Next, the genomes are compared with the whole-genome alignment tool MUMmer, using a cluster scheduling system to run the alignments in parallel, and the results are stored in a database of whole-genome match information. From this intermediate match database, the signature sequences of pathogens can be computed in linear time. The match information within the target species is intersected to obtain the sequence shared by all genomes in the target set, and the union of the match information against the background species is computed to determine which sequence is specific with respect to the background: sequence not covered by that union has no match in any background genome. Sequence that is shared within the target set and specific with respect to the background is the signature of the pathogen. Because the number of species compared is small, the signatures obtained this way cannot be guaranteed to represent the target species uniquely, so they must be screened further. This thesis uses BLAST for the screening: each candidate signature is subjected to a similarity search against a nucleic acid sequence database, and the screening is carried out by analyzing the BLAST output file and applying a threshold on qcovhsp. The resulting signature database, organized by the relationships between species, is convenient for biological researchers and medical staff to use.
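The intersect, union, and screen steps described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the thesis's implementation: matches are assumed to be already reduced to half-open intervals on one reference genome of the target set, all function names and the `min_len` parameter are hypothetical, and the BLAST rows are assumed to be tab-separated with the columns `qseqid`, `sseqid`, `qcovhsp` in that order. The direction of the qcovhsp test (reject a candidate when a non-target hit reaches the threshold) and the default of 90 are also assumptions, since the abstract does not state them. The sketch sorts intervals, so it is linear only when the inputs are already sorted.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in reference coordinates (assumed)


def union(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, disjoint interval lists with a two-pointer sweep."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Remove every position covered by b from a (both sorted and disjoint)."""
    out: List[Interval] = []
    j = 0
    for start, end in a:
        cur = start
        while j < len(b) and b[j][1] <= cur:  # skip b intervals left of this one
            j += 1
        k = j
        while k < len(b) and b[k][0] < end:
            if b[k][0] > cur:
                out.append((cur, b[k][0]))
            cur = max(cur, b[k][1])
            k += 1
        if cur < end:
            out.append((cur, end))
    return out


def signature_regions(
    target_matches: Sequence[Sequence[Interval]],
    background_matches: Sequence[Sequence[Interval]],
    ref_len: int,
    min_len: int = 1,  # hypothetical minimum signature length
) -> List[Interval]:
    """Regions matched in every target genome and in no background genome."""
    shared: List[Interval] = [(0, ref_len)]
    for matches in target_matches:
        shared = intersect(shared, union(matches))
    background = union(iv for matches in background_matches for iv in matches)
    return [(s, e) for s, e in subtract(shared, background) if e - s >= min_len]


def screen_by_qcovhsp(
    candidate_ids: Sequence[str],
    blast_lines: Iterable[str],
    target_ids: Set[str],
    threshold: float = 90.0,  # assumed cutoff, not taken from the thesis
) -> List[str]:
    """Keep candidates that have no non-target hit with qcovhsp >= threshold."""
    rejected: Set[str] = set()
    for line in blast_lines:
        if not line.strip():
            continue
        qseqid, sseqid, qcovhsp = line.rstrip("\n").split("\t")[:3]
        if sseqid not in target_ids and float(qcovhsp) >= threshold:
            rejected.add(qseqid)
    return [c for c in candidate_ids if c not in rejected]
```

Candidate identifiers are passed in separately because a query with no BLAST hit at all does not appear in the output file and should still be kept.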
Keywords/Search Tags: Pathogen, Detection, Signature Sequence, Sequence Alignment