
Research on Prediction of Horizontal Gene Transfer in Bacteria Based on Sequence Features

Posted on: 2013-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Tan
Full Text: PDF
GTID: 2250330401450693
Subject: Signal and Information Processing
Abstract/Summary:
Horizontal gene transfer (HGT), also known as lateral gene transfer, is any process in which an organism incorporates genetic material from another organism without being the offspring of that organism. HGT gives a species biological characteristics that can help it adapt better to its ecological environment. Studies have demonstrated that pathogenic bacteria can exchange genetic material with different commensal bacteria and probiotics. Acquiring exogenous DNA by HGT is a common phenomenon in bacteria, and in many cases it may be a key factor in bacterial evolution. The prediction of HGT therefore has both biological and medical significance.

HGT prediction methods can be roughly divided into four categories: atypical composition, anomalous phylogenetic distribution, abnormal sequence similarity, and unusual phyletic patterns. This thesis focuses on the first category. Although relatively old, it is still widely used, is computationally inexpensive, and requires little prior knowledge such as orthologous sequence matching. Its major problem is that the same sequence features are applied to all parts of the genome, so a highly conserved protein-coding sequence can be mistaken for a horizontally transferred gene, and the false positive rate is high.

To address this problem, the thesis first considers dinucleotide relative abundance and absolute codon frequency, defines four dinucleotide relative abundances according to codon position, and then optimizes these four features by a grid method to form a new feature for predicting HGT. Second, whereas traditional approaches use only one similarity measure and one evaluation criterion, this thesis uses four similarity measures and three evaluation criteria.

Finally, the threshold is not set as in previous methods, which rely on the mean and variance of the difference between the window features and the overall sequence features. Instead, it is set according to the premise that sequence composition is the same within a species but differs from that of foreign sequences: whichever similarity measure is chosen, the scores of native sequences are close to one another, and so are their derivatives. The thesis therefore uses the nearest neighbor algorithm to classify the derivatives of the sequence scores in order to identify HGT.
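
As an illustration of the composition features mentioned in the abstract, the following is a minimal sketch of dinucleotide relative abundance using the commonly used ratio rho_xy = f_xy / (f_x * f_y), computed either over all overlapping dinucleotides or only over those starting at a given codon position. The abstract does not specify the thesis's exact four features, its grid optimization, or its similarity measures, so the function names, the codon position indexing (reading frame assumed to start at index 0), the use of position-specific base frequencies, and the mean absolute difference used as a distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_relative_abundance(seq, start=None):
    """Return rho[xy] = f_xy / (f_x * f_y) for all 16 dinucleotides.

    start=None counts every overlapping dinucleotide.
    start=0, 1 or 2 counts only dinucleotides that begin at that codon
    position (assumption: the reading frame starts at index 0).
    f_x and f_y are the base frequencies at the first and second
    dinucleotide positions among the counted pairs (an assumption).
    """
    seq = seq.upper()
    step = 1 if start is None else 3
    first_index = 0 if start is None else start
    pairs = [seq[i:i + 2] for i in range(first_index, len(seq) - 1, step)]
    # Skip pairs containing ambiguous symbols such as N.
    pairs = [p for p in pairs if set(p) <= set(BASES)]
    n = len(pairs)
    if n == 0:
        return {x + y: 0.0 for x, y in product(BASES, repeat=2)}
    pair_counts = Counter(pairs)
    first_counts = Counter(p[0] for p in pairs)
    second_counts = Counter(p[1] for p in pairs)
    rho = {}
    for x, y in product(BASES, repeat=2):
        fx = first_counts[x] / n
        fy = second_counts[y] / n
        fxy = pair_counts[x + y] / n
        rho[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return rho


def delta_distance(rho_a, rho_b):
    """Mean absolute difference between two abundance profiles."""
    return sum(abs(rho_a[k] - rho_b[k]) for k in rho_a) / len(rho_a)
```

In a window-based scan, a profile computed for each window would be compared against the whole-genome profile with `delta_distance`, and windows with unusually large scores would be candidates for foreign origin. How those scores are thresholded or classified is the part the thesis handles with its nearest neighbor step, which this sketch does not attempt to reproduce.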
Keywords/Search Tags: Horizontal Gene Transfer, Sequence Characteristic, Dinucleotide Relative Abundance, Nearest Neighbor Method, Threshold