
The Research And Software Development On Prediction Algorithm Of Promoter

Posted on: 2018-09-08  Degree: Master  Type: Thesis
Country: China  Candidate: Z Y Liang  Full Text: PDF
GTID: 2348330515951771  Subject: Biophysics
Abstract/Summary:
A promoter is a regulatory element that controls the transcription of genes. In prokaryotes, promoters are specifically recognized by the σ factors of RNA polymerase. In E. coli (Escherichia coli), the σ70 promoter controls transcription initiation for most essential genes and is therefore also called the 'housekeeping promoter'. The promoter of B. subtilis (Bacillus subtilis) serves the same function as the σ70 promoter of E. coli. Because promoters play such an important role, accurately identifying them in whole-genome sequences is of great significance for understanding gene regulation. With the growing volume of genome data and the development of computational methods and hardware, identifying promoters with machine learning methods has become highly desirable. Several computational approaches have been developed for predicting σ70 promoters in the E. coli genome and σ43 promoters in the B. subtilis genome, but these models used only short-range sequence information and disregarded the equally important long-range sequence order. This thesis therefore proposes a new method, named PseZNC, for formulating σ70 promoter samples. The model considers both short-range and long-range sequence information. The local (short-range) sequence information is obtained from a multi-window Z-curve composition, which describes base composition. The long-range information is described by the correlation of physicochemical properties between two dinucleotides. A support vector machine was selected as the classifier. In 5-fold cross-validation, the proposed method achieved an accuracy of 84.54%, sensitivity (Sn) of 80.30%, specificity (Sp) of 84.54% and AUC of 0.9088 for E. coli promoter prediction, and an accuracy of 92.20%, Sn of 88.89%, Sp of 93.83% and AUC of 0.9650 for B. subtilis promoter prediction. These results indicate that PseZNC is an efficient and reliable predictor with considerable potential for recognizing other regulatory elements. So that researchers can study E. coli promoters without re-implementing the method, we developed a user-friendly online web server named iPro70-PseZNC, which is freely available at http://lin.uestc.edu.cn/server/iPro70-PseZNC.
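As an illustration of the Z-curve composition mentioned in the abstract, the following is a minimal sketch that computes the standard three Z-curve components from base frequencies and concatenates them over windows. The abstract does not specify how the windows are chosen, so the non-overlapping fixed-size windows and the function names `z_curve` and `windowed_z_curve` are assumptions for illustration only, not the thesis's PseZNC implementation.

```python
from collections import Counter


def z_curve(seq):
    """Return the (x, y, z) Z-curve components computed from base frequencies."""
    counts = Counter(seq.upper())
    n = sum(counts[b] for b in "ACGT")  # ignore any non-ACGT characters
    if n == 0:
        return (0.0, 0.0, 0.0)
    a, c, g, t = (counts[b] / n for b in "ACGT")
    x = (a + g) - (c + t)  # purine vs. pyrimidine
    y = (a + c) - (g + t)  # amino vs. keto
    z = (a + t) - (g + c)  # weak vs. strong hydrogen bonding
    return (x, y, z)


def windowed_z_curve(seq, window):
    """Concatenate Z-curve components over consecutive non-overlapping windows.

    Assumption: fixed-size, non-overlapping windows; a trailing fragment
    shorter than `window` is dropped.
    """
    features = []
    for start in range(0, len(seq) - window + 1, window):
        features.extend(z_curve(seq[start:start + window]))
    return features
```

A feature vector built this way could be passed to any standard classifier, such as the support vector machine named in the abstract; the long-range dinucleotide correlation terms are not included here because the abstract does not list the physicochemical properties used.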
Keywords/Search Tags:promoters, PseZNC, multi-window Z-curve, physicochemical properties