| Gene sequence identification is an important area of research in biology, and it mainly uses computer simulation to model experiments. A promoter is an important element that mainly regulates the transcription and translation of a gene sequence, so promoter recognition is an important part of gene sequence recognition. This article focuses mainly on plant promoter recognition. Research on eukaryotic promoter recognition is currently relatively abundant, especially for mammalian promoters (human promoters in particular), but there has been comparatively little work on another important class of eukaryotic promoters, plant promoters. In recent years, as plant promoter data have grown richer, their identification has become an active topic among scholars. Existing identification methods, however, generally suffer from high false-positive rates, so reducing false positives is a key difficulty. In this paper we study the characteristics of plant promoters in depth, draw on a large body of research literature, and propose two improved plant promoter recognition algorithms aimed at problems that arise in the identification process. The first is a plant promoter recognition algorithm based on rough sets and double-stranded DNA characteristics, which combines the ability of rough sets to extract the main features with the good classification performance of the SVM. 
First, the content features and structural features of the promoter are extracted; rough sets are then used to select the main structural and content features that distinguish promoters from non-promoters. These, together with the double-stranded features of DNA, form the input vector to the SVM, which performs the classification. The SVM classifier consists of five sub-classifiers: a 3’ UTR-promoter sub-classifier, a 5’ UTR-promoter sub-classifier, an intergenic-promoter sub-classifier, a CDS-promoter sub-classifier, and an exon-promoter sub-classifier. Finally, a gene sequence is classified from the outputs of the five sub-classifiers. The second algorithm is a plant promoter recognition algorithm based on the TATA box and GC preference. Its distinguishing feature is that it exploits both the structural difference between TATA-box promoters and TATA-less promoters and GC preference. First, promoters are divided into TATA-box promoters and TATA-less promoters; then each kind of promoter, together with the non-promoters, is divided into GC-preferring and non-GC-preferring sequences; finally, the structural features used for classification are extracted. This algorithm also uses an SVM classifier. The results show that both algorithms achieve improved results. |
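
The partition step of the second algorithm (TATA-box versus TATA-less, then GC-preferring versus non-GC-preferring) can be illustrated with a minimal Python sketch. The abstract does not state how a TATA box is detected or how GC preference is measured, so everything specific here is an assumption made for illustration: the consensus motif `TATAWAW` (W = A or T), the GC-content threshold of 0.5, and the hypothetical function names `gc_content`, `has_tata_box`, and `assign_group`. Feature extraction and SVM training are not shown.

```python
import re
from typing import Tuple

# Assumption: TATA-box consensus TATAWAW (W = A or T).
# The abstract does not specify the motif actually used.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_tata_box(seq: str) -> bool:
    """Return True if the assumed TATA-box motif occurs anywhere in seq."""
    return TATA_PATTERN.search(seq.upper()) is not None


def assign_group(seq: str, gc_threshold: float = 0.5) -> Tuple[str, str]:
    """Place a sequence in one of four groups: TATA status x GC preference.

    gc_threshold is a hypothetical cut-off, not a value from the paper.
    """
    tata = "TATA-box" if has_tata_box(seq) else "TATA-less"
    gc = "GC-preferring" if gc_content(seq) >= gc_threshold else "non-GC-preferring"
    return tata, gc


if __name__ == "__main__":
    for s in ["GCGCTATAAATGCGC", "AAAATTTT"]:
        print(s, assign_group(s))
```

Under this sketch, each of the four groups would then get its own structural-feature extraction and SVM, in line with the description above; how the original work handles borderline sequences is not stated in the abstract.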