Research And Implementation Of Exon Prediction Method

Posted on: 2020-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: F Xiao
GTID: 2370330572988454
Subject: Software engineering
Abstract/Summary:
With the arrival of the big-data era, biological data has been produced in large volumes, and this growth has driven the rapid development of bioinformatics. Gene research is an important part of bioinformatics, and exon prediction plays an important role in it. During protein synthesis, a gene is first transcribed into pre-mRNA, the introns are spliced out, and the joined exons form the mature mRNA that is translated into protein. The conserved regions of genes are usually associated with exons, so exons can also be used to analyze the evolutionary relationships between organisms and to construct evolutionary trees. In recent years, whole-exome sequencing has developed rapidly and can effectively find pathogenic mutations. Accurate exon prediction is therefore of great significance. However, exons make up only a small fraction of a gene, they alternate with introns, and their distinguishing characteristics are not obvious, so exon prediction remains an important and difficult problem in bioinformatics.

At present, exon prediction methods fall into two types: methods based on sequence alignment and methods based on statistical analysis. This thesis studies both. It proposes an exon prediction method based on BLAST pairwise sequence alignment and a prediction method based on statistical features of exons. The main research work is summarized as follows.

(1) An exon prediction algorithm based on BLAST pairwise sequence alignment is proposed. BLAST is a heuristic algorithm. When BLAST compares two sequences, it generates too many locally optimal alignment fragments, and the exon boundaries in the prediction results are blurred. The improved method proposed in this thesis aligns the DNA sequence with the RNA sequence several times. After each alignment, the DNA sequence is refined according to the alignment results, so that the alignment range is gradually reduced. The boundaries of the final result are then segmented according to the features of exons, and the segmented result is the predicted exon sequence.

(2) An improved exon prediction method based on the statistical features of exons is proposed. In the feature statistics stage, 100 human genes and the 1224 exons corresponding to these genes were downloaded from the GenBank database on the NCBI website, and 11 characteristics of genes and exons were extracted from these data. In the method design stage, the gene is subjected to repetitive sequence analysis in order to segment the non-repetitive sequence fragments in the gene. According to the characteristics obtained in the statistics stage, the positions of exons in the non-repetitive sequences are determined step by step, and the predicted exon sequences are output.
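The boundary problem described in (1) can be illustrated with a small Python sketch. It is not the thesis's algorithm: exact seed-and-extend matching stands in for BLAST, the boundary refinement assumes the canonical GT...AG intron rule as the "exon feature" (the abstract does not say which features are used), and the function names, the seed length of 8, and the toy sequences are all hypothetical.

```python
def find_exon_blocks(genomic, cdna, seed=8):
    """Greedy seed-and-extend (a toy stand-in for BLAST, not BLAST itself).

    Returns half-open (start, end) intervals on `genomic` that together
    spell out `cdna` from left to right.
    """
    blocks = []
    g_pos = 0  # only search to the right of the previous block
    c_pos = 0  # next unmatched position in the transcript
    while c_pos < len(cdna):
        kmer = cdna[c_pos:c_pos + seed]
        hit = genomic.find(kmer, g_pos)
        if len(kmer) < seed or hit == -1:
            break  # remainder too short to seed, or not found
        length = seed
        # Extend the exact match to the right for as long as bases agree.
        while (c_pos + length < len(cdna)
               and hit + length < len(genomic)
               and cdna[c_pos + length] == genomic[hit + length]):
            length += 1
        blocks.append((hit, hit + length))
        g_pos = hit + length
        c_pos += length
    return blocks


def refine_with_gt_ag(genomic, blocks):
    """Shift each exon junction left until the intron reads GT...AG.

    Assumption: canonical splice sites. A shift by k is only allowed when
    the k bases moved from the end of one block to the start of the next
    are identical, so the blocks still spell the same transcript.
    """
    blocks = list(blocks)
    for i in range(len(blocks) - 1):
        (s1, e1), (s2, e2) = blocks[i], blocks[i + 1]
        for k in range(e1 - s1):
            if genomic[e1 - k:e1] != genomic[s2 - k:s2]:
                break  # shifting further would change the transcript
            donor = genomic[e1 - k:e1 - k + 2]
            acceptor = genomic[s2 - k - 2:s2 - k]
            if donor == "GT" and acceptor == "AG":
                blocks[i] = (s1, e1 - k)
                blocks[i + 1] = (s2 - k, e2)
                break
    return blocks


# Toy data (invented for illustration, not taken from the thesis).
exon1, exon2 = "ATGGCTAGCAAG", "GATTCACCGGTTGA"
intron = "GTAAGTCTTTTCTTCAG"
genomic = "CCCCC" + exon1 + intron + exon2 + "TTTTT"
cdna = exon1 + exon2

raw = find_exon_blocks(genomic, cdna)
refined = refine_with_gt_ag(genomic, raw)
```

In this toy case the first base of the intron happens to equal the first base of the second exon, so the raw match overruns the true junction by one base, which is a small instance of the blurred boundaries the abstract mentions. The refinement step moves the junction back to the position where the intron starts with GT and ends with AG.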
Keywords/Search Tags: Genes, Exons, Sequence alignments, Statistical analysis, Prediction method