
Research On Recognition Of Functional Modification Sites Based On Nucleic Acid Sequences

Posted on: 2019-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: J W Sun
GTID: 2370330566974119
Subject: Computer technology
Abstract/Summary:
Genome-wide profiling shows that epigenetic modifications are closely related to gene regulation, cell differentiation, and disease formation. These processes involve not only methylation but also many post-transcriptional modifications. N6-methyladenosine (m6A), the most common post-transcriptional modification of eukaryotic mRNA, plays an important role in gene regulation and expression and is closely related to genes implicated in diseases such as tumors and cancer. Low methylation of the common CpG sites in DNA sequences is likewise closely linked to the expression of many cancer genes. Automatic, accurate, and efficient identification of methylation sites on nucleic acid sequences is therefore important both for basic biology, such as the study of gene expression regulation and transcription mechanisms, and for the development of targeted drugs for various diseases.

However, given the massive volume of nucleic acid sequences in the post-genomic era, traditional wet-lab experiments remain too costly and resource-intensive for recognizing methylation sites on nucleic acid sequences. Site recognition models based on intelligent computing have emerged in response. This thesis further studies intelligent computing models for methylation sites on nucleic acid sequences. The main work is as follows:

(1) The physicochemical properties of dinucleotides in RNA sequences are studied, and a method for measuring the significance of these properties is proposed, on which a heuristic selection algorithm is built. Because traditional heuristic algorithms easily become trapped in local optima, an improved K-selection heuristic algorithm that uses the significance measure of the physicochemical properties is proposed.

(2) M6A-HPCS, a methylation site recognition method based on the properties of nucleic acids, is proposed. The method builds on two common feature representations: pseudo dinucleotide composition, and auto-covariance and cross-covariance transformation. Using the K-selection heuristic algorithm, K optimal subsets of physicochemical properties are selected for each representation. The best subsets and their corresponding feature representations are used to re-encode the sample sequences, and a support vector machine is then used to build the final prediction model. Experimental results show that the K-selection heuristic algorithm performs well and that M6A-HPCS, which is built on it, further improves the prediction accuracy of m6A methylation sites on RNA sequences.

(3) Single-view feature representation methods for methylation sites on DNA sequences are studied. To address the one-sidedness of features extracted from a single view, a multi-view feature fusion strategy is proposed.

(4) For the prediction of CpG methylation sites in DNA sequences, DNA-MFF, a recognition method based on the structural properties and statistical information of nucleic acids, is proposed. The method integrates three views (frequency statistics, location statistics, and spatial structure attributes) and uses a support vector machine to build the final prediction model. Experimental results show that the features from the three views are complementary, and that the feature vectors obtained by fusing them better reflect the patterns of DNA methylation sites and significantly improve the performance of the site prediction model.

(5) M6A-HPCS, the optimized m6A methylation site recognition method based on the K-selection physicochemical property heuristic algorithm, is provided as an online prediction service, making it convenient for subsequent researchers to further study modification sites.
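
As an illustration of the auto-covariance and cross-covariance transformation mentioned in item (2), the following is a minimal sketch and not the thesis's implementation. It assumes the commonly used definition in which each overlapping dinucleotide is mapped to a property value, the values are centred on their mean over the sequence, and products of values `lag` positions apart are averaged. The property tables `TOY_PROPERTIES`, the function names, and the example sequences are all hypothetical; the numeric values are made up and are not measured physicochemical data.

```python
from itertools import product

# The 16 RNA dinucleotides in a fixed order: AA, AC, AG, AU, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]

# Hypothetical, made-up property tables (NOT real physicochemical values).
# Real work would load measured dinucleotide property tables instead.
TOY_PROPERTIES = {
    "prop_a": {d: float(i) for i, d in enumerate(DINUCLEOTIDES)},
    "prop_b": {d: float((i * 7) % 16) for i, d in enumerate(DINUCLEOTIDES)},
}


def property_track(seq, table):
    """Map each overlapping dinucleotide of seq to its property value."""
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def covariance(track_u, track_v, lag):
    """Average product of mean-centred values at positions i and i + lag."""
    n = len(track_u) - lag
    if n <= 0:
        raise ValueError("lag must be smaller than the number of dinucleotides")
    mean_u = sum(track_u) / len(track_u)
    mean_v = sum(track_v) / len(track_v)
    total = sum(
        (track_u[i] - mean_u) * (track_v[i + lag] - mean_v) for i in range(n)
    )
    return total / n


def acc_features(seq, properties, max_lag):
    """Return auto- and cross-covariance features for lags 1..max_lag.

    With N properties the vector has max_lag * N * N entries.
    """
    tracks = {name: property_track(seq, tbl) for name, tbl in properties.items()}
    names = sorted(tracks)
    features = []
    for lag in range(1, max_lag + 1):
        # Auto-covariance: the same property at both positions.
        for u in names:
            features.append(covariance(tracks[u], tracks[u], lag))
        # Cross-covariance: every ordered pair of different properties.
        for u in names:
            for v in names:
                if u != v:
                    features.append(covariance(tracks[u], tracks[v], lag))
    return features


if __name__ == "__main__":
    vector = acc_features("GGACUGGACU", TOY_PROPERTIES, max_lag=3)
    print(len(vector), vector[:3])
```

In a pipeline like the one the abstract describes, a fixed-length vector of this kind would be computed for each sample sequence and passed to a classifier such as a support vector machine. Which properties enter the tables is exactly what a property-selection step would decide.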
Keywords/Search Tags: m6A methylation, Heuristic selection algorithm, DNA methylation, Statistics features, Pattern characteristics, Support vector machine, Online prediction services