Gene Similarity Analysis Based On Ramanujan-Fourier Transform

Posted on: 2018-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2310330512486603
Subject: Applied Mathematics
Abstract/Summary:
With the rapid development of science and technology, the volume of gene and protein data collected by researchers is growing day by day. The focus of bioinformatics has gradually shifted from obtaining and accumulating data to analysing and interpreting it. This massive body of biological data contains rich biological information, and extracting as much of that information as possible is meaningful work. More and more biological, medical and pharmaceutical researchers have recognized the practicality and importance of bioinformatics, and many mathematicians and computer scientists have also been attracted to this emerging interdisciplinary field. Similarity analysis of biological sequences is one of its most basic and important tasks: problems such as molecular evolution and genetic identification are built on it. Sequence alignment is the traditional method for biological sequence similarity analysis, but it has drawbacks: it requires many user-chosen parameters and is computationally intensive and time-consuming. Alignment-free methods were proposed to supplement and extend alignment methods, and they have rapidly become a research hotspot in similarity analysis.

Based on the Voss mapping and the Ramanujan Fourier transform (RFT), we obtain an improved alignment-free method for gene sequences, use it to study biological sequence similarity, and construct phylogenetic trees. Specifically, in this thesis we propose an alignment-free method for biological sequences based on the RFT power spectrum and use it to analyse the similarity of sequences in selected databases. We represent each DNA sequence as four binary indicator sequences and apply the improved RFT to the indicator sequences to obtain a set of RFT coefficients. The Euclidean distance between the RFT coefficients is used as the similarity measure, and phylogenetic trees are constructed from the resulting distances with the unweighted pair group method with arithmetic mean (UPGMA). To compute RFT coefficients for sequences of different lengths, we pad the shorter binary indicator sequences with zeros so that every binary sequence has the length of the longest sequence in the database. The DNA sequences are thus compared in a space of the same dimension without loss of information. Compared with the discrete Fourier transform method and the multiple sequence alignment method, the improved method gives relatively good clustering results at a lower computational cost.
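
The steps described above (Voss indicator sequences, zero padding, RFT coefficients, Euclidean distance) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the thesis's implementation: it uses the textbook finite-length Ramanujan Fourier coefficient rather than the improved transform, whose details the abstract does not give; the function names and the cut-off `max_q` are hypothetical; and it stops at the pairwise distance matrix, which would then be passed to a UPGMA routine.

```python
from math import cos, dist, gcd, pi

BASES = "ACGT"  # order of the four Voss indicator sequences (an assumption)


def ramanujan_sum(q, n):
    """Ramanujan sum c_q(n): sum of cos(2*pi*a*n/q) over 1 <= a <= q, gcd(a, q) = 1."""
    total = sum(cos(2 * pi * a * n / q) for a in range(1, q + 1) if gcd(a, q) == 1)
    return round(total)  # c_q(n) is always an integer; rounding removes float noise


def totient(q):
    """Euler's phi(q): how many 1 <= a <= q are coprime to q."""
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)


def voss_indicators(seq, length):
    """Voss mapping: four binary indicator sequences, zero-padded to `length`."""
    seq = seq.upper()
    pad = [0] * (length - len(seq))
    return [[1 if ch == base else 0 for ch in seq] + pad for base in BASES]


def rft_coefficients(x, max_q):
    """Textbook finite RFT: X(q) = 1 / (phi(q) * N) * sum_{n=1..N} x(n) * c_q(n)."""
    n_total = len(x)
    coeffs = []
    for q in range(1, max_q + 1):
        # c_q(n) is q-periodic in n, so one period is enough.
        period = [ramanujan_sum(q, m) for m in range(1, q + 1)]
        # x[i] holds x(n) for n = i + 1, and c_q(i + 1) == period[i % q].
        acc = sum(value * period[i % q] for i, value in enumerate(x))
        coeffs.append(acc / (totient(q) * n_total))
    return coeffs


def feature_vector(seq, length, max_q):
    """Concatenate the RFT coefficients of the four indicator sequences."""
    features = []
    for indicator in voss_indicators(seq, length):
        features.extend(rft_coefficients(indicator, max_q))
    return features


def distance_matrix(seqs, max_q=12):
    """Pairwise Euclidean distances after padding to the longest sequence."""
    length = max(len(s) for s in seqs)
    feats = [feature_vector(s, length, max_q) for s in seqs]
    return [[dist(u, v) for v in feats] for u in feats]


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    for row in distance_matrix(["ACGTACGT", "ACGTACGA", "AAAATTTT", "ACGTAC"]):
        print(["%.4f" % d for d in row])
```

Because every sequence is padded to the same length before the transform, all feature vectors have `4 * max_q` entries and can be compared directly. The choice of `max_q` here is arbitrary; the abstract does not say how many coefficients the thesis retains.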
Keywords/Search Tags: DNA sequence, Alignment-free, Similarity analysis, Phylogenetic tree, Discrete Fourier transform, Ramanujan Fourier transform