
A Novel 2D Graphical Representation For Proteins Based On Graph Energy And Its Application

Posted on: 2018-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: D D Sun
Full Text: PDF
GTID: 2310330512986520
Subject: Operational Research and Cybernetics
Abstract/Summary:
The number of biological sequences in public databases is growing rapidly with the development of sequencing techniques. Inferring the information hidden in large numbers of sequences effectively and accurately, and analyzing the relationships among biological sequences, are important tasks in molecular biology and bioinformatics. Proteins are the material basis of life, and protein research focuses on structure and function. Because the primary sequence of a protein determines its higher-order structure, using graphical representations of protein sequences to analyze biological evolution is very important. We propose a novel graphical representation for proteins based on 6 typical physicochemical properties of amino acids, and obtain the graph energy of the 20 amino acids from the relationships between amino acids, which makes the representation more visual and reasonable. Given the difficulty of dealing with sequences of different lengths, the advantage of our method is clear. The main strengths of our method are as follows:

(1) Six typical properties are considered in constructing a representative graph for every amino acid. The physicochemical properties of amino acids are more important than other factors in determining the rate and pattern of protein evolution; therefore, these properties have a direct and significant impact on the estimated distance between two polypeptide sequences.

(2) The 2D graphical representation of a protein sequence is obtained by applying graph energy. Graph energy is a meaningful quantity in the analysis of graphs, and it suits our construction of graphs for the 20 amino acids.

(3) Based on the graphical representation, the protein sequence is converted to a multidimensional vector by a formula, with the dimension of the vector determined by theory and experiment. We then build the distance matrix of the protein sequences by calculating the distances among them, and use it to analyze the similarities/dissimilarities of the protein sequences.

(4) To validate the reasonableness and effectiveness of the proposed method, we apply it to the ND5 dataset, 24 vertebrates, and 36 protein sequences. The results are consistent with those of existing methods and are even more reasonable.

The results indicate that the graphical representation of amino acids is reasonable, that the graph energy of amino acids can identify amino acids, and that the graphical representation of protein sequences is feasible.
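
The two computational ingredients named in the abstract, graph energy and a distance matrix over sequence vectors, can be illustrated with a minimal sketch. It assumes the standard definition of graph energy (the sum of the absolute values of the eigenvalues of the adjacency matrix) and assumes Euclidean distance between vectors; the abstract does not state which graph construction, vector formula, or distance the thesis uses, so the toy graphs, the toy vectors, and the function names `symmetric_eigenvalues`, `graph_energy`, and `distance_matrix` are all hypothetical and are not the thesis's method.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-20, max_sweeps=100):
    """Eigenvalues of a small real symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    for _ in range(max_sweeps):
        # Stop once the off-diagonal part is numerically zero.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle chosen so that the new a[p][q] becomes zero.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                # Update columns p and q (A <- A J).
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # Update rows p and q (A <- J^T A).
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def graph_energy(adjacency):
    """Graph energy under the standard definition: sum of |eigenvalue|."""
    return sum(abs(lam) for lam in symmetric_eigenvalues(adjacency))


def distance_matrix(vectors):
    """Pairwise Euclidean distances (an assumed choice of distance)."""
    n = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Toy graphs only; these are NOT the amino-acid graphs from the thesis.
    single_edge = [[0, 1], [1, 0]]
    triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    print(graph_energy(single_edge), graph_energy(triangle))

    # Toy fixed-length vectors standing in for converted protein sequences.
    toy_vectors = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    for row in distance_matrix(toy_vectors):
        print(row)
```

Mapping every sequence to a vector of the same dimension is what lets a single distance matrix cover sequences of different lengths; such a matrix is the usual input to clustering or tree-building when comparing sequences.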
Keywords/Search Tags: physicochemical properties of amino acids, graph energy, graphical representation of proteins, similarity/dissimilarity analysis