
Statistical Characteristics And Critical Behavior Of Protein Networks Based On Coevolutionary Analysis

Posted on: 2021-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: J X Shi
Full Text: PDF
GTID: 2370330602479492
Subject: Physics
Abstract/Summary:
In many real systems, although the interactions between individuals are relatively simple, new phenomena appear on larger scales, namely emergent behavior. Such systems are usually classified as complex systems. The emergent behavior of complex systems is one of the central focuses of modern scientific research, and it is difficult to describe directly with the existing laws of physics. In recent decades, researchers have been developing new methodologies and tools to describe such systems, and the ideas and methods of complex network studies have been used to characterize their behavior.

Biological systems are often considered typical many-body complex systems, which show complex behavior at very different levels. Among them, the behavior of biological macromolecules, in particular proteins, is especially interesting. Protein molecules are not only the structural basis of individual organisms but also the main performers of various biological functions. They play an extremely important role in biological systems and are themselves typical complex systems. To carry out their biological functions, protein molecules usually need to fold into a well-defined three-dimensional structure. The interactions between residues therefore need to be strong enough for the protein to fold quickly and maintain a stable three-dimensional structure. On the other hand, when proteins perform biological functions, they need enough flexibility to change rapidly from one conformation to another, which requires that the interactions between residues not be too strong. It is speculated that, to meet these conflicting requirements simultaneously, the amino acid sequences of proteins have evolved to keep the molecules near a critical state, in which they have maximum flexibility while maintaining structural integrity, thereby balancing stability and adaptability.

Revealing the critical behavior of protein molecules has been a challenging subject in biophysics in recent years. A recent study analyzed the correlations of the fluctuations of amino acid positions using nuclear magnetic resonance (NMR) structures of natural proteins and found that the correlation length is scale-free, which shows that natural proteins are in a critical state from the perspective of structural dynamics. This critical behavior should be the result of the natural evolution of protein amino acid sequences, and far more sequences are available than structures. Revealing the critical behavior at the level of the amino acid sequence is therefore a more fundamental, and more challenging, starting point. Such research has not been reported yet.

In this thesis, we build a co-evolution amino acid network using complex network methods to reveal the features of the critical behavior of protein molecules at the sequence level. In addition, by analyzing the co-evolution amino acid networks of a series of proteins with catalytic functions (i.e., enzymes), this thesis reveals the correlation between catalytic active sites and topological properties of the coevolution network, thereby providing a method for predicting enzyme catalytic sites from amino acid sequences. The main work is summarized as follows:

1. Statistical characteristics and critical behavior of protein amino acid networks based on co-evolution analysis. Through sequence alignment of the amino acid sequences of a protein family, the correlation of amino acid mutations at different sites, namely the coevolution information, was extracted. Then, based on the maximum entropy algorithm and the mean-field approximation, direct coupling analysis was performed on the co-evolution information to calculate the direct information, which characterizes the direct coupling strength between amino acids. The co-evolution amino acid network was constructed by connecting the directly coupled residue pairs. Based on this network, the statistical distribution of the edge weights and the correlation of the nodes were analyzed. The main results are as follows:

1) The edge weights of the network follow a power-law distribution. This indicates that the network contains a few strongly coupled residue pairs, which play a vital role in the structure and function of proteins by stabilizing the overall structure and responding effectively to external disturbances of different intensities.

2) The long-range correlation characteristics of the coevolutionary network were studied using a mechanical model and the sequence correlation information. The correlation length was found to increase monotonically with the system size, which suggests that the systems have no characteristic correlation length and are therefore scale-free.

3) The weighted coevolution networks have fractal dimensions. The calculations show that the information entropy follows a power-law relationship with the box size, which indicates that the system has fractal characteristics.

These three properties of the coevolutionary network are all typical of a system near a critical state. The results therefore show, at the sequence level, that natural proteins have evolved close to the critical state.

2. Topological features of the catalytic residues by coevolution analysis. Catalyzing chemical reactions is one of the most typical functions performed by proteins, and the proteins that do so are classified as enzymes. Predicting the catalytic sites of enzymes from their sequence information is a long-standing goal. Our work analyzed in detail the correlation between the catalytic sites of a series of typical enzymes and the topological features of their coevolutionary networks. The results show that the nodes corresponding to catalytic sites usually have distinctive centrality properties, including degree centrality, betweenness centrality, closeness centrality and Laplacian centrality. This makes it possible to use co-evolution information to identify important functional sites of proteins. In addition, the structural properties of the nearest neighbors of the catalytic sites in the coevolutionary network were analyzed. Residues strongly coupled to a catalytic site are usually in direct spatial contact with it, and the residues coupled to different catalytic sites tend to form relatively independent modules. These results suggest that the pathways regulating the functional dynamics of residues at different catalytic sites are independent.

In summary, based on complex network methods, this thesis reveals the critical behavior of natural protein molecules at the sequence level and establishes the correlation between the topological properties of the coevolutionary network and the catalytic sites of enzymes. These results provide a new understanding of the sequence-structure-function relationship of protein molecules.
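
The network construction and centrality analysis summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the 5x5 score matrix, the threshold of 0.5, and the function names build_network, degree_centrality and closeness_centrality are all hypothetical and chosen only for illustration. A real direct-information matrix would come from direct coupling analysis of a multiple sequence alignment, which is not reproduced here, and closeness is computed on hop counts rather than on edge weights.

```python
from collections import deque


def build_network(di, threshold):
    """Connect residue pairs whose coupling score exceeds `threshold`.

    `di` is a symmetric square matrix (list of lists) of pairwise scores.
    Returns an adjacency dict: node -> {neighbour: edge weight}.
    """
    n = len(di)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if di[i][j] > threshold:
                adj[i][j] = di[i][j]
                adj[j][i] = di[i][j]
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """(Number of reachable nodes) / (sum of hop distances to them).

    Distances are unweighted hop counts found by breadth-first search;
    isolated nodes get 0.0.
    """
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# Hypothetical coupling scores for five residues (made-up numbers, not data).
DI = [
    [0.0, 0.9, 0.8, 0.7, 0.1],
    [0.9, 0.0, 0.2, 0.1, 0.1],
    [0.8, 0.2, 0.0, 0.1, 0.1],
    [0.7, 0.1, 0.1, 0.0, 0.6],
    [0.1, 0.1, 0.1, 0.6, 0.0],
]

network = build_network(DI, threshold=0.5)  # threshold is an assumption
degree = degree_centrality(network)
closeness = closeness_centrality(network)
hub = max(degree, key=degree.get)  # residue with the most strong couplings
```

In this toy network residue 0 is coupled to three of the four other residues, so it ranks first in both degree and closeness centrality. This is the kind of signal the abstract reports for catalytic sites, although whether it separates catalytic from non-catalytic residues in a real protein family depends on the alignment and the chosen threshold.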
Keywords/Search Tags: complex networks, co-evolution, direct coupling analysis, critical behavior, enzymatic catalytic site