On Identification Of Essential Protein Based On Node's Topological Parameters In Protein Networks

Posted on: 2009-08-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H B Huang
GTID: 1118360278957310
Subject: Computer application technology
Abstract/Summary:
Recent research has found that the functional essentiality of a protein is correlated with its topological properties in a protein network. The study of essential proteins (EPs) is not only meaningful for understanding the organization and processes of life activities, but also important for diagnosis, treatment, and drug design. Compared with biological experiments and other methods, bioinformatics methods based on topological structure have particular advantages in identifying EPs. However, their identification performance is still unsatisfactory. To improve it, this work proposes two approaches: one is to find new parameters more closely related to EPs, and the other is to integrate the EP information carried by known parameters. For the former, the minimal vertex cover (MVC) is used for the first time to derive a new topological parameter, the vertex cover parameter (VP). For the latter, combinations of known parameters are explored.
The exact solution to a minimal vertex cover problem (VCP) can be found within the framework of parameterized computation theory. However, this approach still has theoretical and practical limits. In this work, the theory is extended to random graphs, so that statistical properties and probabilistic methods can be used to treat a VCP as a whole and to uncover the inherent properties and evolution laws of the kernel and the degree distribution during kernelization by low-degree (degree-1 and degree-2) vertices (1-DV and 2-DV). Building on fixed-parameter tractability, the notion of d-decidability by way of kernelization (d-DBK) is proposed for the parameterized VCP of random graphs.
In the study of 1-DV kernelization (1-DVK), the mapping of 1-DVs onto the other nodes of a random graph is analyzed, and their adjacency relationship is quantified. It is found that the strength of 1-DVK reaches its maximum when ω ≤ 2.3, where ω is the average degree of the graph, and the 1-DBK of the parameterized VCP of the graph is presented.
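For readers unfamiliar with kernelization by degree-1 vertices, the standard reduction rule can be sketched as follows: whenever a vertex has exactly one neighbour, that neighbour can be placed in the cover without loss of optimality, and both vertices are then removed. This is a minimal illustration of the textbook rule, not the dissertation's own implementation; the function name `degree_one_kernelize` and the edge-list input format are assumptions made for the example.

```python
from collections import defaultdict


def degree_one_kernelize(edges):
    """Apply the degree-1 rule for vertex cover until it no longer fires.

    Hypothetical helper for illustration. Returns (cover, kernel):
    the vertices forced into the cover and the edges left in the kernel.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    cover = set()
    stack = [v for v in adj if len(adj[v]) == 1]
    while stack:
        v = stack.pop()
        if len(adj[v]) != 1:
            continue  # degree changed since v was queued
        (u,) = adj[v]  # the unique neighbour of v
        cover.add(u)
        # Remove u together with all of its incident edges.
        for w in list(adj[u]):
            adj[w].discard(u)
            if len(adj[w]) == 1:
                stack.append(w)  # w may now trigger the rule
        adj[u].clear()

    kernel = {tuple(sorted((a, b))) for a in adj for b in adj[a]}
    return cover, kernel
```

On a path with four vertices the rule alone removes every edge, while on a cycle it never fires and the whole graph remains as the kernel. That difference is what the "strength" of 1-DVK measures informally here.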
In the study of 2-DV kernelization (2-DVK), the counting method for 2-DV triangle subnetworks is analyzed, and the case in which a node is shared by more than one such subnetwork is studied. The dynamics and evolution mechanism of the kernel and the degree distribution during 2-DVK are then described. Two results follow: first, the strength of 2-DVK reaches its maximum in a random graph when the probability of 2-DVs is about 0.75; second, the parameterized VCP (G, k) of a random graph given by φ(x) is 2-DBK when k is smaller than a given value related to φ(x).
Because of its important role in network topology, the minimal vertex cover (MVC) is introduced into the study of EPs in protein networks, and VP is proposed as a new topological parameter. Computing the parameter by exact methods is likely to run into NP-hardness. To avoid this, kernelization by low-degree nodes and the combination of exact and inexact algorithms are studied, exploiting the sparseness of protein networks, which contain many ◇, △2 and ∧2 subnetworks. A quick algorithm (the A-Q algorithm) based on randomized kernelization is then presented. The identification degree of the VP generated by this algorithm is clearly better than those of existing parameters in identifying EPs, which provides evidence of the parameter's importance in describing a node's topological characteristics.
The identification of EPs is treated as a special kind of pattern recognition, grounded in topological parameters that quantify the relationships between nodes (proteins). The correlation between a protein's essentiality and its main topological parameters, including VP, is analyzed, as is the nature of the essential-node judgment made by these parameters.
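As an illustration of the 2-DV triangle case mentioned above, the standard degree-2 triangle rule for vertex cover can be sketched as follows: if a degree-2 vertex has two neighbours that are themselves adjacent, both neighbours can be placed in the cover and the triangle removed. This sketch covers only the triangle case (the non-adjacent case needs the separate folding rule, which is omitted), and it is not the A-Q algorithm; the function name `triangle_rule` is an assumption made for the example.

```python
def triangle_rule(edges):
    """Apply the degree-2 triangle rule for vertex cover until it stops firing.

    Hypothetical helper for illustration. Returns (cover, kernel):
    the vertices forced into the cover and the edges left in the kernel.
    """
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    cover = set()
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if v not in adj or len(adj[v]) != 2:
                continue
            u, w = adj[v]
            if w not in adj[u]:
                continue  # neighbours not adjacent: folding rule, not handled
            cover.update((u, w))
            # Delete u and w with all their incident edges, then delete v.
            for x in (u, w):
                for y in adj.pop(x):
                    if y in adj:
                        adj[y].discard(x)
            adj.pop(v, None)
            changed = True

    kernel = {frozenset((a, b)) for a in adj for b in adj[a]}
    return cover, kernel
```

The rule is safe because any cover must contain at least two vertices of the triangle, and choosing the two neighbours covers every edge that the degree-2 vertex could have covered.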
To study how the EP information from different single parameters (SPs) complements each other, the relation between the identification degree of a combined parameter (IDCP) and those of its SPs is derived theoretically, as is the relation between the IDCP and the correlations among its SPs. Based on these observations, the integration of EP information from different SPs is explored, and a combination method is proposed for constructing combined parameters. An asynchronous recognition algorithm is then developed, and the experimental results show that its identification ability is clearly greater than that of the other methods in identifying EPs.
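The abstract does not specify the combination method or how the identification degree is computed, so the following is only a generic sketch of the idea of combining single parameters. It assumes, purely for illustration, that parameters are combined by averaging their ranks and that identification quality is measured as the fraction of the top-k ranked proteins that are known to be essential. The names `ranks`, `combine_by_rank`, and `top_k_hit_rate` are hypothetical.

```python
def ranks(scores):
    """Map each node to its rank (0 = highest score); ties broken by label."""
    ordered = sorted(scores, key=lambda n: (-scores[n], n))
    return {n: i for i, n in enumerate(ordered)}


def combine_by_rank(*params):
    """Average the ranks given by several single parameters (lower is better).

    Assumed combination scheme for illustration only; every parameter
    must score the same set of nodes.
    """
    per_param = [ranks(p) for p in params]
    return {n: sum(r[n] for r in per_param) / len(per_param) for n in params[0]}


def top_k_hit_rate(combined, essential, k):
    """Fraction of the k best-ranked nodes that are in the essential set."""
    top = sorted(combined, key=lambda n: (combined[n], n))[:k]
    return sum(n in essential for n in top) / k


# Toy data (invented for the example, not taken from any protein network).
degree = {"a": 5, "b": 4, "c": 3, "d": 1}
other = {"a": 4, "b": 5, "c": 1, "d": 3}
combined = combine_by_rank(degree, other)
```

Rank averaging is used here only because it needs no assumptions about the scale of each parameter; the dissertation's own method may differ substantially.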
Keywords/Search Tags: essential protein, pattern recognition, protein networks, topological parameter, parameterized computation