
Research And Application Of Essential Protein Identification Method Based On PPI Network

Posted on: 2018-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Hong
Full Text: PDF
GTID: 2310330515956856
Subject: Computer technology
Abstract/Summary:
Proteins are an important component of the tissues and cells of the human body and are the material basis of life. Every important component of an organism requires the participation of proteins. An essential protein is generally defined as a protein whose removal by gene-knockout mutation causes the loss of the associated biological function and leaves the organism unable to survive or causes disease. Essential proteins play an important role in survival and reproduction, and removing them can leave an organism unable to survive or to function normally. The identification of essential proteins has therefore become a vital part of the life sciences. In this thesis, we put forward essential protein prediction algorithms based on the topology of the protein-protein interaction (PPI) network and the fusion of multi-source biological information. The main contents cover the following three aspects:

(1) An improved PageRank algorithm called EPP (Essential Proteins Predict) is proposed to identify essential proteins. The algorithm treats the PPI network as an uncertain network whose vertices carry attributes, ranks the importance of the vertices, and takes the top P percent of vertices as essential. The method first calculates the similarity between vertices, taking into account the reliability and semantic similarity information of the proteins. Second, it considers the neighbor information of each vertex in the PPI network, that is, it calculates the vertex's neighborhood similarity. Finally, it computes the importance of each vertex from these two similarities. The algorithm considers both topological and biological information, so it has the advantages of low complexity and high recognition accuracy. Experimental results on standard datasets show that the proposed algorithm identifies more essential proteins with higher accuracy.

(2) An improved PSO algorithm called EPPSO (Essential Protein PSO) is proposed to identify essential proteins. Rather than assessing the essentiality of each protein against a single indicator, the algorithm uses a measure of the top-P essential proteins as a whole. Each candidate solution contains P proteins, and the essentiality of these P proteins is measured jointly. The fitness function is the closeness of the correlation between these proteins and the other proteins. Following the idea of particle swarm optimization, the candidate solutions are then updated by tracking the global and individual optimal values. To evaluate the performance of the algorithm, we run it on standard datasets such as the yeast dataset. The experimental results show that the algorithm has better recognition accuracy than other methods. In addition, because the algorithm only needs to identify P essential proteins, it does not need to calculate the essentiality of every protein against an index, so it has lower computational complexity.

(3) Building on the above work, this thesis designs a web-based essential protein identification system that uses the two prediction algorithms described above. The system presents prediction results graphically, so it is convenient and efficient. Testing shows that the system is stable and has an attractive interface, and that it has good economic and social value.
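The ranking step described in (1) can be illustrated with a weighted PageRank over an undirected PPI graph. This is only a minimal sketch, not the EPP algorithm itself: it assumes each edge already carries a single non-negative weight standing in for the combined similarity score, and the function names, the damping factor of 0.85, and the input format are assumptions made for illustration.

```python
def weighted_pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Rank vertices of an undirected weighted graph.

    edges: dict mapping (u, v) -> non-negative weight (hypothetical
    stand-in for a combined similarity score between two proteins).
    """
    # Build symmetric weighted adjacency lists.
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    nodes = sorted(adj)
    n = len(nodes)
    # Total edge weight at each vertex, used to normalise outgoing rank.
    strength = {u: sum(adj[u].values()) for u in nodes}
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by vertices with zero total weight is spread uniformly.
        dangling = sum(rank[u] for u in nodes if strength[u] == 0)
        new_rank = {}
        for v in nodes:
            incoming = sum(
                rank[u] * adj[u][v] / strength[u]
                for u in adj[v]
                if strength[u] > 0
            )
            new_rank[v] = (1 - damping) / n + damping * (incoming + dangling / n)
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


def top_percent(rank, p):
    """Return the top p percent of vertices by rank (at least one)."""
    k = max(1, int(len(rank) * p / 100))
    ordered = sorted(rank, key=lambda u: (-rank[u], u))
    return ordered[:k]
```

For example, calling `weighted_pagerank` on a small star-shaped graph and then `top_percent(rank, 20)` selects the hub vertex, since it receives rank from every neighbor.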
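The idea in (2) of scoring a whole set of P proteins, and moving candidate sets toward the individual and global best sets, can be sketched as a simplified set-based swarm search. This is not the EPPSO update rule, which the abstract does not specify, and it is not the standard continuous velocity update of PSO. The fitness used here (total weight of edges linking the chosen proteins to proteins outside the set) is an assumed stand-in for the "correlation closeness" measure, and all names and parameter values are hypothetical.

```python
import random


def fitness(subset, adj):
    # Assumed fitness: total weight of edges from chosen proteins
    # to proteins outside the chosen set.
    return sum(w for u in subset for v, w in adj[u].items() if v not in subset)


def move(current, personal_best, global_best, nodes, rng,
         c1=0.3, c2=0.3, mutation=0.1):
    """Swap members of a candidate set toward the best-known sets."""
    new = set(current)
    for protein in list(current):
        r = rng.random()
        if r < c1:
            pool = personal_best - new       # pull toward individual best
        elif r < c1 + c2:
            pool = global_best - new         # pull toward global best
        elif r < c1 + c2 + mutation:
            pool = set(nodes) - new          # random exploration
        else:
            continue
        if pool:
            # One out, one in: the set size stays at P.
            new.remove(protein)
            new.add(rng.choice(sorted(pool)))
    return frozenset(new)


def swarm_top_p(adj, p, particles=20, iterations=50, seed=0):
    """Search for a set of p vertices with high joint fitness.

    adj: dict mapping vertex -> {neighbor: weight} (symmetric).
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    swarm = [frozenset(rng.sample(nodes, p)) for _ in range(particles)]
    personal = list(swarm)
    best = max(personal, key=lambda s: fitness(s, adj))
    for _ in range(iterations):
        for i, current in enumerate(swarm):
            swarm[i] = move(current, personal[i], best, nodes, rng)
            if fitness(swarm[i], adj) > fitness(personal[i], adj):
                personal[i] = swarm[i]
        best = max(personal + [best], key=lambda s: fitness(s, adj))
    return best
```

Because only sets of size P are evaluated, no per-protein score is computed for the whole network, which mirrors the complexity argument made in (2). The result is a heuristic and is not guaranteed to be the optimal set.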
Keywords/Search Tags:Essential protein, PageRank, PSO