Research On Several Issues Of Essential Protein Prediction In Protein-Protein Interaction Network

Posted on: 2021-03-25   Degree: Master   Type: Thesis
Country: China   Candidate: L Y Ma   Full Text: PDF
GTID: 2370330602975163   Subject: Computer Science and Technology
Abstract/Summary:
Essential proteins are the main components of the physiological and metabolic pathways of cells and play an important role in the development and survival of living organisms. Predicting essential proteins helps us understand the basic needs of cells and discover potential drug targets. With the rapid development of high-throughput experimental technology, a large amount of protein-protein interaction (PPI) data has accumulated, which makes essential protein prediction possible. In recent years, the prediction of essential proteins based on PPI networks has attracted growing attention and become a research hotspot. Because traditional biological experiments are very costly, many researchers use computational methods to predict essential proteins. However, existing prediction methods still have the following problems: (1) PPI data contain a large number of false positives; (2) PPIs are dynamic and uncertain, and change with the environment. These problems reduce the accuracy of essential protein prediction. Focusing on these two problems, this thesis explores the topological features of PPI networks together with multi-source biological information, and designs effective essential protein prediction methods. The main work of this thesis is as follows:

(1) To address the false-positive problem in PPI data, we propose an essential protein prediction method, DEP-MSB, which combines network topology characteristics with multi-source biological information. The method first defines an initial importance score for each protein in the network. It then uses the network topological features and the biological properties of the proteins to construct the attribute matrix and weight matrix of the vertices, and updates the importance score of each protein during a Markov random walk. The algorithm is optimized using gradient descent. Finally, the proteins are sorted in descending order of importance score, and the top k are taken as the predicted essential proteins. To evaluate the DEP-MSB algorithm, we performed a series of experiments on three different yeast PPI datasets: DIP, MIPS, and MBD. The experiments verify the effectiveness of DEP-MSB in predicting essential proteins. Compared with other state-of-the-art essential protein prediction methods, DEP-MSB performs better on various evaluation criteria.

(2) Because protein-protein interactions are mostly dynamic and transient, we propose an essential protein prediction algorithm, IEP-PCD, which is based on protein complexes and biological information in a dynamic weighted PPI network. First, we use the 3-sigma method and gene expression data to construct the dynamic weighted PPI network. Then, on the premise that essential proteins are more likely to occur in protein complexes and work together, we use core-attachment structures to predict protein complexes in the constructed network. Finally, we predict essential proteins by integrating protein complexes and biological information. The experimental results show that, on the DIP PPI dataset, IEP-PCD is more accurate than other methods at predicting protein complexes. Moreover, the method can predict more essential proteins.

(3) To handle the uncertainty in protein-protein interactions, we propose an essential protein prediction algorithm, ETB-UPPI, based on the uncertain PPI network. First, the SimRank problem on the uncertain PPI network is transformed into a SimRank calculation on a deterministic PPI network. Then, SimRank is used to calculate protein similarity in the PPI network. Finally, the SimRank similarity score, GO similarity, Pearson correlation coefficient, and subcellular localization information are combined in a score function that measures the importance of each protein. Experiments on four real yeast PPI datasets show that ETB-UPPI obtains better prediction results.
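The score-then-rank pipeline described in (1) can be illustrated with a minimal sketch. It assumes a generic random walk with restart on a weighted PPI graph, not the DEP-MSB update itself: the attribute matrix, weight matrix, and gradient-descent step are not specified in this abstract. The function names, the `restart` parameter, and the toy graph are all hypothetical.

```python
from collections import defaultdict


def random_walk_scores(edges, initial, restart=0.15, iters=100, tol=1e-10):
    """Generic random walk with restart (an assumption, not DEP-MSB).

    edges:   iterable of (protein_u, protein_v, positive_weight)
    initial: dict mapping protein -> non-negative initial importance score
    """
    # Build a symmetric weighted adjacency map for the undirected network.
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    nodes = sorted(adj)

    # Normalise the initial scores into a restart distribution.
    total = sum(initial.get(n, 0.0) for n in nodes)
    prior = {n: initial.get(n, 0.0) / total for n in nodes}

    # Total edge weight at each node, used to normalise transitions.
    strength = {n: sum(adj[n].values()) for n in nodes}

    score = dict(prior)
    for _ in range(iters):
        new = {}
        for n in nodes:
            # Probability mass arriving at n from each neighbour m.
            walk = sum(score[m] * w / strength[m] for m, w in adj[n].items())
            new[n] = (1 - restart) * walk + restart * prior[n]
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


def top_k(score, k):
    """Sort proteins by descending score (ties broken by name), keep k."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Toy network with made-up protein names and unit edge weights.
toy_edges = [("A", "B", 1.0), ("A", "C", 1.0), ("A", "D", 1.0), ("B", "C", 1.0)]
toy_initial = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
toy_scores = random_walk_scores(toy_edges, toy_initial)
predicted = top_k(toy_scores, 2)
```

In this sketch the initial scores act as the restart distribution, so biological evidence could be injected by raising the initial score of proteins that it supports. How the thesis actually combines such evidence with the walk is not stated here.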
Keywords/Search Tags: protein-protein interaction network, essential protein, protein complex, biological information