Research On Essential Protein Identification Algorithm Based On The Weighted Protein Interaction Networks

Posted on: 2014-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: S M Liu
Full Text: PDF
GTID: 2268330425983898
Subject: Computer Science and Technology
Abstract/Summary:
With the growth of proteomics research and the rapid development of high-throughput experimental techniques, the study of essential proteins has entered a new stage. Essential proteins are those that are necessary for the viability and reproduction of an organism, and they play pivotal roles in biological processes. Identifying essential proteins not only helps in understanding cell metabolism, growth, differentiation, apoptosis and other biological activities, but also has significant value for studying disease mechanisms, finding drug targets and developing new drugs. With the constant growth of protein interaction network data, identifying essential proteins based on network topology has gained extensive attention. However, for reasons such as incomplete network data and a high proportion of false positives, existing algorithms for identifying essential proteins suffer from low accuracy.

Building on the topology of the protein interaction network, this thesis also takes the biological functions and properties of the network nodes into account. Gene ontology and gene expression data are used to build weighted protein interaction networks, and essential proteins are identified from the local topology of these higher-quality weighted networks. The work covers the following aspects:

To address the high false-positive rate, gene ontology information is used to measure the functional similarity of interacting proteins, and a corresponding weight is attached to each edge to build a weighted network. The different effects of a protein node's immediate and indirect neighbors are considered by extending the local area to second-order neighbors. On this basis, a new algorithm, GO_ELAC, is proposed to identify essential proteins from the characteristics of both nodes and edges. The experimental results show that, compared with five other methods, the algorithm identifies more essential proteins and significantly improves accuracy.

Protein essentiality is in fact a functional attribute, but most algorithms based on network topology do not examine biological significance and function any further. Therefore, gene ontology, gene expression and protein interaction network datasets are combined to identify essential proteins. First, the Pearson correlation coefficient of the gene expression profiles of each pair of interacting proteins is used to filter the edges of the protein interaction network. Then, gene ontology data are used to measure the functional similarity of the interacting proteins. Combined with the expression correlation of each interaction, this yields a dual-weight network. Taking the characteristics of the nodes and edges of the dual-weight network into account, an improved algorithm, PeGO, is proposed for identifying essential proteins. Experiments on yeast datasets show that PeGO achieves higher accuracy in identifying essential proteins than six other methods. This further confirms the feasibility and validity of identifying essential proteins on weighted networks that incorporate biological information about proteins and are constructed to be highly credible.
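The dual-weight construction described above can be illustrated with a minimal sketch. The abstract gives no formulas for PeGO, so several choices here are assumptions made only for illustration: GO functional similarity is approximated by the Jaccard overlap of GO term sets, the two edge weights are combined by multiplication, proteins are ranked by the sum of their incident edge weights, and edges are dropped when their Pearson correlation is not above a hypothetical `pcc_threshold`. All function names, protein names and data values are invented toy examples.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def go_similarity(terms_a, terms_b):
    """Jaccard overlap of two GO term sets (assumed stand-in measure)."""
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


def build_weighted_network(edges, expression, go_terms, pcc_threshold=0.0):
    """Filter edges by expression correlation, then attach both weights."""
    weighted = {}
    for u, v in edges:
        pcc = pearson(expression[u], expression[v])
        if pcc <= pcc_threshold:
            continue  # drop interactions with weak or negative co-expression
        sim = go_similarity(go_terms[u], go_terms[v])
        weighted[(u, v)] = (pcc, sim)
    return weighted


def rank_proteins(weighted):
    """Rank proteins by summed edge weight (assumed score, not PeGO itself)."""
    scores = {}
    for (u, v), (pcc, sim) in weighted.items():
        w = pcc * sim  # assumed way of combining the two weights
        scores[u] = scores.get(u, 0.0) + w
        scores[v] = scores.get(v, 0.0) + w
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: four hypothetical proteins with invented values.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [1, 3, 2, 5],
    "D": [4, 3, 2, 1],
}
go_terms = {
    "A": {"t1", "t2"},
    "B": {"t1", "t2"},
    "C": {"t2", "t3"},
    "D": {"t4"},
}

weighted = build_weighted_network(edges, expression, go_terms)
ranking = rank_proteins(weighted)
print(ranking)
```

In this toy network the C-D interaction is removed by the correlation filter because the two expression profiles are negatively correlated, so D receives no score at all. A real implementation would need the similarity measure, threshold and scoring function defined in the thesis itself.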
Keywords/Search Tags:Essential protein, Protein interaction networks, Weighted network, Network topology, Gene ontology, Gene expression profile