Research On Algorithm Of Identifying Essential Proteins Based On Multi-biometrics

Posted on: 2015-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: J Wu
Full Text: PDF
GTID: 2370330488999633
Subject: Information and Communication Engineering
Abstract/Summary:
Essential proteins are indispensable for life activities, and studying them helps explain life processes such as metabolism and differentiation. They also provide valuable system-level information for biological and medical research. In the post-genome era, protein-protein interaction data have grown exponentially, which opens a window for identifying essential proteins from protein interaction networks. However, these networks are still incomplete and contain false positive interactions, and the accuracy of algorithms based on topological centrality is strongly affected by such noisy and missing data. Improving the prediction precision of essential proteins therefore remains a challenge. To reduce the negative effects of missing and false positive data, this thesis uses multi-dimensional biological features to identify essential proteins, relying mainly on the functional importance of proteins. This reduces the dependence on the network to some extent and improves prediction precision.

Essential proteins are reported to be densely distributed in some protein complexes. Protein complexes with a high co-expression level often evolve slowly, which is consistent with the behavior of essential proteins, so essential proteins are closely related to highly co-expressed protein complexes. The CED algorithm proposed in this thesis is therefore based on the co-expression level of protein complexes, gene expression level, and the edge clustering coefficient, combining biological features with a topological property. CED and several existing algorithms are applied to yeast protein networks. Experimental results show that CED identifies more essential proteins than the other algorithms on both the DIP network and the BioGRID network.

To improve prediction precision further, biological functional features that represent proteins more directly and efficiently need to be explored. We observed that proteins in the same motif tend to show similar functions and evolve at a similar rate, and motifs are reported to be evolutionarily conserved. A new algorithm for predicting essential proteins based on motifs, gene expression, and protein complexes, named MGC, is therefore proposed. MGC and six previously proposed algorithms are applied to yeast protein interaction networks downloaded from the DIP and MIPS databases. The jackknife methodology, F-measure, and other measures are used to validate the effectiveness of MGC. Experimental results show that MGC performs better than the other six algorithms.
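The abstract does not give formulas for CED or MGC, so the following is only a minimal sketch of two generic building blocks it mentions, not the thesis's method. It assumes the commonly used definition of the edge clustering coefficient (triangles through an edge divided by the smaller of the two endpoint degrees minus one) and a plain F-measure over the top-k ranked proteins. All function names and the toy network are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: triangles through (u, v) / min(deg(u) - 1, deg(v) - 1)."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    denom = min(len(nu) - 1, len(nv) - 1)
    if denom <= 0:
        # An endpoint with degree 1 cannot take part in any triangle.
        return 0.0
    return len(nu & nv) / denom


def top_k_scores(ranked, essential, k):
    """Precision, recall and F-measure of the first k proteins in a ranking."""
    top = ranked[:k]
    tp = sum(1 for p in top if p in essential)
    precision = tp / len(top) if top else 0.0
    recall = tp / len(essential) if essential else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy network (hypothetical): a triangle A-B-C with a pendant node D.
adj = build_adjacency([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
ecc_ab = edge_clustering_coefficient(adj, "A", "B")
ecc_cd = edge_clustering_coefficient(adj, "C", "D")
scores = top_k_scores(["A", "B", "D"], {"A", "C"}, 2)
```

In this toy network the edge inside the triangle scores higher than the pendant edge, which is the intuition behind using the coefficient to down-weight weakly supported interactions. How the thesis combines it with co-expression and gene expression data is not stated in the abstract and is not modeled here.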
Keywords/Search Tags:Essential protein, Protein-protein network, Topological characteristics, Protein complex, Motif, Gene expression