
Protein Function Prediction By Functional Interrelationship And Weighted Co-expression Network Analysis

Posted on: 2018-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: Jael Sanyanda Wekesa
GTID: 2310330536460941
Subject: Computer application technology
Abstract/Summary:
The growing disparity between known and experimentally annotated proteins creates a need for progress in the functional characterization of the large compendium of newly sequenced proteins. The data sources used to predict protein function include protein-protein interactions, genomic context, protein structure, information mined from the biological literature, gene expression, and transcription factor binding. Functional information is derived from sequence similarity, structural similarity, and interactions between proteins (or genes). Protein-protein interaction (PPI) data carry more information for protein function prediction than can be obtained from sequence or structure similarity alone. A PPI network is built from interactions between nodes (proteins), and because these interactions are weighted, more information can be extracted from them. Interacting proteins take part in the same biological process, and interconnected proteins operate in the same pathway. Integrating PPI networks with other functional association networks from heterogeneous data sources is therefore vital for accurate prediction of protein function. This thesis explores gene expression and protein-protein interaction data in relation to functional similarity. Combining similarities from different sources has been reported to improve the accuracy of functional annotations. Moreover, integrating genomic and proteomic data widens prediction coverage and improves prediction accuracy. The main purpose of the proposed method is to enhance the performance of classifier integration for protein function prediction using transductive learning on a bi-relational graph. Most existing methods incur high computational cost because they integrate heterogeneous data by combining the individual networks into a composite network. In our method, the score vectors of the independent data sources are instead integrated into a comprehensive score, which is used to obtain the final annotations. We also incorporate semantic similarity into the functional interrelationship to boost the accuracy of our method. Enrichment analysis is then performed to confirm the functional significance of the predicted annotations and to validate them. The frameworks implemented in this dissertation are scalable and flexible: they help in understanding context-focused subnetworks, facilitate the analysis of interaction networks, and predict novel protein functions. Experimental results on multi-source benchmark datasets for yeast, human, and mouse demonstrate the effectiveness and efficiency of our methods in comparison with existing methods.
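The score-level integration described in the abstract can be illustrated with a minimal sketch. It assumes that each data source yields one score per candidate function for a given protein, that the scores are combined by a weighted average, and that a fixed threshold selects the final annotations. The abstract does not specify the combination rule, the weights, or the threshold, so all of these, along with the function names `integrate_scores` and `annotate` and the placeholder labels, are hypothetical and are not the thesis's actual procedure.

```python
from typing import Dict, List


def integrate_scores(source_scores: Dict[str, List[float]],
                     weights: Dict[str, float]) -> List[float]:
    """Combine per-source score vectors into one vector by weighted average.

    Assumption: every source scores the same candidate functions in the
    same order. The weighted average is an illustrative choice only.
    """
    names = list(source_scores)
    length = len(source_scores[names[0]])
    total_weight = sum(weights[name] for name in names)
    combined = [0.0] * length
    for name in names:
        vector = source_scores[name]
        if len(vector) != length:
            raise ValueError(f"source {name!r} has a different number of scores")
        for i, score in enumerate(vector):
            combined[i] += weights[name] * score / total_weight
    return combined


def annotate(combined: List[float], labels: List[str],
             threshold: float = 0.5) -> List[str]:
    """Keep every label whose combined score reaches the (assumed) threshold."""
    return [label for label, score in zip(labels, combined) if score >= threshold]


# Hypothetical scores for one protein over three placeholder function labels.
labels = ["function_A", "function_B", "function_C"]
scores = {
    "ppi": [0.9, 0.2, 0.6],         # scores from a PPI-based predictor
    "expression": [0.7, 0.4, 0.2],  # scores from a co-expression-based predictor
}
weights = {"ppi": 0.5, "expression": 0.5}

combined = integrate_scores(scores, weights)
predicted = annotate(combined, labels)
```

Because only score vectors are combined, no composite network has to be built, which is the cost the abstract attributes to existing methods.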
Keywords/Search Tags: Protein function prediction, Transductive learning, Gene expression profile, Multi-label classification