Prediction Research Of Protein Interaction Based On Weighted Feature Fusion

Posted on: 2017-10-06  Degree: Master  Type: Thesis
Country: China  Candidate: B W Zhang  Full Text: PDF
GTID: 2310330485956925  Subject: Computer application technology
Abstract/Summary:
Proteins are involved in many biological processes, such as metabolic cycles, DNA transcription and replication, and signaling cascades. A protein usually performs its function within a large network of protein-protein interactions (PPIs). PPI research is of practical importance to clinical research: it improves our understanding of the mechanisms of human disease and provides a basis for new therapeutic approaches. Over the past decades, large-scale, high-throughput experimental techniques for PPI detection have been developed. However, these methods are costly and time-consuming, and they usually suffer from high rates of both false positives and false negatives. It is therefore of great practical value to develop reliable computational methods for PPI prediction.

The first step in computational PPI prediction is feature extraction. Protein features broadly include amino acid sequence information, structural information, evolutionary information, domain information, subcellular localization information and so on. Sequence information is the easiest to obtain; it determines the protein's structure, which in turn determines the protein's function. PPI prediction based on sequence information is now relatively mature. However, a single type of feature cannot fully characterize a protein, and this may limit the accuracy of PPI prediction. In this work, we aim to improve the accuracy of protein interaction prediction by combining different protein features that together characterize each protein more fully.

The other two important steps in computational PPI prediction are feature selection and classification. Commonly used feature selection methods include principal component analysis (PCA), Laplacian eigenmaps (LE), linear discriminant analysis (LDA), and maximum margin criterion (MMC). Widely used classification algorithms include random forests (RF), k-nearest neighbors (KNN), and support vector machines (SVM). With so many algorithms available, the question is how to combine a feature selection algorithm with a classification algorithm to obtain better classification accuracy.

In this work, we propose a weighted feature fusion method based on amino acid features and evolutionary information. We use MMC to reduce the dimensionality of the high-dimensional features while maximizing between-class separation relative to within-class scatter. Finally, an SVM is employed for classification. To verify the effectiveness of the method, we carry out a series of comparison experiments: PCA is compared with MMC, and RF and KNN are each compared with SVM. The results show that the fused feature can be more effective than a single protein feature; that classification with MMC is more reliable and accurate than with PCA; and that, among the classifiers, SVM has a certain advantage in protein interaction prediction.
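
The fusion and MMC steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not give the weighting scheme, so the sketch z-scores each feature block and scales the two blocks by `w` and `1 - w` before concatenating them. The MMC step uses the standard formulation, in which the projection is formed from the leading eigenvectors of S_b - S_w (between-class scatter minus within-class scatter). The function names, the weight value, and the synthetic data are all hypothetical.

```python
import numpy as np


def weighted_fusion(seq_feats, evo_feats, w=0.5):
    """Z-score two feature blocks, scale them by w and 1 - w, and concatenate.

    Assumption: the abstract does not specify the weighting scheme.
    """
    def zscore(x):
        std = x.std(axis=0)
        std[std == 0] = 1.0  # avoid division by zero for constant columns
        return (x - x.mean(axis=0)) / std

    return np.hstack([w * zscore(seq_feats), (1.0 - w) * zscore(evo_feats)])


def mmc_projection(X, y, n_components):
    """Return an MMC projection matrix: top eigenvectors of S_b - S_w."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for label in np.unique(y):
        X_c = X[y == label]
        prior = X_c.shape[0] / X.shape[0]
        diff = (X_c.mean(axis=0) - mean_all)[:, None]
        S_b += prior * (diff @ diff.T)                        # between-class scatter
        S_w += prior * np.cov(X_c, rowvar=False, bias=True)   # within-class scatter
    # S_b - S_w is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(S_b - S_w)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]


# Synthetic stand-ins for the two feature types (hypothetical data).
rng = np.random.default_rng(0)
n = 100
y = np.repeat([0, 1], n // 2)                       # 1 = interacting pair, 0 = non-interacting
seq = rng.normal(size=(n, 20)) + y[:, None] * 1.0   # "amino acid" features
evo = rng.normal(size=(n, 10)) + y[:, None] * 0.5   # "evolutionary" features

fused = weighted_fusion(seq, evo, w=0.6)
W = mmc_projection(fused, y, n_components=2)
Z = fused @ W                                       # reduced features for the classifier
print(fused.shape, W.shape, Z.shape)
```

The reduced features `Z` would then be passed to a classifier such as an SVM (for example scikit-learn's `SVC`), which is omitted here to keep the sketch dependent on NumPy only. In practice the weight `w` and the number of components would be chosen by cross-validation.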
Keywords/Search Tags:Protein-Protein Interaction, The Amino Acids Feature, The Evolutionary Information Feature, Feature Weighting Fusion, MMC, SVM