Prediction Of Protein-protein Interactions Based On Multi-information Fusion

Posted on: 2020-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: X Wu
Full Text: PDF
GTID: 2370330590952906
Subject: Statistics
Abstract/Summary:
In the era of big data, rapid advances in sequencing technology have caused biological experimental data to grow exponentially, and massive biological datasets continue to appear. Selecting and applying efficient machine learning methods to predict protein-protein interactions (PPIs) is a challenging task in proteomics research. Predicting protein-protein interactions helps reveal the underlying nature and rules of life activities, and it also supports the understanding of disease mechanisms and the development of effective drugs. Focusing on protein-protein interaction prediction based on multi-information fusion, the main work of this thesis is as follows:

1. We propose a new method, PPIs-stacking, for protein-protein interaction prediction. First, PseAAC, ACF, ACC-PSSM, DPC-PSSM and CT are employed for feature extraction, and the features of the two proteins in each pair are concatenated to obtain the initial feature vectors on the H. pylori and S. cerevisiae datasets. Second, the Lasso method is used to reduce the dimension of the feature vectors. Finally, the optimal feature vectors are input into a stacking ensemble classifier, which is evaluated by 5-fold cross-validation. The model is further verified on four independent test sets, Celeg, Ecoli, Hsapi and Mmusc, on which it achieves high accuracy. The experimental results show that the proposed PPIs-stacking method predicts protein-protein interactions well.

2. We propose a new method, PPIs-WDSVM, for protein-protein interaction prediction that integrates various feature information of protein sequences. First, PseAAC, AC and EBGW are used to extract features from the protein sequences, and the three sets of extracted feature vectors are fused. Second, two-dimensional wavelet denoising is applied to the fused feature vectors. Finally, the denoised feature vectors are input into an SVM classifier for prediction. Five-fold cross-validation gives prediction accuracies of 95.97% and 95.55% on the H. pylori and S. cerevisiae datasets, respectively. Compared with other methods, PPIs-WDSVM effectively improves the predictive performance for protein interactions.
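The abstract gives no implementation details, so the following is only a minimal sketch of two steps it names: conjoint triad (CT) feature extraction with concatenation of the two proteins in a pair, and a 5-fold split of the samples. It assumes the commonly used seven-class amino-acid grouping for CT and plain relative-frequency normalisation. The function names (`conjoint_triad`, `pair_features`, `k_fold_indices`) and the example sequences are hypothetical and are not taken from the thesis.

```python
from itertools import product

# Seven amino-acid classes commonly used for CT encoding (assumed here;
# the abstract does not state which grouping the thesis uses).
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CT_CLASSES) for aa in group}

# All 7^3 = 343 ordered triads of classes, each mapped to a vector position.
TRIADS = list(product(range(len(CT_CLASSES)), repeat=3))
TRIAD_INDEX = {triad: i for i, triad in enumerate(TRIADS)}


def conjoint_triad(sequence):
    """Return a 343-dimensional vector of class-triad frequencies."""
    # Residues outside the 20 standard amino acids are dropped (a simplification).
    classes = [AA_TO_CLASS[aa] for aa in sequence.upper() if aa in AA_TO_CLASS]
    counts = [0] * len(TRIADS)
    for i in range(len(classes) - 2):
        counts[TRIAD_INDEX[tuple(classes[i:i + 3])]] += 1
    total = sum(counts)
    # Relative frequencies; other normalisations exist (assumption).
    return [c / total for c in counts] if total else [0.0] * len(counts)


def pair_features(seq_a, seq_b):
    """Concatenate the CT vectors of the two proteins in a candidate pair."""
    return conjoint_triad(seq_a) + conjoint_triad(seq_b)


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for a simple interleaved k-fold split."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    x = pair_features("MKRDECILYWAGV", "AGVHNQWRKDE")
    print(len(x))  # 686
    for train, test in k_fold_indices(10, k=5):
        print(test)
```

In a full pipeline, vectors like these would be combined with the other descriptors, reduced in dimension, and passed to the classifier inside each fold. Those steps are omitted here because they depend on choices the abstract does not specify.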
Keywords/Search Tags: pseudo-amino acid composition, two-dimensional wavelet denoising, lasso dimension reduction, support vector machine, stacking ensemble classifier, protein-protein interactions