Research On RNA-binding Protein Recognition Method Based On Multi-view And Multi-label

Posted on: 2022-10-03
Degree: Master
Type: Thesis
Country: China
Candidate: H T Yang
GTID: 2504306527482984
Subject: Software engineering
Abstract/Summary:
RNA-binding protein (RBP) is a general term for a class of proteins that bind to RNA and accompany it in regulating metabolic processes. One RBP may have multiple target RNAs, and defects in its expression may cause multiple diseases. Searching for RBPs with similar functions and structures can support RNA therapy for cancer and other diseases. In RBP recognition, a key step is to obtain effective features of the RNA and to use the combined similarity network between RBPs to learn the connections between them. This thesis proposes two new multi-view, multi-label feature learning strategies for RBP recognition. Compared with existing methods for RNA sequence feature extraction and RBP recognition, the proposed algorithms are substantially more effective. The two main pieces of work on RBP identification from RNA sequences are as follows.

1) The first work proposes a multi-view, multi-label RBP recognition algorithm based on deep feature learning. The algorithm first converts RNA sequences into amino acid sequences and dipeptide components, and uses one-hot coding and statistical techniques to construct the initial multi-view features. A convolutional neural network then extracts multi-view deep features that are low-dimensional and highly discriminative. A CC multi-label classifier based on a voting mechanism is then used to build the classification model on these multi-view deep features. Experiments show that the classification accuracy of the three views' deep features extracted by the convolutional neural network is more than 17% higher than that of features from traditional SVM, BP neural network and decision tree methods. The multi-view deep feature recognition algorithm scores more than 5% higher than the average of the single-view deep features. After the multi-label learning algorithm is added, the average accuracy improves by more than 3%.

2) The second work proposes an optimal CC multi-label learning algorithm that combines multi-view and multi-label learning. Building on the first work, an RNA semantic view is added, and the dipeptide view is extended to a multi-gap dipeptide view. At the same time, the voting mechanism applied after multi-label learning is dropped, and a multi-label feature learning algorithm is used to integrate multi-view feature learning with multi-label learning. This retains the discriminative power of the multi-view deep features and also improves the CC classifier's ability to learn each label. The experimental results show that the classification accuracy of the deep features of the two new views is on average 23% higher than that of the traditional feature extraction method. Compared with ensemble learning based on the voting mechanism, the multi-label feature learning technique improves prediction performance by more than 5%.
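
The initial views named in the abstract (one-hot coding, dipeptide composition, and the multi-gap dipeptide view) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes the input has already been converted to an amino acid string over the 20 standard residues, and the function names, the `gap` and `max_gap` parameters, and the frequency normalization are hypothetical choices made for illustration.

```python
from itertools import product

# Assumption: sequences use the 20 standard amino acid letters.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq, alphabet=AMINO_ACIDS):
    """Return one row per residue, with a single 1 at the residue's index.

    Residues outside the alphabet map to an all-zero row.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in seq:
        row = [0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1
        rows.append(row)
    return rows


def gap_dipeptide_composition(seq, gap=0, alphabet=AMINO_ACIDS):
    """Return the 400-dimensional frequency vector of residue pairs.

    A pair is (seq[i], seq[i + gap + 1]); gap=0 gives the ordinary
    dipeptide composition of adjacent residues.
    """
    pairs = ["".join(p) for p in product(alphabet, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    # Normalize counts to frequencies; an empty result stays all zeros.
    return [counts[p] / total if total else 0.0 for p in pairs]


def multi_gap_view(seq, max_gap=2):
    """Concatenate gap-0 .. gap-max_gap compositions into one feature vector."""
    view = []
    for g in range(max_gap + 1):
        view.extend(gap_dipeptide_composition(seq, gap=g))
    return view
```

Calling `one_hot(seq)` yields a length-by-20 matrix suitable as input to a convolutional network, while `multi_gap_view(seq, max_gap=2)` yields a fixed-length vector of 1200 values regardless of sequence length. How the thesis actually sets the gap range and normalization is not stated in the abstract.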
Keywords: RNA sequence, RNA-binding protein recognition, multi-view, multi-label, deep learning