
Subcellular Localization Prediction For RNAs And Proteins Based On Machine Learning And Deep Learning

Posted on: 2020-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z Cao
GTID: 2370330623463584
Subject: Control Engineering
Abstract/Summary:
Subcellular localization prediction for biomacromolecules is an active research topic in bioinformatics, and progress on proteins and RNAs has drawn wide attention. Many recent studies have shown that the subcellular localization of RNAs and proteins is important for understanding their functions, for studying RNA-RNA, protein-protein, and RNA-protein interactions, and for detecting drug-binding motifs. However, identifying the cellular compartment of a protein or RNA with wet-lab experiments is often laborious and costly, so predicting the subcellular localization of biological macromolecules with computational methods is urgent and meaningful work.

This thesis is mainly concerned with predicting the subcellular location of long non-coding RNAs (lncRNAs) and plant proteins. In the first work, we proposed, for the first time, a subcellular location prediction algorithm for lncRNA sequences. In the second work, we proposed the Plant-mPLoc3.0 prediction algorithm, based on gene ontology features and conserved domain features, which improved on the performance of previous subcellular location predictors for plant proteins.

The main contributions of this thesis are summarized as follows:
(1) Constructed a subcellular localization data set for long non-coding RNAs.
(2) Mined high-level abstract features from the original sequence features using a stacked autoencoder.
(3) Mitigated the imbalanced class distribution in the data set using supervised upsampling algorithms.
(4) Modeled the integration of different classifiers' outputs using a fully connected deep neural network (DNN).
(5) Built an online website for subcellular localization prediction of long non-coding RNAs.
(6) Explored the application of word vector encoding to RNA subcellular localization prediction.
(7) Constructed a subcellular localization data set for plant proteins.
(8) Created a whole-species gene ontology feature similarity matrix and a plant-protein-specific conserved domain feature similarity matrix to predict the subcellular location of plant proteins.
(9) Mined the correlations between class labels and optimized the prediction results using a dynamic threshold criterion.
(10) Established an online website and local software for subcellular location prediction of plant proteins.
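
The abstract does not spell out the dynamic threshold criterion mentioned in contribution (9). As a rough illustration only, one common way to turn per-compartment scores into a multi-label prediction is to keep every label whose score is within a fixed fraction of the best score. The function name, the `ratio` parameter, and the example scores below are all hypothetical and are not taken from the thesis; the sketch also assumes non-negative scores.

```python
from typing import Dict, List


def dynamic_threshold_labels(scores: Dict[str, float], ratio: float = 0.8) -> List[str]:
    """Return every label whose score is at least `ratio` times the top score.

    `ratio` is a hypothetical tuning parameter, not a value from the thesis.
    Scores are assumed to be non-negative (e.g. probabilities).
    """
    if not scores:
        return []
    top = max(scores.values())
    cutoff = ratio * top  # the threshold moves with the best score
    # Rank labels by descending score so the strongest prediction comes first.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, score in ranked if score >= cutoff]


# Made-up scores for a single protein, for illustration only.
example_scores = {"nucleus": 0.62, "cytoplasm": 0.55, "chloroplast": 0.10}
print(dynamic_threshold_labels(example_scores, ratio=0.8))
```

Because the cutoff scales with the top score, a protein with two nearly tied compartments receives both labels, while a protein with one dominant compartment receives only that one.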
Keywords/Search Tags: subcellular localization prediction, machine learning, stacked autoencoder, deep learning, ensemble learning