Prediction Of Functional Microexons Based On Transfer Learning

Posted on: 2019-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: H K Gao
GTID: 2370330548993133
Subject: Control Science and Engineering
Abstract/Summary:
Alternative splicing adds complexity to the proteome and the transcriptome: it allows a limited length of coding sequence to produce a wide variety of proteins with different structures and functions. Numerous studies have indicated a close link between alternative splicing and disease, and a thorough understanding of how the alternative splicing of exons is regulated is considered key to overcoming many diseases. However, because of limitations in sequencing techniques and sequence analysis software, the vast majority of researchers have focused on longer exons, and shorter exons (herein referred to as microexons) have not received appropriate attention. Only after two influential articles were published in Cell and Genome Research in recent years did researchers realize that microexons are numerous. The authors of those articles point out that microexons show higher sequence conservation and stronger regulation than longer exons, and that they can influence nervous system formation by modulating protein interaction domains. At present, however, no database of functional microexons has been established, so although many microexons are known, their functionality cannot be judged effectively.

Given this situation, this thesis takes the prediction of functional microexons as its research goal. First, features of microexons were selected and analyzed; they were divided into gene-level features and protein-level features. Second, a clustering algorithm was used to select the features best suited to classification. Third, because reliable labels are available for micro-indel data but labels for microexons are hard to obtain, features were extracted from the micro-indel data and used to predict microexon labels through transfer learning: the micro-indel data and the microexon data were mapped into a common low-dimensional space in which the two datasets have almost the same distribution, and a machine learning method was then trained on the micro-indel data and used to predict the microexon data in that space. In this way, a good model for predicting functional microexons was obtained. Analysis of the microexons predicted as functional or neutral by this method shows that those predicted to be functional tend to have secondary structure, are more likely to lie within protein domains, and have higher conservation scores. This is consistent with previous research and indicates that the method is effective. In addition, this thesis collects a number of microexons whose pathogenicity has been reported in the literature and applies the model to them. The predictions agree with the reported results, which further supports the effectiveness of the method.
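The transfer step described above (mapping labeled micro-indel data and unlabeled microexon data into a shared low-dimensional space, then classifying there) can be sketched as follows. The abstract does not name the mapping algorithm or the classifier, so this sketch assumes a Transfer Component Analysis style projection with a linear kernel and a nearest-centroid classifier. The function names, parameters, and the synthetic data are all hypothetical stand-ins, not the thesis's actual implementation or features.

```python
import numpy as np


def tca_embed(Xs, Xt, n_components=2, mu=1.0):
    """Project source and target samples into a shared subspace (TCA-style, linear kernel)."""
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    X = np.vstack([Xs, Xt])
    K = X @ X.T                                   # linear kernel over all samples
    # MMD coefficient matrix L = e e^T: +1/ns for source rows, -1/nt for target rows
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])[:, None]
    L = e @ e.T
    H = np.eye(n) - np.full((n, n), 1.0 / n)      # centering matrix
    A = K @ L @ K + mu * np.eye(n)                # domain discrepancy plus regulariser
    B = K @ H @ K                                 # variance to preserve
    # Solve B w = lambda A w as a symmetric eigenproblem via Cholesky A = C C^T
    C = np.linalg.cholesky(A)
    S = np.linalg.solve(C, np.linalg.solve(C, B).T)
    S = (S + S.T) / 2.0                           # remove round-off asymmetry
    _, V = np.linalg.eigh(S)                      # eigenvalues in ascending order
    W = np.linalg.solve(C.T, V[:, -n_components:])
    Z = K @ W                                     # embedded samples, source rows first
    return Z[:ns], Z[ns:]


def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Assign each test point the label of the closest class centroid."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    d = ((Z_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]


# Synthetic stand-ins: labeled "source" (micro-indel-like) and unlabeled,
# shifted "target" (microexon-like) feature matrices.
rng = np.random.default_rng(0)
y_src = np.repeat([0, 1], 30)
X_src = rng.normal(size=(60, 5)) + 2.0 * y_src[:, None]
X_tgt = rng.normal(size=(40, 5)) + 1.0

Z_src, Z_tgt = tca_embed(X_src, X_tgt, n_components=2)
y_tgt_pred = nearest_centroid_predict(Z_src, y_src, Z_tgt)
```

The projection is fitted on both datasets together but uses no labels. Only the classifier sees the source labels, which mirrors the setting where micro-indel labels are reliable and microexon labels are unavailable.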
Keywords/Search Tags: Alternative splicing, microexon, micro-indel, transfer learning, machine learning