
Research On Representation Learning Method For Transcriptome Data

Posted on: 2022-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
GTID: 2480306731977649
Subject: Computer technology
Abstract/Summary:
The rapid accumulation of transcriptome data provides new opportunities for predicting biological and pathological behavior. However, because these datasets are so large, analyzing complete transcriptome data is difficult and inefficient, and training a classification model directly on it is time-consuming and costly. A preprocessing step is therefore needed to reduce the dimension of the dataset while preserving as much information as possible. Extracting meaningful or critical features from complex transcriptome data reduces its dimension and complexity, and the extracted features can be used to predict biological and pathological behavior, such as the primary site of a tumor. The reduced-dimensional representation can improve model performance and can be used for biomarker discovery, sample classification, and disease process interpretation, paving the way for precision medicine. To address these problems, this thesis makes the following three contributions:

1. Lincs-Extract, a data governance pipeline for LINCS transcriptome data. Because the experimental categories in the LINCS transcriptome data are severely imbalanced, this thesis proposes Lincs-Extract, which selects the data that meets the needs of this project through statistical analysis of the LINCS datasets.

2. TDRL, a deep representation learning model based on an autoencoder. Because the LINCS transcriptome dataset is high-dimensional, highly complex, and noisy, a transcriptome data representation model, TDRL, is proposed. TDRL is built on a new multi-channel autoencoder (MCAE) combined with a new loss function, decol-loss, both proposed in this thesis. During training, the learning rate follows a warmup and exponential decay strategy.

3. DTPS, a prediction model for tumor primary sites based on transcriptome data. Because current tumor primary site prediction methods perform relatively poorly and have limited sample availability, a deep learning model, DTPS, is proposed for predicting the tumor primary site from transcriptomics data. By combining DTPS with TDRL, transcriptome data can be used directly for deep learning-based prediction of the tumor primary site with only fine-tuning.
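
The abstract states that TDRL is trained with a learning rate that is warmed up and then decayed exponentially, but it gives no formula or hyperparameters. As a rough illustration only, such a schedule might look like the sketch below. The function name `warmup_exp_decay_lr`, the linear shape of the warmup, and every default value (`base_lr`, `warmup_steps`, `decay_rate`, `decay_steps`) are assumptions made for this example and are not taken from the thesis.

```python
def warmup_exp_decay_lr(step, base_lr=1e-3, warmup_steps=1000,
                        decay_rate=0.96, decay_steps=1000):
    """Return the learning rate for a given training step.

    Hypothetical schedule: linear warmup to base_lr over warmup_steps,
    then exponential decay by decay_rate every decay_steps.
    All defaults are illustrative, not values from the thesis.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Decay phase: multiply by decay_rate once per decay_steps (continuous exponent).
    return base_lr * decay_rate ** ((step - warmup_steps) / decay_steps)


if __name__ == "__main__":
    # Print the learning rate at a few steps to show the shape of the schedule.
    for s in (0, 500, 999, 1000, 2000, 10000):
        print(s, warmup_exp_decay_lr(s))
```

In a schedule of this kind, the warmup keeps early updates small while the weights are still far from a reasonable solution, and the decay reduces the step size as training converges.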
Keywords/Search Tags: Representation learning, Prediction of tumor primary site, Gene expression profile, Deep learning