Research On Cancer Gene Survival Analysis Based On Multi-task Learning Model

Posted on:2019-07-06

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Li

Full Text:PDF

GTID:2370330563985407

Subject:Master of Engineering degree

Abstract/Summary:

Survival analysis is a widely used branch of statistics that studies survival phenomena and time-to-event (response time) data and the patterns governing them. A typical study builds a model of the subject under investigation and uses the features of the data to predict survival time and analyze it systematically. Such methods are widely applied in medicine, biopharmaceuticals, commerce, and industry. However, clinical case data usually contain censored observations, and many algorithms cannot be applied to such data. In addition, models such as the Cox proportional hazards model or parametric regression models require strict assumptions about the data. These assumptions can distort the original structure of the problem and are often inappropriate for practical applications.

To address these two limitations, this thesis applies a multi-task learning model to cancer gene survival analysis. Multi-task learning is an inductive transfer learning method: it shares representations among related tasks, makes full use of the information in the features of censored samples, and exploits the domain-specific information implicit in each feature to improve the model's generalization on the original task. This addresses the inability of other survival analysis algorithms to train on censored data. Multi-task learning also requires no additional assumptions about the original problem, and modeling the complete problem substantially improves prediction accuracy. The thesis focuses on how to recast survival prediction as a multi-task learning problem and thereby improve the predictive performance of survival analysis. The work is organized into four parts:

(1) Basic techniques and related algorithms. The thesis analyzes in detail how existing algorithms in the field handle censored data. Through algorithm analysis, modeling, and experiments, it systematically compares the ways different algorithms treat censored data, which provides a theoretical basis for constructing the model proposed in this thesis.

(2) Choice of base model and optimization algorithm. A central goal of the thesis is to use the information in censored data fully in order to improve prediction accuracy. Using a multi-task learning model as the base model makes it possible to share representations among related tasks and to use the implicit domain-specific information to improve generalization. A matrix norm penalty term and the alternating direction method of multipliers (ADMM) are also introduced to address the overfitting problem of the model.

(3) Model construction and improvement. The survival time prediction problem is transformed into a classical binary classification (regression) problem. A new objective function is formulated for the transformed regression problem, and the resulting convex optimization problem is solved with ADMM. Finally, the convergence and time complexity of the model are analyzed and summarized.

(4) Analysis of experimental results. The datasets used are several mainstream high-dimensional cancer gene expression survival datasets. Experiments compare the proposed algorithm with several common survival analysis algorithms, using the C-index and AUC as evaluation metrics. The scalability of the model is also verified.
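
The abstract does not spell out how survival times become binary tasks, so the following is only a minimal sketch of one common encoding, not the thesis's implementation. It assumes survival times are discretized into integer time points 1..k, with one binary task per time point ("is the subject still alive?") and a mask that excludes time points after a censoring time from the loss. The function names `encode_multitask_targets` and `concordance_index` are hypothetical. The second function computes Harrell's C-index, one of the evaluation metrics named above.

```python
import numpy as np


def encode_multitask_targets(times, events, n_intervals):
    """Encode (time, event) pairs as per-time-point binary labels plus a mask.

    Assumed convention (not taken from the thesis):
      Y[i, j] = 1 if subject i is known to be alive at time point j + 1.
      mask[i, j] = 0 where the status is unknown (after a censoring time).
    """
    times = np.asarray(times, dtype=int)
    events = np.asarray(events, dtype=bool)
    t = np.arange(1, n_intervals + 1)  # time points 1..k
    # An event at time T means alive through T - 1; censoring at T means alive through T.
    last_alive = np.where(events, times - 1, times)
    Y = (t[None, :] <= last_alive[:, None]).astype(float)
    # Uncensored rows are fully observed; censored rows only up to the censoring time.
    mask = events[:, None] | (t[None, :] <= times[:, None])
    return Y, mask.astype(float)


def concordance_index(times, events, risk):
    """Harrell's C-index: fraction of comparable pairs ranked correctly by risk."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable if comparable else float("nan")


# Toy data: subject 0 has an event at t=3, subject 1 is censored at t=2.
Y, mask = encode_multitask_targets([3, 2], [1, 0], n_intervals=4)
```

With this encoding, a per-task loss multiplied elementwise by `mask` lets censored subjects contribute to every task whose label is known, which is the sense in which a multi-task formulation can train on censored data.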

Keywords/Search Tags:

Cancer gene, Multi-task learning, ADMM, Survival analysis

Related items

1	Research On Multi-task Learning Models For Survival Analysis With High-Dimensional Censored Data
2	Applicable To Distributed Multi-tasking Measurement System Of Navigation Observation Platform
3	Study Of The Cancer Driver Gene Identification Based On The Genetic Network Modelling
4	Co-expression Network Analysis Of Egfr In Pancreatic Cancer
5	Analysis Of SMOX Gene Expression In Liver Cancer Based On TCGA And GEPIA Database And Its Clinical Significance
6	Distributed Matrix Factorization Via ADMM
7	Research On Survival Analysis Of Pathological Images Based On Deep Learning Method
8	Bayesian Survival Analysis For Breast Cancer
9	Identification Of Potential Hub Genes Associated With The Pathogenesis And Prognosis Of Pancreatic Duct Adenocarcinoma Using Bioinformatics Meta-analysis Of Multi-platform Datasets
10	New methods for variable selection with applications to survival analysis and statistical redundancy analysis using gene expression data