
Research On Multi-task Learning Models For Survival Analysis With High-Dimensional Censored Data

Posted on: 2021-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: K Shao
Full Text: PDF
GTID: 2370330614465968
Subject: Software engineering
Abstract/Summary:
Survival analysis aims to predict the time until an event of interest occurs. It has been widely applied to predicting patient survival status in clinical treatment and the running time of mechanical systems in fault diagnosis. Recently, with the rapid progress of information collection and big data technologies, high-dimensional data has appeared frequently in many types of survival analysis problems, and prediction accuracy has become a challenge that cannot be ignored. Researchers have proposed and improved many prediction models. However, some of these models rely on overly strict assumptions or do not fully account for the prior information inherent in high-dimensional, small-sample data, so they cannot achieve satisfactory results in practical applications. In addition, the collected data contain various kinds of unknown noise, which also degrades prediction accuracy. To overcome these shortcomings, this thesis introduces matrix completion theory, models the original survival analysis problem as multi-task transductive matrix completion, and proposes two survival analysis prediction models. The main contributions of this thesis are as follows:

1. To address the incomplete labels inherent in high-dimensional censored data and the overfitting caused by high dimensionality and small sample size, a prior-information-guided transductive matrix completion model for survival analysis is proposed. The model not only leverages censored instances as an effective supplement to the limited number of uncensored instances, but also explores the feature distributions of both the training and the testing samples during the training stage to improve prediction performance. Furthermore, a multi-task transductive feature selection scheme is introduced into the MTMC model to alleviate the overfitting caused by high dimensionality and small sample size. Experimental results on several real datasets show the effectiveness of the proposed model in terms of the C-index and weighted average AUC metrics.

2. To address the noise sensitivity caused by complex environments, a noise-tolerant weakly supervised transductive matrix completion model for survival analysis is proposed. A Mixture of Gaussians distribution is introduced to fit the unknown complex noise encountered in practical applications, which alleviates the noise sensitivity. Furthermore, an efficient Expectation-Maximization-like optimization algorithm is designed to solve the proposed WSTMC model, and a Bayesian optimization method is used to adaptively select the model parameters. Finally, experimental results on several real microarray gene expression datasets verify that the proposed WSTMC model can outperform widely used competing methods.
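
The C-index named above as an evaluation metric can be illustrated with a minimal sketch of Harrell's concordance index for right-censored data. This is not the thesis's implementation. The function name `concordance_index`, its arguments, and the convention that a higher risk score means an earlier expected event are assumptions made for illustration only.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed time for each sample (event time or censoring time)
    events      -- True if the event was observed, False if the sample is censored
    risk_scores -- predicted risk; assumed here: higher risk = earlier event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored sample cannot be the earlier member of a comparable pair,
            # because its true event time is unknown.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # correctly ordered pair
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; the C-index is undefined")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking. Censored samples still contribute as the later member of a pair, which is how the metric uses partially observed instances.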
Keywords/Search Tags: Survival Analysis, Matrix Completion, Noise-tolerant, Multi-task Learning, Transductive Feature Selection