Research On Prognosis Prediction Of Cancer Patients Based On Multiple Omics Data

Posted on: 2018-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: Q Chang
GTID: 2334330515498067
Subject: Electronic and communication engineering
Abstract/Summary:
The incidence of cancer continues to rise. An accurate prognosis prediction helps patients understand their expected survival, and it also helps researchers understand how the disease develops and guides clinical therapy. In this thesis, data mining techniques are applied to the molecular and clinical data of patients with glioblastoma multiforme (GBM) from The Cancer Genome Atlas (TCGA) project. A joint prediction model that combines a feature selection algorithm with a classification algorithm is established. The model predicts whether the survival time of a GBM patient exceeds 12 months, with higher accuracy than existing research results. This helps stratify patients into risk groups for more accurate diagnosis and treatment. The main contributions of this thesis are as follows:
(1) Data collection and preprocessing. The cancer studied in this thesis is GBM, and the molecular and clinical data were obtained from the TCGA database. Preprocessing consists of defining and filling in missing values, removing features or samples with too many missing values, and standardization.
(2) Feature selection. Based on the molecular and clinical data of GBM patients, features are selected using penalized logistic regression. Theoretical analysis and comparison with three other commonly used feature selection algorithms (tree-based feature selection, analysis of variance, and recursive feature elimination based on logistic regression) show that this method yields higher AUC scores for predicting the survival time of GBM patients, with a shorter running time.
(3) Classification. Based on the molecular and clinical data of GBM patients, classification is performed using a support vector machine. Theoretical analysis and comparison with nine other commonly used machine learning classification algorithms show that this method yields higher AUC scores for predicting the survival time of GBM patients, with a shorter running time.
(4) Joint prediction model. The joint model is obtained by combining the feature selection algorithm with the classification algorithm. Its accuracy in predicting the survival time of GBM patients improves on existing research results that use similarly structured data.
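The pipeline described in the abstract (preprocessing, penalized logistic regression for feature selection, a support vector machine for classification, AUC for evaluation) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the thesis's actual implementation. The abstract does not state the penalty type, kernel, hyperparameters, missing-value cutoff, or imputation strategy, so the L1 penalty, RBF kernel, `C` values, 30% cutoff, and median imputation below are all assumptions. The data are synthetic and stand in for the TCGA GBM data, and every function name is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def label_survival(months, threshold=12.0):
    """Binary target: 1 if survival exceeds `threshold` months, else 0."""
    return (np.asarray(months, dtype=float) > threshold).astype(int)


def drop_sparse_features(X, max_missing=0.3):
    """Drop columns whose missing fraction exceeds `max_missing` (assumed cutoff)."""
    keep = np.isnan(X).mean(axis=0) <= max_missing
    return X[:, keep]


def build_joint_model():
    """Imputation, standardization, penalized-LR feature selection, then an SVM."""
    # Assumption: an L1 penalty, which drives some coefficients to exactly zero
    # so that SelectFromModel can discard the corresponding features.
    selector = SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    )
    return Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # assumed fill strategy
        ("scale", StandardScaler()),
        ("select", selector),
        ("classify", SVC(kernel="rbf", C=1.0, gamma="scale")),  # assumed kernel
    ])


def mean_cv_auc(X, y, n_splits=5, seed=0):
    """Mean ROC AUC over stratified k-fold cross-validation."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_val_score(build_joint_model(), X, y, cv=cv, scoring="roc_auc")
    return float(scores.mean())


def make_synthetic_cohort(seed=0):
    """Synthetic stand-in for patient data, with about 5% of values missing."""
    X, y = make_classification(
        n_samples=200, n_features=50, n_informative=8, random_state=seed
    )
    rng = np.random.default_rng(seed)
    X[rng.random(X.shape) < 0.05] = np.nan
    return X, y


if __name__ == "__main__":
    print(label_survival([3.5, 12.0, 18.2]))  # labels for three example patients
    X, y = make_synthetic_cohort()
    X = drop_sparse_features(X)
    print(f"mean cross-validated AUC: {mean_cv_auc(X, y):.3f}")
```

Placing imputation, scaling, and feature selection inside the `Pipeline` means they are refit on each training fold, so the cross-validated AUC is not inflated by information leaking from the held-out fold.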
Keywords/Search Tags:Prognosis Prediction, Glioblastoma Multiforme, Logistic Regression, Support Vector Machine