
Tumor Subtype Discovery Based On Cancer Genomics Data

Posted on: 2021-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhang
GTID: 2404330620965626
Subject: Computer Science and Technology

Abstract/Summary:
Cancer has become one of the diseases that most seriously endanger human life, so studying how it arises is essential. With the rapid progress of high-throughput sequencing technology, many types of tumor genomic data are being produced, which brings new opportunities for analyzing cancer pathogenesis. Clinical studies have found that cancer patients with the same pathological characteristics can have significantly different prognoses under the same treatment regimen, so the further discovery of cancer subtypes has important research value. One effective way to discover cancer subtypes is clustering, which can integrate multi-omics data to identify clinically relevant tumor subtypes. This thesis focuses on clustering algorithms for the molecular data of the cancer genome. The main research work is as follows.

The first method is an integrated clustering method based on the similarity between clusters and the weights of patients. First, we generate many different base clusterings by randomly varying the two parameters of the scaled exponential similarity kernel function, and calculate the weight of each sample according to how difficult it is to cluster in the base clusterings. Then, the sample weights are incorporated into the calculation of the similarity between sample clusters. Finally, a graph-based method is used to obtain the final clustering result.

The second method is a clustering algorithm based on adaptive subspace alignment. First, we extract a subspace from the similarity network of each omics data type, which maps the high-dimensional multi-omics data to a low-dimensional subspace containing the main biological information. We then use an adaptive optimization algorithm to merge the subspaces of the individual omics data types into a fused subspace. Finally, the k-means algorithm is applied to the fused subspace to obtain the sample subtypes.

Experimental results on labeled datasets and clinical cancer datasets show that both algorithms are more effective than existing methods.
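
The pipeline of the second method (per-omics subspace extraction, subspace fusion, k-means) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not give the similarity measure, the subspace definition, or the adaptive optimization, so the sketch assumes a Gaussian similarity, takes the top-k eigenvectors of the normalized similarity matrix as the subspace, and replaces the adaptive merging with a fixed, equal-weight average of projection matrices. All function names and parameters here are hypothetical.

```python
import numpy as np

def similarity(X, sigma=0.5):
    # Gaussian similarity between samples (rows of X); the kernel is an assumption.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2 * d2.mean()))

def subspace(W, k):
    # Top-k eigenvectors of D^{-1/2} W D^{-1/2} (assumed subspace definition).
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(S)          # eigenvalues come back in ascending order
    return vecs[:, -k:]

def fuse(subspaces, k, weights=None):
    # Merge subspaces via a weighted sum of projection matrices U U^T.
    # Fixed weights stand in for the adaptive optimization, which is not specified.
    if weights is None:
        weights = np.full(len(subspaces), 1.0 / len(subspaces))
    P = sum(w * (U @ U.T) for w, U in zip(weights, subspaces))
    _, vecs = np.linalg.eigh(P)
    return vecs[:, -k:]

def kmeans(Z, k, n_iter=100):
    # Plain Lloyd's algorithm with deterministic farthest-first initialization.
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def discover_subtypes(views, k):
    # views: list of (samples x features) arrays, one per omics type, same sample order.
    subspaces = [subspace(similarity(X), k) for X in views]
    Z = fuse(subspaces, k)
    norms = np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
    return kmeans(Z / norms, k)

# Synthetic usage example: two "omics" views of 40 samples with two planted groups.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
view_a = rng.normal(size=(40, 50)) + 4.0 * truth[:, None]
view_b = rng.normal(size=(40, 30)) - 4.0 * truth[:, None]
labels = discover_subtypes([view_a, view_b], k=2)
```

Averaging projection matrices treats every omics type as equally informative. An adaptive scheme such as the one the abstract describes would instead learn how much each subspace contributes, which is the part this sketch leaves out.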
Keywords/Search Tags:Multi-omics data, Cancer subtype, Integrated clustering algorithm, Subspace adaptive optimization clustering