| Cancer is a major threat to human health and is physiologically complex and heterogeneous. With the development of high-throughput sequencing technology, cancer subtyping based on multi-omics data has become an important research direction. Although some studies use deep learning and statistical methods to integrate multi-omics data, efficient integration methods are still lacking because of the highly unbalanced dimensionality and scale differences among omics data types and the high noise of bioinformatics data. To address these problems, this paper proposes three progressively refined multi-omics integration methods for cancer subtyping. The first is a deep network multi-omics integration method. It uses a deep autoencoder to learn a representation of the multi-omics data and takes the output of the autoencoder's bottleneck layer as that representation. Cancer subgroups are then obtained by clustering these compressed features with the K-means algorithm. Because the optimization objectives of autoencoder integration and clustering are inconsistent, this paper next introduces the idea of deep clustering on top of deep autoencoder integration and proposes a jointly optimized, feature-enhanced deep cancer subtyping method. After end-to-end pre-training of the autoencoder, the encoder is attached to a single-layer classification network to form a classifier. In each training iteration, samples first pass through the encoder to obtain compressed features, and K-means clustering of these features then yields pseudo-labels for supervised training of the classification network. This approach enhances the overall clustering ability of the model. Finally, to further integrate the clustering iterations with the network updates, a deep cancer subtyping method based on memory feature enhancement is proposed. It introduces a centroid memory and a sample memory to reduce the instability of the joint clustering optimization, and uses a delayed classification strategy to avoid classification collapse. Compared with previous studies that use deep learning integration, the three methods discover cancer subtypes with more significant survival differences, while avoiding the use of survival data for feature selection and reducing the reliance on clinical data. The three methods are validated on simulated datasets and real TCGA cancer datasets, and they achieve the best results among the compared methods: the fusion model of high-order path similarity network (HOPES), similarity network fusion (SNF), iClusterPlus, moCluster, and Clusternomics. The results show that the methods have strong cluster-mining ability on multi-omics data and generalize well to multiple cancers. They also show that jointly optimizing clustering with the deep network, and adding the memory mechanism, further improves clustering performance. Finally, the obtained cancer subtypes are biologically validated through differentially expressed gene analysis and pathway enrichment analysis. In summary, the proposed methods can combine information from multiple omics sources to accurately characterize and learn from multi-omics data, which provides a new approach for developing personalized treatment plans. |
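
The first method (autoencoder bottleneck features followed by K-means) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the toy data, the names `scale_layer`, `train_autoencoder`, and `kmeans`, the per-layer scaling, and the use of a single tanh hidden layer in place of a deep autoencoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 90 samples, 3 planted subtypes, two omics layers
# with very different feature counts (stand-ins for real omics matrices).
true_subtype = np.repeat(np.arange(3), 30)
layer_a = rng.normal(0, 2, (3, 100))[true_subtype] + rng.normal(0, 1, (90, 100))
layer_b = rng.normal(0, 2, (3, 20))[true_subtype] + rng.normal(0, 1, (90, 20))

def scale_layer(x):
    """Z-score each feature, then divide by sqrt(n_features) so each layer
    contributes comparable total variance regardless of its width
    (an illustrative choice, not the paper's preprocessing)."""
    z = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
    return z / np.sqrt(x.shape[1])

# Integrate the layers by concatenating the scaled feature matrices.
X = np.hstack([scale_layer(layer_a), scale_layer(layer_b)])

def train_autoencoder(X, n_hidden=8, epochs=500, lr=0.1):
    """Single-hidden-layer autoencoder trained by full-batch gradient descent.
    A deliberately shallow stand-in for a deep autoencoder."""
    n, d = X.shape
    W1 = rng.normal(0, 0.05, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.05, (n_hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)              # bottleneck features
        R = H @ W2 + b2                       # reconstruction of the input
        losses.append(0.5 * np.mean(np.sum((R - X) ** 2, axis=1)))
        dR = (R - X) / n                      # gradient of the loss w.r.t. R
        dH = (dR @ W2.T) * (1.0 - H ** 2)     # backpropagate through tanh
        W2 -= lr * (H.T @ dR); b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(axis=0)
    encode = lambda A: np.tanh(A @ W1 + b1)   # encoder half only
    return encode, losses

def kmeans(Z, k, iters=50):
    """Plain Lloyd's algorithm; returns cluster labels and centroids."""
    centroids = Z[rng.choice(len(Z), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):           # keep old centroid if cluster is empty
                centroids[j] = Z[labels == j].mean(axis=0)
    dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), centroids

encode, losses = train_autoencoder(X)
features = encode(X)                          # compressed multi-omics representation
pseudo_labels, centroids = kmeans(features, k=3)
print(features.shape, np.bincount(pseudo_labels, minlength=3))
```

In the second method described above, labels such as `pseudo_labels` would be recomputed in each training iteration and used as supervised targets for a classification layer placed on top of the encoder; that loop is omitted here to keep the sketch short.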