| Recently, the morbidity and mortality of breast carcinoma have both been increasing gradually. As a result, precise prognostic prediction for breast cancer patients has become a hot issue in cancer research: it helps not only to avoid overtreatment and the waste of medical resources, but also to provide a scientific basis that assists medical staff and patients’ family members in making the right medical decisions. Breast cancer is a malignant tumor disease whose emergence and development are closely related to genes. With the advancement of DNA sequencing techniques, large-scale omics data have been accumulated in the field of bioinformatics, which lays a solid foundation for researchers to comprehensively understand biological processes. In the study of breast cancer survival prediction, gene expression data reflect the biological characteristics of tumors at the microscopic biological level, which has important application value for cancer prognosis and treatment. Clinical data contain abundant pathological features, which provide a theoretical basis for survival prediction of breast cancer patients. How to effectively integrate gene expression data and clinical data for prognostic prediction of breast cancer is therefore an urgent problem in the field of cancer survival prediction. However, existing breast cancer survival prediction models tend to use a single feature selection method to extract features from gene expression data and then conduct simple splicing (concatenation) and fusion of the feature data, which can easily lead to the loss of important gene information and the neglect of correlations between omics data. Such methods therefore have clear limitations. Building on existing research on breast cancer survival prediction, this paper proposes a model based on deep learning and omics data fusion. Firstly, an improved nonnegative matrix factorization algorithm (Multi_NMF) is proposed to extract the feature genes related to breast cancer survival. Then, a deep neural network based on an attention
mechanism (AMND) is constructed to fuse gene expression data and clinical data. Finally, building on the above work, a deep neural network model based on multi-scale feature fusion (MFFD) is introduced. The experimental results show that, compared with existing methods, the proposed method achieves better prediction performance. The main research contents of this article are summarized as follows: (1) Based on the nonnegative matrix factorization (NMF) algorithm, the Multi_NMF feature selection algorithm is proposed. This method can not only extract high-level features of gene expression data, but also avoid the problems of sparsity and loss of important feature information caused by matrix decomposition. The experimental results show that the improved Multi_NMF method can select more informative genes related to breast cancer prognosis, and thus more accurate predictions can be obtained. (2) In order to explore the effectiveness of omics data for breast cancer survival prediction, this paper proposes a deep neural network model based on omics data and an attention mechanism (AMND) to combine gene expression data and clinical data. As an initial attempt to apply the attention mechanism to breast cancer prognosis, the AMND method is able to better account for the connection between clinical data and gene expression data and to adaptively fuse feature genes obtained from different feature extraction methods, improving the accuracy of breast cancer survival prediction. The experimental results show that the AMND method can accurately predict the survival time of breast cancer patients. (3) In order to solve the problem that the model cannot learn effectively due to the small number of samples, this paper proposes a deep neural network model based on multi-scale feature fusion (MFFD). It combines features of the omics data at different granularities and thus contains more feature information. Through performance evaluation on the test set, experimental results show that MFFD further improves breast cancer survival prediction
performance. |
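
For orientation, the baseline that contribution (1) starts from can be sketched as follows. This is plain NMF with Lee-Seung multiplicative updates, not the Multi_NMF algorithm itself, whose details are not given in this abstract. The function names `nmf` and `top_genes`, the ranking rule (largest loading of each gene across components), the rank, and the random toy matrix are all illustrative assumptions.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix X (samples x genes) as W @ H.

    Uses the standard Lee-Seung multiplicative updates for the
    Frobenius-norm objective. This is generic NMF, not Multi_NMF.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_genes = X.shape
    W = rng.random((n_samples, rank)) + eps
    H = rng.random((rank, n_genes)) + eps
    for _ in range(n_iter):
        # Update the gene loadings, then the per-sample coefficients.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def top_genes(H, k):
    """Return indices of the k genes with the largest loading on any
    component (an assumed, simple ranking rule for illustration)."""
    scores = H.max(axis=0)
    return np.argsort(scores)[::-1][:k]

# Toy stand-in for an expression matrix: 20 hypothetical patients,
# 50 hypothetical genes, nonnegative values.
rng = np.random.default_rng(1)
X = rng.random((20, 50))

W, H = nmf(X, rank=4)
selected = top_genes(H, k=10)
print("selected gene indices:", selected)
```

In a pipeline of the kind described above, the columns of `X` indexed by `selected` would be the gene features passed on to the fusion network together with the clinical features; how Multi_NMF modifies this baseline to counter sparsity and information loss is specified in the body of the paper, not here.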