CircRNA-Disease Association Prediction Based on Matrix Factorization

Posted on: 2022-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: H Chen
GTID: 2480306782453434
Subject: Biomedicine Engineering
Abstract/Summary:
With the development of bioinformatics, circular RNAs (circRNAs), which have covalently closed loop structures, have been shown to play important roles in various biological processes. CircRNAs are involved in physiological processes, and their dysregulation and mutation can lead to disease. Predicting disease-related circRNAs can therefore facilitate biomarker identification as well as disease prevention and treatment. However, the molecular mechanisms and functions of circRNAs in disease progression and pathogenesis remain unclear. Research on circRNA-disease associations depends heavily on biological experiments, which are time-consuming and expensive, so there is an urgent need for computational methods that infer potential circRNA-disease associations. In recent years, some methods that combine matrix factorization with deep matrix factorization have been proposed. However, a considerable amount of relevant biological data has not been fully utilized, and there is still much room for improvement.

To make full use of the relevant biological data and to apply deep matrix factorization to predict potential circRNA-disease associations with better performance, this thesis proposes DMFMSF, a method based on deep matrix factorization with multisource fusion. DMFMSF first selects several useful sources of circRNA and disease similarity information and integrates them by similarity kernel fusion (SKF). Next, the impact of missing (unknown) associations on the method is reduced by weighted K nearest known neighbors (WKNKN) preprocessing. Finally, potential circRNA-disease associations are inferred by mining linear and nonlinear features using singular value decomposition (SVD) and deep matrix factorization. The performance of DMFMSF was evaluated by leave-one-out cross-validation (LOOCV) and five-fold cross-validation (5-fold CV) on two benchmark datasets. Under LOOCV, the AUC values of DMFMSF on the two datasets are 0.945 and 0.938, respectively. Under 5-fold CV, the AUC values are 0.920±0.001 and 0.912±0.002, respectively. The experimental results show that DMFMSF outperforms several existing computational methods. In addition, case studies were carried out on five important diseases: hepatocellular carcinoma, breast cancer, acute myeloid leukemia, colorectal cancer, and coronary artery disease. The results suggest that DMFMSF can serve as an accurate and effective computational tool for predicting circRNA-disease associations.

The advantage of the attention mechanism is that it can be used within a deep framework to emphasize important factors and ignore unimportant ones. DMFMSF is further optimized by replacing the deep matrix factorization method with an attention mechanism, yielding the new model DMFMSF-X. Its performance is likewise evaluated by LOOCV and 5-fold CV. Under LOOCV, the AUC values of DMFMSF-X on the two benchmark datasets are 0.949 and 0.941, respectively. Under 5-fold CV, the AUC values are 0.923±0.001 and 0.915±0.001, respectively. These results show that the improved model DMFMSF-X outperforms DMFMSF. In addition, case studies showed that DMFMSF-X was effective in predicting circRNAs associated with cervical cancer, lung cancer, and glioma.
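As an illustration of two of the steps named in the abstract, the following is a minimal sketch of WKNKN preprocessing followed by a truncated SVD, assuming the commonly described WKNKN formulation (decayed, similarity-weighted averaging over the K most similar entities that have at least one known association). The abstract does not give the thesis's exact variant, so the function name `wknkn`, the parameters `K` and `eta`, the toy matrices, and the chosen rank are all hypothetical and are not taken from DMFMSF itself.

```python
import numpy as np

def wknkn(Y, S_row, S_col, K=2, eta=0.5):
    """Sketch of weighted K nearest known neighbors (assumed formulation).

    Y     : (n_circ, n_dis) binary association matrix (1 = known association)
    S_row : (n_circ, n_circ) circRNA similarity matrix
    S_col : (n_dis, n_dis) disease similarity matrix
    K, eta: neighbor count and decay factor (hypothetical defaults)
    """
    Y = np.asarray(Y, dtype=float)
    decay = eta ** np.arange(K)  # eta^0, eta^1, ... for the 1st, 2nd, ... neighbor

    def neighbor_profiles(A, S):
        # "Known" entities are assumed to be those with at least one association.
        known = np.flatnonzero(A.sum(axis=1) > 0)
        out = np.zeros_like(A)
        for i in range(A.shape[0]):
            cand = known[known != i]          # never use an entity as its own neighbor
            if cand.size == 0:
                continue
            nn = cand[np.argsort(-S[i, cand])[:K]]  # K most similar known entities
            sims = S[i, nn]
            z = sims.sum()                    # normalization term
            if z > 0:
                out[i] = (decay[:nn.size] * sims) @ A[nn] / z
        return out

    Y_r = neighbor_profiles(Y, S_row)         # estimate from similar circRNAs
    Y_c = neighbor_profiles(Y.T, S_col).T     # estimate from similar diseases
    # Known associations are kept; zeros are replaced by the averaged estimate.
    return np.maximum(Y, (Y_r + Y_c) / 2.0)

# Toy data (invented for illustration): circRNA 2 and disease 2 have no known associations.
Y_toy = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, 0]], dtype=float)
S_circ = np.array([[1.0, 0.2, 0.8],
                   [0.2, 1.0, 0.3],
                   [0.8, 0.3, 1.0]])
S_dis = np.array([[1.0, 0.5, 0.1],
                  [0.5, 1.0, 0.4],
                  [0.1, 0.4, 1.0]])

filled = wknkn(Y_toy, S_circ, S_dis, K=2, eta=0.5)

# Truncated SVD of the filled matrix gives a low-rank (linear) score matrix.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
rank = 2  # hypothetical rank
scores = (U[:, :rank] * s[:rank]) @ Vt[:rank]
```

In this toy case the all-zero row for circRNA 2 receives nonzero likelihood values borrowed from its most similar circRNAs, which is the sense in which WKNKN lessens the effect of unknown associations before factorization.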
Keywords/Search Tags: circRNA, disease, similarity kernel fusion, deep matrix factorization, attention mechanism