| A growing number of studies have shown that long non-coding RNAs (lncRNAs) play key roles in various biological processes and are associated with a variety of complex diseases. Although biological experiments and clinical studies can identify relationships between lncRNAs and diseases, they are costly and time-consuming. Designing computational methods that use the limited set of known lncRNA-disease relationships to predict potential ones has therefore become an effective way to overcome the cost and time demands of traditional biological experiments, and it is a current research hotspot. In recent years, researchers have proposed various computational methods to predict lncRNA-disease associations, among which machine learning-based methods are an important category. Although machine learning-based prediction methods have achieved good results, most of them use only a single similarity measure for diseases and for lncRNAs, and thus suffer from a single data source, sparse data, and low prediction accuracy. To alleviate data sparsity and improve the accuracy of lncRNA-disease association prediction, this paper proposes an lncRNA-disease relationship prediction method based on manifold-regularized non-negative matrix factorization. The specific research work includes: (1) A summary of lncRNA and disease similarities. This paper summarizes several similarity measures for lncRNAs and diseases, including lncRNA expression similarity, lncRNA functional similarity, lncRNA Gaussian kernel similarity, disease semantic similarity, disease cosine similarity, and disease Gaussian kernel similarity. (2) A prediction method for lncRNA-disease relationships based on manifold-regularized non-negative matrix factorization (MRNMFLDA) is proposed. This method uses similarity network fusion to separately integrate
the two similarities for lncRNAs and for diseases, which effectively fuses the two data sources and alleviates the data sparsity of a single similarity matrix. The method then predicts potential lncRNA-disease relationships by constructing a label weighting matrix and introducing a non-negative matrix factorization algorithm with manifold regularization constraints. It takes the geometric structure of the data into account, effectively prevents overfitting, and significantly improves prediction performance. (3) Evaluation of the prediction method. To verify the performance of MRNMFLDA, this paper first uses leave-one-out cross-validation to tune the six parameters of MRNMFLDA. Then, using AUC, AUPR, PRE, SEN, ACC, F1 score, and MCC as evaluation metrics under leave-one-out cross-validation and five-fold cross-validation, MRNMFLDA is compared with four relatively advanced existing methods (PMFILDA, LDA-LNSUBRW, DSCMF, BRWLDA). The experimental results show that MRNMFLDA scores higher than the other four methods on these metrics, achieving superior predictive performance. In addition, case studies show that MRNMFLDA can effectively predict lncRNAs related to three diseases (lung cancer, cervical cancer, and osteosarcoma). |
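
The abstract does not give the objective function, so the following is only a minimal sketch of how a manifold-regularized (graph-regularized) non-negative matrix factorization could score lncRNA-disease pairs. It assumes the widely used dual graph-regularized formulation with multiplicative updates, and it omits the label weighting matrix and the similarity network fusion step; it is not the authors' MRNMFLDA implementation. The function names, the toy association matrix `Y`, and the parameters `rank`, `lam`, and `bandwidth` are all hypothetical.

```python
import numpy as np

def gaussian_profile_similarity(Y, bandwidth=1.0):
    """Gaussian kernel similarity between the rows (interaction profiles) of Y."""
    sq_norms = (Y ** 2).sum(axis=1)
    # Normalize the kernel width by the mean squared profile norm.
    gamma = bandwidth / max(sq_norms.mean(), 1e-12)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Y @ Y.T)
    return np.exp(-gamma * np.clip(sq_dists, 0.0, None))

def graph_regularized_nmf(Y, S_row, S_col, rank=2, lam=0.1, n_iter=200, seed=0):
    """Approximately minimize ||Y - U V^T||_F^2
    + lam * (tr(U^T L_row U) + tr(V^T L_col V)) subject to U, V >= 0,
    where L = D - S is the graph Laplacian of each similarity matrix.
    Assumed objective, not taken from the paper."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = Y.shape
    U = rng.random((n_rows, rank))
    V = rng.random((n_cols, rank))
    D_row = np.diag(S_row.sum(axis=1))  # degree matrix of the row graph
    D_col = np.diag(S_col.sum(axis=1))  # degree matrix of the column graph
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # Multiplicative updates keep U and V non-negative.
        U *= (Y @ V + lam * S_row @ U) / (U @ V.T @ V + lam * D_row @ U + eps)
        V *= (Y.T @ U + lam * S_col @ V) / (V @ U.T @ U + lam * D_col @ V + eps)
    return U, V

# Hypothetical toy data: 4 lncRNAs x 3 diseases, 1 = known association.
Y = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 1, 1]], dtype=float)
S_lnc = gaussian_profile_similarity(Y)    # lncRNA-lncRNA similarity
S_dis = gaussian_profile_similarity(Y.T)  # disease-disease similarity
U, V = graph_regularized_nmf(Y, S_lnc, S_dis)
scores = U @ V.T  # predicted association scores for every lncRNA-disease pair
```

In this sketch, unknown pairs (zeros in `Y`) with high values in `scores` would be the candidate associations. The Laplacian terms pull lncRNAs (or diseases) with similar profiles toward similar latent factors, which is the sense in which the regularizer uses the geometric structure of the data.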