| Long non-coding RNA (lncRNA) was long regarded by scientists as transcriptional noise in the genetic processes of cells. However, recent studies have shown that lncRNAs play an important role in the occurrence of many diseases. Researchers hope that analyzing the relationship between lncRNAs and diseases will reveal how diseases arise and develop and will help in formulating effective treatments. Traditional biological experiments for verifying lncRNA-disease relationships are time-consuming and expensive, so an effective method is needed to identify potential lncRNA-disease relationships and thereby target verification experiments more precisely. To this end, many bioinformatics researchers have built models that predict lncRNA-disease associations. These models can identify the lncRNAs most likely to be relevant to a given disease, which helps experimental biologists decide which lncRNAs to study. Existing models already achieve good predictive performance, but as research on lncRNAs and diseases deepens, more and more relevant data can be incorporated into predictive models to improve their accuracy. In this paper, locality-constrained linear coding (LLC) and label propagation (LP) are combined to build an lncRNA-disease association prediction model called LLCLPLDA. The main research contents of this article are as follows: (1) The data needed to construct the model are downloaded from the relevant databases, including the known lncRNA-disease associations, the lncRNA expression similarity, and the disease semantic similarity. (2) Locality-constrained linear coding is used to project the features of lncRNAs and diseases into locality-constrained features, from which lncRNA and disease feature similarity matrices are constructed. A label propagation strategy is then used to combine the initial association matrix with these feature similarity matrices, yielding the final matrix of potential lncRNA-disease association scores. Locality-constrained linear coding retains the original information well, and label propagation balances the original information against the supplementary information. (3) To verify the effectiveness of the model, it is compared with five advanced methods under global leave-one-out cross-validation and five-fold cross-validation. The results show that the proposed model performs best. In addition, case studies of cervical cancer, glioma, and breast cancer were carried out, and most of the potential lncRNAs predicted by the model have been confirmed to be related to these diseases. (4) To address the shortcoming that label propagation does not apply well to new lncRNAs and diseases that have no known associations, a weighted nearest neighbor method is added. This method supplements the original association matrix so that the model can make predictions for new lncRNAs and diseases. |
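
The label propagation step described in (2) can be illustrated with a minimal sketch. The abstract does not give the update rule, so the code below assumes the commonly used iteration F ← αSF + (1 − α)Y, where S is a row-normalized similarity matrix, Y is the known association matrix, and α controls the balance between propagated and original information. The function names, the value of α, and the toy matrices are all hypothetical and are not taken from LLCLPLDA itself.

```python
def row_normalize(sim):
    """Scale each row of a non-negative similarity matrix to sum to 1."""
    normalized = []
    for row in sim:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def label_propagation(sim, labels, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y until F stops changing.

    Assumed update rule (not stated in the abstract). With a row-normalized
    S and 0 <= alpha < 1 the iteration is a contraction, so it converges.
    """
    s = row_normalize(sim)
    scores = [[float(v) for v in row] for row in labels]
    for _ in range(max_iter):
        propagated = matmul(s, scores)
        updated = [
            [alpha * p + (1 - alpha) * y for p, y in zip(p_row, y_row)]
            for p_row, y_row in zip(propagated, labels)
        ]
        change = max(
            abs(u - f)
            for u_row, f_row in zip(updated, scores)
            for u, f in zip(u_row, f_row)
        )
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical toy data: 3 lncRNAs x 2 diseases.
lnc_similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
known_associations = [
    [1, 0],  # lncRNA 0 is known to be associated with disease 0
    [0, 0],  # lncRNA 1 has no known associations
    [0, 1],  # lncRNA 2 is known to be associated with disease 1
]
scores = label_propagation(lnc_similarity, known_associations, alpha=0.5)
```

In this toy example, lncRNA 1 has no known associations but is highly similar to lncRNA 0, so it receives a higher score for disease 0 than for disease 1. A larger α gives more weight to information propagated through the similarity matrix, and a smaller α keeps the scores closer to the known associations.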