| Traditional Chinese medicine (TCM) symptoms are the vital basis of TCM syndrome differentiation and treatment. TCM physicians determine diseases or syndromes from information gathered through the four primary diagnostic methods (observation, listening, questioning, and pulse analysis). However, current TCM research pays more attention to the preparation of medicines and the prediction of TCM targets, while research on the molecular mechanisms of TCM symptoms is still at an early stage, with few studies available. Research on TCM symptom-gene associations and their molecular mechanisms is therefore significant. Effective prediction methods can be applied to predict TCM symptom-gene associations, and high-quality associations in turn help promote research on the molecular mechanisms and diversity of TCM symptoms. This paper focuses on the prediction of TCM symptom-gene associations and their molecular mechanisms. The main research contents are as follows: (1) Because direct associations between TCM symptoms and genes are lacking at the present stage, we constructed a large-scale heterogeneous network containing 9 types of nodes and 13 types of associations, covering TCM symptoms, genes, diseases, herbs, formulas, and other entities. From this network, a TCM symptom-gene association network can be obtained indirectly. Based on this network framework, we proposed a heterogeneous network-embedding algorithm, called PTsGene, for predicting TCM symptom-gene associations, and evaluated the model's performance experimentally. In addition, we conducted an overlap analysis of TCM symptoms based on gene features and deep features, respectively. The results indicated that the low-dimensional vectors of TCM symptoms and genes integrate the structural information of the heterogeneous network, and that PTsGene performed significantly better than the baseline algorithms. Meanwhile, functional homogeneity analysis, gene expression analysis of the candidate genes, and a case study confirmed that our algorithm can effectively help identify more reliable novel candidate genes. (2) To predict TCM symptom-gene associations more reliably, we constructed a knowledge graph of TCM symptom-gene associations by fusing entities and associations from various biomedical knowledge databases. We then proposed a knowledge graph completion method based on tensor decomposition, called KGPTsG, and applied it to the task of TCM symptom-gene association prediction. Finally, we evaluated the method experimentally. The results showed that, compared with PTsGene, which is based on heterogeneous network embedding, KGPTsG significantly improved performance on the TCM symptom-gene prediction task: PR and RE improved by 76.38% and 76.19%, respectively. In addition, we conducted a secondary data enhancement experiment on KGPTsG to explore the effect of integrating more TCM symptom-gene associations. Gene functional homogeneity analysis likewise showed that the predicted candidate genes are relatively reliable. (3) To explore the molecular mechanisms of TCM-specific symptoms (tongue and pulse symptoms), we selected several tongue and pulse symptoms that are common in TCM diagnosis and combined them with the candidate genes predicted by the above methods. We then carried out molecular pathway enrichment analysis based on Fisher's exact test. The results showed that several molecular pathways were highly correlated with tongue and pulse symptoms, which is conducive to promoting research on the molecular mechanisms of TCM symptoms. |
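
The pathway enrichment step in (3) can be illustrated with a minimal sketch of a one-sided Fisher's exact test for over-representation, which is equivalent to a hypergeometric tail probability. This is not the authors' implementation: the function name `fisher_enrichment_p`, the toy gene identifiers, and the choice of background gene universe are all assumptions made for illustration, and no multiple-testing correction is applied.

```python
from math import comb


def fisher_enrichment_p(candidates, pathway, background):
    """One-sided Fisher's exact test p-value for pathway over-representation.

    Hypothetical helper: returns P(X >= k), where X is the number of pathway
    genes among n genes drawn without replacement from the background.
    """
    background = set(background)
    # Restrict both gene sets to the background universe.
    candidates = set(candidates) & background
    pathway = set(pathway) & background

    N = len(background)            # genes in the universe
    K = len(pathway)               # pathway genes in the universe
    n = len(candidates)            # candidate genes for the symptom
    k = len(candidates & pathway)  # candidate genes that fall in the pathway

    # Sum the hypergeometric probabilities for overlaps of size k or larger.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Toy data (made up): 20 background genes, a 5-gene pathway, and
# 4 candidate genes for one symptom, 3 of which lie in the pathway.
background = [f"G{i}" for i in range(20)]
pathway = ["G0", "G1", "G2", "G3", "G4"]
candidates = ["G0", "G1", "G2", "G10"]

p_value = fisher_enrichment_p(candidates, pathway, background)
print(f"enrichment p-value: {p_value:.4f}")
```

In practice the test would be repeated for every pathway, and the resulting p-values would normally be adjusted for multiple comparisons before pathways are reported as associated with a symptom.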