
Research On Knowledge Extraction From Scientific Literature Of TCM Pulse Diagnosis

Posted on: 2022-05-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Liu
GTID: 1484306320982649
Subject: Chinese medicine informatics
Abstract/Summary:
Objective: This dissertation systematically organizes the knowledge system of the pulse diagnosis literature, extracts knowledge from it, constructs a knowledge graph (knowledge map), and displays it visually. It also explores knowledge extraction and its applications in the field of Traditional Chinese Medicine (TCM), so as to provide knowledge services for pulse diagnosis.

Methods: First, text mining was used to organize the knowledge in the pulse diagnosis literature. After discussing the concept of pulse diagnosis, establishing a retrieval strategy, and collecting the literature, we constructed a domain dictionary semi-automatically to improve Jieba word segmentation on TCM text, trained word2vec word vectors on TCM text for synonym merging, built a term-document matrix, and used Gephi and ECharts to draw relationship diagrams and theme river diagrams, respectively, for visual analysis. Second, deep learning was used for entity recognition in the literature data. The entity annotation scheme and guidelines were determined, and the pre-trained model ALBERT was then trained on open TCM text resources. After part of the TCM named entity data had been labeled, an ALBERT-BiLSTM-CRF entity recognition model was trained to recognize the entities in the pulse diagnosis literature. Third, deep learning was again used for relation extraction of pulse diagnosis knowledge. After entity recognition, sentences containing entity pairs were classified into four relations, and a TextRCNN relation classification model was trained with ALBERT word vectors as the text representation. Fourth, the relation classification results were used to construct the knowledge graph in Neo4j, the extracted pulse diagnosis knowledge was displayed flexibly and visually, and a knowledge Q&A system and an intelligent pulse diagnosis system were built on the edge weights and conditional probabilities of the knowledge graph.

Results:
1. A new definition of pulse diagnosis is proposed: pulse diagnosis should be defined as the method of identifying TCM diseases and syndromes using pulse signals obtained from the superficial arteries of the human body.
2. We collected 7969 articles on pulse diagnosis from CNKI and 5048 from the Wanfang database, along with 264 pulse type terms, 17926 clinical manifestation terms, 3143 syndrome types, 134 pulse map parameter types, and 53106 prescription and treatment terms. word2vec word vectors were trained on published data from about 25000 TCM articles, and 522 synonyms were obtained by combining the vectors with edit distance. Visual analysis showed that "Objectification of Pulse Diagnosis" and "experience of Veteran TCM Physicians' pulse diagnosis" dominate pulse diagnosis research, and that research is also carried out around "pulse diagnosis knowledge of classic ancient books" and "pulse diagnosis knowledge in clinical cases".
3. In TCM entity recognition, a "disease" can be defined as a set of "symptoms" and a "prescription" as a set of "traditional Chinese herbs", so each pair can be combined into one entity type to reduce the burden on the model. After the entities in the literature data set were annotated and annotation consistency was evaluated, the trained ALBERT-BiLSTM-CRF entity recognition model achieved F1 scores of 0.8909, 0.8852, and 0.7758 on the medical problem, treatment, and syndrome entity types, respectively.
4. Based on entity recognition, 10081 samples were selected, and some of them were grouped and labeled. The trained TextRCNN relation classification model achieved 91.47% accuracy, with F1 values of 0.9440, 0.8252, 0.8571, and 0.9444 on the four relation classes.
5. Based on the relation extraction results, two semantic relations were inherited from the upper ontology of TCM, "phenomenon expression" and "treatment", and the knowledge graph of pulse diagnosis was constructed. On top of the knowledge graph, pulse diagnosis knowledge is displayed clearly, and an assisted learning system for pulse diagnosis knowledge was built. The system responds well to users' needs for pulse diagnosis knowledge queries and intelligent pulse recognition.

Conclusion:
1. The current concept of pulse diagnosis no longer matches practice. The development of the Objectification of Pulse Diagnosis has in effect expanded the extension of the concept of pulse diagnosis in TCM. The coming intelligent era will need a definition of pulse diagnosis that is in line with the times.
2. Research on the Objectification of Pulse Diagnosis and the mining of Veteran TCM Physicians' pulse diagnosis experience have been the two main topics in the field over the past half century. As objectification has advanced, the pulse map parameters obtained by electronic equipment have gradually been standardized, and the clinical significance of more and more pulse map parameters has been established. Research on the pulse diagnosis experience of Veteran TCM Physicians has revitalized the idea, found in the Huangdi Neijing, of selecting acupoints and performing acupuncture precisely under the guidance of pulse diagnosis, represented mainly by "pulse acupuncture" and the "renying cunkou pulse diagnosis method". At the same time, new guiding ideas in pulse science have emerged, such as "systematic syndrome differentiation pulse science" and "pingmai syndrome differentiation". From the perspective of knowledge acquisition, this work improves and enriches the knowledge system of pulse diagnosis.
3. Some terms in the field of TCM, such as pulse and syndrome terms, follow obvious patterns, and these patterns can be used to build domain dictionaries quickly. In named entity recognition, however, there are problems such as vague concepts, unclear entity categories and boundaries, and habitual shorthand. Treating a "disease", as a set of symptoms, as one kind of entity and a "prescription", as a set of "traditional Chinese medicines", as another kind of entity alleviates the problems caused by fuzzy concepts and the difficulty of classification. The deep learning model achieved good results, which is helpful for entity recognition in the field of TCM.
4. In this dissertation, TextRCNN was used to classify text in the field of TCM and thereby extract relations, with satisfactory results. This also suggests that deep learning methods can be applied to TCM knowledge extraction in the future. The good results indicate that the four-class relation classification scheme is reasonable and can serve as a reference for future text classification in the field of TCM.
5. This dissertation not only provides a probability-based knowledge graph of pulse diagnosis but also implements querying of the graph. The edge weights and conditional probabilities between entities can alleviate, to a certain extent, the uncertainty of semantic relations in TCM.
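
The abstract does not specify how edge weights and conditional probabilities are computed, so the following is only a minimal sketch under stated assumptions: the edge weight is taken to be the number of times an extracted (head, relation, tail) triple occurs, and the conditional probability is that weight divided by the total weight of all edges leaving the same head under the same relation. The function names, the relation labels, and the sample triples are hypothetical and are not taken from the dissertation.

```python
from collections import Counter, defaultdict


def build_weighted_graph(triples):
    """Turn extracted triples into edges with a weight and a conditional probability.

    Assumption (not from the source): weight = triple frequency, and
    prob = weight / total weight of edges sharing the same (head, relation).
    """
    edge_weight = Counter(triples)

    # Total weight of all edges that leave each (head, relation) pair.
    head_total = Counter()
    for (head, relation, _tail), weight in edge_weight.items():
        head_total[(head, relation)] += weight

    graph = defaultdict(dict)
    for (head, relation, tail), weight in edge_weight.items():
        graph[(head, relation)][tail] = {
            "weight": weight,
            "prob": weight / head_total[(head, relation)],
        }
    return graph


def rank_tails(graph, head, relation):
    """Return (tail, probability) pairs, most probable first, ties by name."""
    candidates = graph.get((head, relation), {})
    pairs = [(tail, data["prob"]) for tail, data in candidates.items()]
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical triples standing in for relation extraction output.
TRIPLES = [
    ("wiry pulse", "phenomenon_expression", "liver qi stagnation"),
    ("wiry pulse", "phenomenon_expression", "liver qi stagnation"),
    ("wiry pulse", "phenomenon_expression", "liver qi stagnation"),
    ("wiry pulse", "phenomenon_expression", "phlegm retention"),
    ("liver qi stagnation", "treatment", "Xiaoyao San"),
]

graph = build_weighted_graph(TRIPLES)
ranked = rank_tails(graph, "wiry pulse", "phenomenon_expression")
for tail, prob in ranked:
    print(f"{tail}: {prob:.2f}")
```

Ranking candidate answers by such a probability is one plausible way a Q&A system could express uncertainty instead of returning a single fixed answer; in a graph database the same two values could be stored as relationship properties.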
Keywords/Search Tags:pulse diagnosis, knowledge extraction, knowledge map, knowledge engineering, text mining