
Research And Application Of Knowledge Extraction Method Of Chinese Herb Literature

Posted on: 2021-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: S Wang
Full Text: PDF
GTID: 2428330611972222
Subject: Software engineering
Abstract/Summary:
With the modernization of Chinese medicine, the literature on Chinese herbal medicine has grown rapidly. Much knowledge about Chinese herb entities and their relations is hidden in these texts. Mining meaningful entity relations from unstructured text is a current research hotspot in information extraction, and it is also the basis for constructing knowledge bases or entity relation networks (knowledge graphs, KG). However, few studies have addressed this problem so far. The limitations of existing research fall into three areas. First, research on relation extraction is mostly based on Chinese corpora, although the English literature also contains herbal knowledge. Second, extraction methods are mostly based on traditional algorithms and their accuracy is not very high, so further research in combination with deep learning is needed. Third, the extraction results should be combined with domain knowledge for further applications. The main work of this thesis is therefore as follows:

1. English articles related to Chinese herbal medicine were retrieved and collected from the PubMed database. Based on how the literature describes the relationships between traditional Chinese medicine and other entities, two directional relationships are defined: one between traditional Chinese medicine and disease, and one between traditional Chinese medicine and chemical substances. With the help of medical workers, an entity relation extraction corpus was constructed to support the research on relation extraction.

2. To improve the accuracy of entity relation extraction for Chinese herbal medicine, this thesis studies algorithms based on deep learning. First, the SEGATT-CNN model is proposed. Its innovation is the SEGATT layer, which applies a segmented attention mechanism to segmented input features. For model training, a cross-entropy loss function with weight coefficients is designed. Second, to make further use of the high-order feature tensor, a relation classification method based on mixed features is designed and implemented. This method obtains high-order semantic features by pre-training a deep learning model, and then combines the feature vectors with different classifiers to improve the accuracy of relation classification.

3. The main entity concepts and relationships in the field of Chinese medicine are identified and acquired. Combined with the entity relations extracted in Chapter 4, an entity relation network with Chinese medicine as its core is designed and constructed, and the entity relations extracted from English text are connected to this network. First, a top-level data model is defined according to the knowledge system of traditional Chinese medicine; it specifies the related entities and relationships. The entities include Chinese herbal medicines, syndromes, diseases, prescriptions, etc.; the relationships include cure, composition, phenomenon expression, etc. Then the entity relations defined in the top-level data schema are extracted to instantiate and populate the data. Next, through the construction of a thesaurus and a Chinese-English mapping, the relation triples extracted from the English literature are connected to the entity relation network, integrating the entities and relationships. Finally, Chinese medicine experts verify the correctness of the entity relation network.

The work of this thesis is verified in two ways. First, experiments were carried out on three data sets to verify the performance of the model. The experimental results show that: 1. When the method is applied to extracting the relationships between herbal medicine and disease and between herbal medicine and chemical substances, the model achieves good results in a comparative analysis with other related methods. 2. In further verification on the BioCreative V data set, compared with current models that use deep learning methods for feature extraction, the method designed in this thesis has an F value about 2.7% higher than the best result. Second, to facilitate retrieval and use of the constructed knowledge base, this thesis designs and completes a visual retrieval system. The platform allows domain experts to manage entity knowledge and relationships. For users, it provides retrieval functions such as TCM entity knowledge retrieval and entity relation query, which simplify the display and retrieval of the entity relation network and let users view and study entity relations more directly.
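
The abstract mentions a cross-entropy loss function with weight coefficients but does not give its form. The following is a minimal sketch of one common formulation, in which each example's loss is scaled by a per-class weight and the sum is normalized by the total weight. The function names, the normalization choice, and the weight values are assumptions for illustration and are not taken from the thesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits_batch, labels, class_weights):
    """Class-weighted cross-entropy averaged over a batch.

    Assumption: each example's negative log-likelihood is multiplied by the
    weight of its true class, and the result is divided by the sum of the
    weights used (one common convention; the thesis may differ).
    """
    total_loss = 0.0
    total_weight = 0.0
    for logits, label in zip(logits_batch, labels):
        probs = softmax(logits)
        w = class_weights[label]
        total_loss += -w * math.log(probs[label])
        total_weight += w
    return total_loss / total_weight


# Hypothetical two-class batch: class 1 is given a larger weight,
# for example because it is rarer in the corpus.
batch_logits = [[2.0, 0.0], [0.0, 0.0]]
batch_labels = [0, 1]
uniform = weighted_cross_entropy(batch_logits, batch_labels, [1.0, 1.0])
upweighted = weighted_cross_entropy(batch_logits, batch_labels, [1.0, 3.0])
print(uniform, upweighted)
```

With uniform weights the function reduces to the ordinary mean cross-entropy. Raising the weight of a class makes errors on that class contribute more to the loss, which is a typical way to counter class imbalance in relation classification corpora.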
Keywords/Search Tags: TCM, PubMed, herbal, entity relation extraction, entity relation network