The purpose of named entity recognition (NER) is to identify specific categories of entities in large volumes of unstructured information. In the medical field, NER can recognize and integrate key information from multi-source, heterogeneous Chinese medical texts, producing high-quality structured information that provides basic data for applications such as decision-making and intelligent consultation. In recent years, research on entity recognition has mostly focused on how to represent text features such as characters and words in data sets. However, because entity types in medical texts are strongly correlated and highly specialized, many problems remain in Chinese medical named entity recognition:

1) Current research on medical NER does not consider the correlation between text representation and the relationships among medical entities, and makes no use of the relationship information between entities.
2) Existing methods do not account for the specialized nature of entities in medical texts; they lack both a representation of medical terminology knowledge and a way to integrate that knowledge with the text.
3) There is no text annotation system for the medical vertical domain, which makes it difficult to combine model pre-annotation with manual annotation effectively. In addition, because few data sets have been constructed for knowledge graphs of medical subfields, refined professional medical data sets are difficult to obtain.

In view of these problems and challenges, the main work of this paper is as follows:

1) A Chinese medical NER method based on information enhancement from inter-entity relationships is proposed. Using the specific relationships that hold between preceding and following entities, the method integrates inter-entity relationship information into the text representation and uses a weight allocation strategy to combine the relationship information between medical entities with the text information effectively, improving medical named entity recognition.
2) A Chinese medical NER method based on bidirectional joint embedding of encyclopedia knowledge and the original text is proposed. Baidu Encyclopedia knowledge about medical entities is used to supplement the information on entities of specific categories in the data set, and a bidirectional joint network enhances the semantic representation of the original text to improve entity recognition.
3) A visual, easy-to-use data annotation system for Chinese medical knowledge graphs is built. Based on the two methods above, the system combines model pre-annotation with manual annotation to support entity category annotation, relation category annotation, and joint annotation of Chinese medical texts. This enables the generation of high-quality professional data sets and the iterative use of data, and supports the construction of medical knowledge graphs.
4) The structure of a hypertension knowledge graph is designed and the corresponding data set is constructed. Using the above system, multi-source, massive, heterogeneous, cross-domain medical text data are effectively fused into a knowledge graph structure for the hypertension field with a clear hierarchy and clear classification, achieving efficient integration of hypertension text data and construction of the data set.

Finally, the platform and the data set constructed in this paper have been applied in the National Natural Science Foundation of China project "Research on knowledge graph fusion and knowledge graph completion of Graph Neural Network for medical knowledge graph", and in the "Intelligent decision-making system for hypertension medication" in cooperation with the Hypertension Department of Beijing Anzhen Hospital.
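
The abstract does not specify how the weight allocation strategy in contribution 1) is computed. As a purely illustrative sketch, and not the method of this paper, one common way to combine two representations is to normalize a pair of relevance scores with a softmax and take the weighted sum of the vectors. The function names `softmax` and `fuse`, the scalar scores, and the toy vectors below are all assumptions introduced for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary real scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_vec, relation_vec, text_score, relation_score):
    """Weighted sum of a text vector and a relation vector.

    Hypothetical helper: the two scores stand in for whatever learned
    relevance signal a real model would produce for each source.
    """
    if len(text_vec) != len(relation_vec):
        raise ValueError("vectors must have the same dimension")
    w_text, w_rel = softmax([text_score, relation_score])
    return [w_text * t + w_rel * r for t, r in zip(text_vec, relation_vec)]


# Toy example: equal scores give equal weights to both sources.
fused = fuse([1.0, 0.0], [0.0, 1.0], text_score=0.0, relation_score=0.0)
print(fused)
```

In a trained model the scores would typically be produced by learned parameters per token rather than passed in by hand; the sketch only shows the normalization and mixing step.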