| In recent years, cardiovascular disease has become the leading cause of disease burden and death among residents of China. With the informatization and digitalization of medical services, a large volume of free text has accumulated in the medical field. How to extract information from these data, and how to manage and apply it, is a key issue in building intelligent medical care. This paper aims to automatically extract useful auxiliary diagnostic information from the cardiovascular clinical diagnosis and treatment literature, construct a cardiovascular knowledge graph from it, and apply the graph to an automatic question answering system that provides efficient and accurate cardiovascular information services for doctors and patients. The details are as follows: (1) Unstructured knowledge extraction based on multi-head labeling and negative sampling is studied. Because relations in Chinese clinical literature frequently overlap, a multi-head labeling strategy is adopted to improve the mainstream model TPLinker, and the problem of excessive negative samples is handled through negative sampling and dynamic weighting. To improve the generalization ability of the model and address the scarcity of annotated corpora, the text data are augmented through synonym expansion and parameter sharing. The proposed method increases the F1 score (the harmonic mean of precision and recall) by 5.73% on the cardiovascular clinical literature dataset and by 2.12% on the public dataset. (2) The entity alignment problem in knowledge extraction from multiple data sources is studied, and a cardiovascular knowledge graph is constructed. To address the knowledge overlap and knowledge fragmentation caused by multi-source heterogeneity, an entity alignment method based on retrieval and re-ranking is proposed. The proposed algorithm achieves F1 scores of 83.04% on the self-built dataset and 80.52% on the public dataset. Finally, the cardiovascular knowledge graph is constructed from the knowledge extraction, entity alignment, and knowledge storage modules. (3) A knowledge graph question answering method based on semantic parsing is studied. First, entity recognition in the medical question answering scenario is studied: the embedding layer of ALBERT is improved to combine character and word information through hybrid encoding, and entity linking is performed by word matching. The proposed method achieves an F1 score of 92.97% on the self-built question answering dataset, 1.81% higher than the F1 score of purely character-based encoding. Then, relation-attribute matching is implemented on top of ALBERT, and an adversarial training mechanism is introduced to improve the robustness and generalization ability of the model. The accuracy of this method increases by 3.50% on the self-built dataset and by 3.21% on the public dataset. Finally, semantic parsing of natural-language questions is completed by combining modules such as entity recognition and relation-attribute matching. Based on the above research, this paper constructs a cardiovascular knowledge graph and applies it to an automatic question answering system. |
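
Because every result above is reported as an F1 score, a minimal sketch of how that metric might be computed for extracted relation triples is given here. The function names and the sample triples are hypothetical and are not taken from the paper; the sketch assumes exact-match scoring of (head, relation, tail) triples, which the abstract does not specify.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity); illustrative type


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(predicted: Set[Triple], gold: Set[Triple]) -> Tuple[float, float, float]:
    """Score predicted triples against gold triples by exact match (an assumption)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical example: one of two predicted triples matches the single gold triple.
gold_triples = {("hypertension", "treated_by", "amlodipine")}
predicted_triples = {
    ("hypertension", "treated_by", "amlodipine"),
    ("hypertension", "symptom", "headache"),
}
p, r, f1 = precision_recall_f1(predicted_triples, gold_triples)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.4f}")
```

The harmonic mean penalizes an imbalance between precision and recall more strongly than the arithmetic mean does, which is why a model that over-predicts relations cannot reach a high F1 on recall alone.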