Research On Intelligent Question Answering System Based On COVID-19 Knowledge Graph

Posted on: 2022-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Ren
Full Text: PDF
GTID: 2504306515472934
Subject: Computer technology
Abstract/Summary:
COVID-19 has spread around the world. Because it is highly contagious, the number of infections worldwide continues to rise. The general public is increasingly interested in learning about COVID-19, and demand is growing in particular for knowledge about symptoms and treatment, for example through self-study of COVID-19 medical knowledge via intelligent question answering, online assisted consultation, and similar methods.

The COVID-19 knowledge graph was built on several open, public-domain medical knowledge graphs. COVID-19-related entries from an encyclopedia website served as the main knowledge source, supplemented by the COVID-19 Diagnosis and Treatment Program and COVID-19 Epidemiology 110 Questions published on the official website of the National Health and Family Planning Commission. To ensure effective knowledge fusion, rules and entity alignment methods were used to construct a database of medical synonym entities after knowledge had been acquired from the multi-source data; on this basis, entity mapping was used to fuse the multi-source knowledge bases. Because the entities in the COVID-19 knowledge graph are relatively complex and the volume of associated data is large, the Neo4j graph database is used for storage. Neo4j can also visualize entities and relationships as an association network.

With the COVID-19 knowledge graph as its data source, an intelligent question-answering system was designed and developed in Python. The system works as follows. After a common COVID-19 question is entered, the system first segments it with the bidirectional maximum matching algorithm, extracts keywords from the segmentation result, and classifies the question type so that the matching category of Cypher query template can be selected at a later stage. Next, medical entities are identified with the BERT-BiLSTM-CRF model, and the LTP-Parser tool analyzes the dependency syntax of the question to obtain the relationships between the words and the entities in the sentence, from which the question triple is generated. The question triple is then matched with the corresponding Cypher query template to generate a Cypher query statement, and executing that query against the knowledge graph yields the answer triple. Finally, depending on the category of the answer triple, the system refines its wording according to Chinese grammar rules and returns a simple, understandable natural-language answer to the user.

The main research contents and innovations of this thesis are as follows:

(1) Construction of an intelligent question-answering system based on a COVID-19 knowledge graph. To provide real-time consultation services for COVID-19, this study used an encyclopedia website as the main knowledge source, supplemented by the COVID-19 Diagnosis and Treatment Program and COVID-19 Epidemiology 110 Questions from the official website of the National Health and Family Planning Commission, to form a COVID-19 knowledge graph. Using key technologies such as Chinese word segmentation, named entity recognition, dependency syntax analysis, and the Neo4j graph database, a preliminary intelligent question-answering system based on the COVID-19 knowledge graph was implemented in Python to answer common questions about COVID-19 symptoms, examination, and treatment.

(2) Research on named entity recognition for common Chinese medical questions. Because no tagged corpus currently exists for common Chinese medical questions, this study built one manually to lay a foundation for knowledge-graph question-answering systems in the medical field. The BERT-BiLSTM-CRF model is adopted for named entity recognition. By introducing BERT, the model can extract global and local features of the text and generate word vectors with richer meanings. The model also retains the BiLSTM network's ability to capture contextual semantic information and the CRF layer's correction of labeling bias. The experimental results show that the BERT-BiLSTM-CRF model performs substantially better than the traditional BiLSTM-CRF model. Under the BIOE labeling scheme, it recognizes entities well, with precision (P), recall (R), and F1-score (F1) reaching 98%, 97%, and 97%, respectively.
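The segmentation step mentioned in the abstract can be illustrated with a minimal sketch of bidirectional maximum matching. The abstract does not give the dictionary or the rule used to choose between the forward and backward results, so the toy vocabulary and the tie-breaking heuristic below (fewer tokens, then fewer single-character tokens, then the backward result) are assumptions for illustration only, not the thesis's implementation.

```python
def forward_max_match(text, vocab, max_len):
    """Greedy left-to-right matching, longest dictionary word first."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, vocab, max_len):
    """Greedy right-to-left matching, longest dictionary word first."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


def bidirectional_max_match(text, vocab):
    """Run both directions and keep the result judged better.

    Assumed heuristic: prefer fewer tokens, then fewer single-character
    tokens, and fall back to the backward result on a full tie.
    """
    max_len = max((len(word) for word in vocab), default=1)
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd

    def singles(tokens):
        return sum(len(t) == 1 for t in tokens)

    return fwd if singles(fwd) < singles(bwd) else bwd


# Hypothetical toy dictionary; a real system would load a medical lexicon.
vocab = {"新冠肺炎", "新冠", "肺炎", "症状", "哪些"}
print(bidirectional_max_match("新冠肺炎的症状有哪些", vocab))
# ['新冠肺炎', '的', '症状', '有', '哪些']
```

Running both directions matters when the two greedy passes disagree: for a text where the forward pass consumes a long word and strands a single character, the backward pass often produces the more natural split, and the heuristic selects it.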
Keywords/Search Tags: COVID-19, Knowledge Graph, Named Entity Recognition, Deep Learning, Intelligent Question-and-Answer, BERT-BiLSTM-CRF Model
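For the named entity recognition results, precision, recall, and F1 under a BIOE labeling scheme are commonly computed at the entity level: a predicted entity counts as correct only if its type and both boundaries match the gold annotation. The abstract does not describe the scoring procedure, so the following is a sketch under that assumption. The entity labels `DIS` and `SYM` are hypothetical, and treating an entity that is not closed by an `E-` tag as ending at its last consistent tag is a lenient choice made here for illustration.

```python
def extract_entities(tags):
    """Return a set of (label, start, end) spans (end inclusive) from BIOE tags."""
    spans, start, kind = set(), None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B":
            if start is not None:            # previous entity never saw its E tag
                spans.add((kind, start, i - 1))
            start, kind = i, label
        elif prefix == "I" and start is not None and label == kind:
            continue                         # still inside the current entity
        elif prefix == "E" and start is not None and label == kind:
            spans.add((kind, start, i))      # well-formed end of the entity
            start, kind = None, None
        else:                                # "O" or a tag inconsistent with the open entity
            if start is not None:
                spans.add((kind, start, i - 1))
            start, kind = None, None
    if start is not None:                    # entity still open at end of sequence
        spans.add((kind, start, len(tags) - 1))
    return spans


def precision_recall_f1(gold_tags, pred_tags):
    """Entity-level scores: exact match on label and both boundaries."""
    gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
    correct = len(gold & pred)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Character-level tags for the question "新冠肺炎会发烧吗" (hypothetical labels).
gold = ["B-DIS", "I-DIS", "I-DIS", "E-DIS", "O", "B-SYM", "E-SYM", "O"]
pred = ["B-DIS", "I-DIS", "I-DIS", "E-DIS", "O", "O", "O", "O"]
p, r, f1 = precision_recall_f1(gold, pred)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
# P=1.00 R=0.50 F1=0.67
```

In the example the model finds the disease mention but misses the symptom mention, so every prediction is correct (precision 1.00) while only half of the gold entities are recovered (recall 0.50).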