
Design And Implementation Of Question Answering System Based On Document Archive Knowledge Graph

Posted on: 2022-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhang
GTID: 2518306746452014
Subject: Computer technology
Abstract/Summary:
With the rapid development of information technology, a large volume of Internet document archive data with different structures and sources is generated every day. Accordingly, researchers at home and abroad have begun studying intelligent retrieval of document archives. Combining knowledge graph technology with intelligent question answering systems can make information retrieval more accurate and convenient. The purpose of this thesis is to construct a knowledge graph of document archives and, on top of it, to implement an intelligent question answering system for document archive knowledge. Three parts are studied and implemented:

(1) Knowledge graph construction for document archives. The thesis first uses a seven-step approach to build the document archive ontology. It then proposes a joint labeling strategy for unstructured document archive text, to avoid the error propagation that pipelined knowledge extraction is prone to. In addition, a joint knowledge extraction model based on BERT-BiLSTM-CRF is implemented to jointly extract entities and relations from document archives. Finally, Neo4j is used to store the document archive knowledge graph.

(2) A question answering model over the document archive knowledge graph. Because the document archive knowledge graph is sparse and has missing links between entities, a model based on ALBERT and EmbedKGQA is proposed. The model uses ComplEx-based knowledge graph embedding to obtain representations of all entities in the knowledge graph. ALBERT serves as the question embedding module and produces the representations of the question sentences. Finally, given the knowledge graph representation, the question representation and the representation of the question entity, answer selection is performed with the scoring function defined by ComplEx, and the triple with the highest score is selected as the answer to the question. The model effectively mitigates the problems caused by an incomplete knowledge graph while greatly reducing training time.

(3) A question answering system built on the document archive knowledge graph. By analyzing the overall architecture of the system and integrating the knowledge graph with the question answering model, the thesis builds a web-based question answering system that improves usability and human-computer interaction. The system separates the front end from the back end and uses the lightweight framework Flask. In practical use, the system works well and provides accurate answers to users' questions promptly, which meets users' needs for acquiring knowledge about documents and archives.
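The answer selection step in part (2) can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes the standard ComplEx score, Re(sum_k h_k * r_k * conj(t_k)), with the question embedding placed in the relation slot as in EmbedKGQA. The function names, entity names and two-dimensional toy vectors are hypothetical and are not taken from the thesis.

```python
from typing import Dict, Sequence

Vector = Sequence[complex]


def complex_score(head: Vector, relation: Vector, tail: Vector) -> float:
    """Standard ComplEx score: real part of sum_k head_k * relation_k * conj(tail_k)."""
    total = sum(h * r * t.conjugate() for h, r, t in zip(head, relation, tail))
    return total.real


def select_answer(head: Vector, question: Vector, entities: Dict[str, Vector]) -> str:
    """Return the candidate entity whose (head, question, candidate) score is highest.

    Assumption: the question embedding plays the role of the relation embedding.
    """
    return max(entities, key=lambda name: complex_score(head, question, entities[name]))


# Toy embeddings (hypothetical values, chosen only to make the arithmetic easy to follow).
question_entity = [1 + 0j, 0 + 1j]
question_embedding = [0 + 1j, 1 + 0j]
candidates = {
    "doc_a": [0 + 1j, 0 + 1j],
    "doc_b": [1 + 0j, 1 + 0j],
    "doc_c": [0 - 1j, 0 - 1j],
}

best = select_answer(question_entity, question_embedding, candidates)
print(best)
```

In a real system the vectors would come from a trained ComplEx model and from the ALBERT-based question encoder, and the candidates would be all entities in the graph. Scoring every entity rather than only graph neighbours is what lets this style of model return answers even when a connecting edge is missing.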
Keywords/Search Tags: Document archives, Knowledge graph, Question answering system, Joint knowledge extraction, Knowledge graph embedding