Research On The Key Technologies Of Knowledge-based Question Answering

Posted on: 2024-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: S Liu
Full Text: PDF
GTID: 2568307100973179
Subject: Computer technology
Abstract/Summary:
Knowledge-Based Question Answering (KBQA) lets users pose questions in coherent natural language, which captures the deep semantics of a question more completely and allows a single, accurate answer to be returned. However, current KBQA research is still limited by the poor transferability of entity linking models, the low recall of retrieval methods for large-scale knowledge graphs, and error propagation caused by the pipeline architecture. To address these issues, this research uses deep learning to investigate entity linking, large-scale knowledge graph retrieval, and general solution methods for KBQA. The main work is as follows:

1. A model, DME, is proposed that automatically mines domain information in short texts to provide explicit knowledge enhancement for entity linking. Short texts lack contextual information, which drives up model complexity and hurts transferability. DME (Domain information Mining and Explicit expressing) addresses this by automatically mining the domain information in a short text and representing it explicitly. The model uses domain keywords and a Naive Bayes model to mine domain information and enhances the original data accordingly. DME improves training results without changing the model architecture and addresses the poor transferability of current entity linking models. In addition, a negative sampling strategy that considers both the semantic and the morphological information of the text is proposed to address the ineffectiveness of current negative sampling strategies. In the experiments, language models such as BERT are used as baselines and trained on the original data with and without DME enhancement. The results indicate that training on DME-enhanced data improves the accuracy of different models by about 4-9%. Compared with random negative sampling, the proposed strategy reduces time overhead by about 36% and improves accuracy by about 9% when training the BERT model.

2. A KBQA retrieval model for large-scale knowledge graphs based on semantic hashing is proposed. Because large knowledge graphs contain a huge number of entities and relations, existing retrieval methods suffer from a limited search space, low recall, and difficulty in taking semantic information into account. The proposed model consists of an initial recall module based on word-form retrieval and a secondary recall module based on semantic retrieval. The initial recall module performs coarse-grained recall of candidates by word-form matching, and the secondary recall module uses semantic hashing to perform fast semantic-similarity recall over large-scale candidates. This design addresses the inability of current methods to search the whole graph, their low recall, and their difficulty in taking semantic information into account. In the experiments, the proposed model is compared with popularity-based and inverted-index-based retrieval models on recall among the top 100 candidates. The results show that the proposed model improves recall by about 15% and 10% over the best baseline results on the entity linking task and the relation detection task, respectively.

3. A multi-task learning parameter sharing mechanism and a joint loss function with self-learned parameters are proposed for KBQA. Current KBQA systems organize subtasks in a pipeline architecture, which causes error propagation. To address this, the research trains the subtasks jointly in a multi-task learning framework, designing a parameter sharing method that lets features interact efficiently and a joint loss function whose weights are self-learned and so adjust dynamically across training stages. The shared model layer and the joint loss function are designed to adapt the multi-task learning framework to KBQA tasks and to address the error propagation that arises when subtask models are trained separately and cascaded in a pipeline. The experimental results show that, compared with the methods of Wang et al. and Cheng et al., the proposed method improves accuracy by about 2% on the entity mention recognition task and by about 8% on the entity linking and relation detection tasks.
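
As an illustration of the domain-mining idea in item 1, the following is a minimal sketch, not the DME model itself. It assumes a toy labeled corpus, treats every token as a candidate domain keyword, and makes the mined domain explicit by prepending a tag to the short text. The data, the function names and the tag format are all hypothetical; the abstract does not specify how DME selects keywords or expresses the domain.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (short text, domain label).
TRAIN = [
    ("who directed the film inception", "film"),
    ("which actor starred in the movie titanic", "film"),
    ("who sang the song yesterday", "music"),
    ("which band released the album abbey road", "music"),
]

def train_naive_bayes(examples):
    """Collect per-domain word counts, per-domain document counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for text, domain in examples:
        tokens = text.split()
        word_counts[domain].update(tokens)
        doc_counts[domain] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab

def predict_domain(text, model):
    """Return the domain with the highest log posterior (Laplace smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_domain, best_score = None, -math.inf
    for domain in doc_counts:
        score = math.log(doc_counts[domain] / total_docs)
        denom = sum(word_counts[domain].values()) + len(vocab)
        for token in text.split():
            score += math.log((word_counts[domain][token] + 1) / denom)
        if score > best_score:
            best_domain, best_score = domain, score
    return best_domain

def enhance(text, model):
    """Assumed enhancement format: prepend the mined domain as an explicit tag."""
    return f"[{predict_domain(text, model)}] {text}"

model = train_naive_bayes(TRAIN)
print(enhance("who directed titanic", model))  # [film] who directed titanic
```

The enhanced string can then be fed to an unchanged entity linking model, which matches the abstract's point that the enhancement acts on the data rather than on the model architecture.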
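
The two-stage recall in item 2 can be sketched roughly as follows. This is an assumption-laden toy: the entity names and embedding vectors are made up, and sign binarization of those vectors stands in for a learned semantic hashing model, which the abstract does not describe in detail. The point is only the shape of the pipeline: word-form matching first, then ranking by Hamming distance between binary codes.

```python
# Hypothetical toy entities with made-up embedding vectors.
ENTITIES = {
    "Apple Inc.": [0.9, 0.8, -0.7, -0.6],
    "Apple (fruit)": [-0.8, -0.5, 0.9, 0.4],
    "Apple Records": [0.7, -0.6, -0.2, 0.8],
    "Microsoft": [0.8, 0.9, -0.5, -0.7],
}

def tokens(text):
    """Lowercase and split on whitespace, ignoring parentheses."""
    return set(text.lower().replace("(", " ").replace(")", " ").split())

def to_code(vector):
    """Binarize a vector into an integer hash code, one bit per dimension."""
    code = 0
    for value in vector:
        code = (code << 1) | (1 if value > 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")

def initial_recall(mention, entities):
    """Coarse recall: keep entities whose name shares a token with the mention."""
    mention_tokens = tokens(mention)
    return [name for name in entities if mention_tokens & tokens(name)]

def secondary_recall(query_vector, candidates, entities, top_k):
    """Rank candidates by Hamming distance to the query's hash code."""
    query_code = to_code(query_vector)
    ranked = sorted(
        candidates,
        key=lambda name: hamming(query_code, to_code(entities[name])),
    )
    return ranked[:top_k]

# Hypothetical query: the mention "apple" in a question about a company.
candidates = initial_recall("apple", ENTITIES)
result = secondary_recall([0.7, 0.6, -0.4, -0.9], candidates, ENTITIES, top_k=2)
print(result)  # ['Apple Inc.', 'Apple Records']
```

Comparing compact binary codes with XOR and a bit count is what makes the second stage cheap enough to run over a large candidate set.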
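
The abstract does not give the formula of the self-learned joint loss in item 3. One common way to learn task weights, shown here purely as an assumed stand-in, is to attach a learnable scalar s_i to each task and minimize the sum of exp(-s_i) * L_i + s_i. The sketch below holds the task losses fixed (a toy setting, with hypothetical loss values) to show how the weights adapt by gradient descent.

```python
import math

def joint_loss(task_losses, log_weights):
    """Sum over tasks of exp(-s_i) * L_i + s_i, where each s_i is learnable."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_weights))

def weight_gradients(task_losses, log_weights):
    """Gradient of the joint loss with respect to each s_i."""
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_weights)]

def learn_weights(task_losses, steps=2000, lr=0.05):
    """Gradient descent on the s_i with the task losses held fixed."""
    log_weights = [0.0] * len(task_losses)
    for _ in range(steps):
        grads = weight_gradients(task_losses, log_weights)
        log_weights = [s - lr * g for s, g in zip(log_weights, grads)]
    return log_weights

# Hypothetical per-task losses: mention recognition, entity linking, relation detection.
losses = [0.5, 2.0, 1.0]
s = learn_weights(losses)
weights = [math.exp(-value) for value in s]
print([round(w, 3) for w in weights])
```

In this toy setting each s_i settles at log(L_i), so a task with a larger current loss receives a smaller weight. In real training the losses change at every step, which is how such a scheme would rebalance the subtasks across training stages.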
Keywords/Search Tags: deep neural network, knowledge-based question answering, knowledge enhancement, semantic hashing, multi-task learning