
Knowledge Base Empowered Natural Language Understanding

Posted on: 2020-12-11 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: K Q Luo | Full Text: PDF
GTID: 1368330623963938 | Subject: Computer Science and Technology
Abstract/Summary:
Natural language is an important tool for human information exchange and knowledge preservation, and the most important medium of human-computer interaction. Making machines better understand natural language (NL) has become a central direction of artificial intelligence and a hot research topic in academia. The world contains a massive number of things and relations between them. As structured knowledge on the World Wide Web grows, e.g., in Wikipedia and IMDB, researchers have built large-scale structured databases, called Knowledge Bases (KBs), to store, organize, and maintain these open-domain facts. Knowledge bases use standardized symbols to store tens of millions of entities and more than a billion facts between them. The KB has therefore become an effective carrier of semantic representation, giving rise to a line of KB-based semantic understanding research.

In this dissertation, we use the KB to realize multi-dimensional semantic understanding of natural language texts that describe objective facts. Considering different granularities of semantics, we study the semantic understanding problem at three levels: entity, relation, and sentence. Entities are semantically indivisible elements; multiple entities connected by a relation describe a single fact, while a sentence may contain several relations and thus carries more complex semantics. For understanding at the entity level, we directly map phrases in the text to specific entities in the KB. At the relation level, we represent natural language predicates with structures built from KB predicates. Understanding at the sentence level goes deeper than the relation level, especially for questions, where we aim to retrieve answers automatically by inference over the KB. Different methods are needed for semantic modeling at each level.

For entity understanding, the core is to compute the degree of matching between a phrase and a KB entity. The classical entity linking task has the following characteristics: a large number of candidate entities, ambiguity of phrases, and interdependence among the entities of different phrases. We focus on cross-lingual entity linking for text from web tables. Beyond the features above, two new challenges arise: how to exploit the semi-structure of table texts, and how to bridge the linguistic gap between texts and KBs. We therefore propose a linking model based on neural networks and cross-lingual word vectors, which reduces the information loss caused by translation, learns context and coherence features along table rows and columns, and improves overall linking quality through a joint training framework. Experiments in both monolingual and cross-lingual scenarios show that our model effectively captures the special connections between entities in a table and achieves stable, strong results.

For relation understanding, the core is to describe the semantics of an NL relation with structures in the KB. This task has two characteristics: first, NL predicates are ambiguous; second, unlike entity understanding, there are semantic gaps between NL and KB predicates, so a simple one-to-one mapping is hard to achieve. Accordingly, we model the semantics of a relation at two granularities. The coarse-grained model targets the ambiguity of NL predicates: by constructing a richer hierarchical structure for KB types, we mine the different subject-object type combinations that an NL predicate admits. Experimental results show that this model outperforms the traditional selectional preference model. The fine-grained model aims to express relation semantics precisely with the KB. We describe the NL predicate with a human-understandable graph structure and propose a rule-induction-based inference model that expresses the complex semantics of NL predicates via schema graphs. Applying these structural representations to the knowledge base completion task, we find that our schema graph inference model is not only highly interpretable but also outperforms other rule induction models and the emerging knowledge base embedding models.

For question understanding, we focus on knowledge base question answering, i.e., retrieving the answer entity set of a question from the KB. One or more relations hold between the unknown answer and the other entities in the question, which brings more complex semantics and the following challenges: how to describe such complex semantics, and how to effectively measure the similarity between the question and candidate semantic structures. Semantic matching models based on deep learning have been widely studied, but their target structures are usually limited, so answering complex questions became a bottleneck of previous work. To this end, we propose a deep semantic matching model for answering complex questions, following the graph-based semantic representation used in relation understanding. We first generate candidate query graphs for the question, then encode each complex query structure into a uniform vector representation via deep neural networks, thereby capturing the interactions between the individual semantic components of a complex question. Experiments on multiple QA datasets show that our approach consistently outperforms existing methods on complex questions while staying competitive on simple questions.

In summary, starting from the three granularities of entity, relation, and question, this dissertation studies the problem of semantic understanding and matching between natural language and knowledge bases. For entity understanding, we propose a linking model based on deep neural networks, cross-lingual word vectors, and a joint learning scheme to solve entity linking for tabular texts in cross-lingual scenarios. For the understanding of relations and questions, to keep semantic modeling interpretable, we use subject-object type combinations to describe the ambiguity of a relation, and KB-based graph structures to describe the exact semantics of a relation or question. For question answering, our proposed deep learning model encodes the entire query structure, making better use of feature learning and measuring the match between questions and complex semantic structures more effectively. We hope the work in this dissertation can benefit future research in this field.
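To illustrate the cross-lingual table linking idea, the sketch below scores candidate KB entities for a table cell by combining mention-entity similarity in a shared cross-lingual embedding space with a coherence term over entities already linked in the same row. This is a minimal hand-written scorer under toy vectors; the function name `score_candidate`, the weight `alpha`, and all embedding values are hypothetical, and the dissertation's actual model is a trained neural network.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def score_candidate(mention_vec, entity_vec, row_entity_vecs, alpha=0.7):
    """Combine local mention-entity similarity (cross-lingual embeddings)
    with coherence to entities already linked in the same table row."""
    local = cosine(mention_vec, entity_vec)
    if row_entity_vecs:
        coherence = sum(cosine(entity_vec, v) for v in row_entity_vecs) / len(row_entity_vecs)
    else:
        coherence = 0.0
    return alpha * local + (1 - alpha) * coherence

# Toy 3-d "cross-lingual" vectors: a non-English cell text and two KB candidates.
mention = [0.9, 0.1, 0.0]
cand_a  = [0.8, 0.2, 0.1]    # plausible entity: close to the mention
cand_b  = [0.0, 0.1, 0.9]    # unrelated entity
row_ctx = [[0.7, 0.3, 0.0]]  # entity already linked in the same row

best = max([cand_a, cand_b], key=lambda e: score_candidate(mention, e, row_ctx))
assert best is cand_a
```

Because both the mention and the candidates live in one cross-lingual space, no machine translation step is needed, which is the source of the reduced information loss the model targets.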
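The coarse-grained relation model mines subject-object type combinations for an NL predicate. A purely illustrative version: count the KB type pairs observed with a predicate and use the relative frequencies to rank interpretations. The triples and type names here are invented for the example, and the dissertation additionally builds a richer type hierarchy rather than using flat types.

```python
from collections import Counter

# Toy (subject_type, object_type) observations for the NL predicate "play".
# In the dissertation, types come from an enriched KB type hierarchy.
observations = [
    ("Person", "Instrument"),    # "she plays the violin"
    ("Person", "Instrument"),
    ("Person", "SportsTeam"),    # "he plays for Real Madrid"
    ("Actor",  "FilmCharacter"), # "she plays Hermione"
]

type_pairs = Counter(observations)
total = sum(type_pairs.values())

def preference(subj_type, obj_type):
    """Relative frequency of a type combination for this predicate."""
    return type_pairs[(subj_type, obj_type)] / total

# The dominant reading of "play" here is Person-plays-Instrument.
assert preference("Person", "Instrument") > preference("Actor", "FilmCharacter")
```

Each distinct type pair corresponds to one coarse sense of the ambiguous predicate, which is what lets this representation separate readings that a single selectional-preference distribution would blur together.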
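The fine-grained model expresses an NL predicate as a schema graph over KB predicates, which also supports rule-style inference for KB completion. The fragment below applies one hand-written two-hop rule over a toy triple store; the rule, entities, and predicate names are illustrative stand-ins for the automatically induced schema graphs.

```python
# Toy KB as a set of (subject, predicate, object) triples.
kb = {
    ("Alice", "parent_of", "Bob"),
    ("Bob",   "parent_of", "Carol"),
}

def apply_two_hop_rule(kb, p1, p2, head):
    """If (x, p1, y) and (y, p2, z) hold, infer (x, head, z).
    This mimics defining an NL predicate by a two-edge schema graph."""
    inferred = set()
    for (x, q1, y) in kb:
        if q1 != p1:
            continue
        for (y2, q2, z) in kb:
            if q2 == p2 and y2 == y:
                inferred.add((x, head, z))
    return inferred

new_facts = apply_two_hop_rule(kb, "parent_of", "parent_of", "grandparent_of")
assert ("Alice", "grandparent_of", "Carol") in new_facts
```

Because the inferred fact is justified by an explicit chain of KB edges, every prediction comes with a human-readable explanation, which is the interpretability advantage claimed over embedding-based completion models.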
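For KB question answering, the approach generates candidate query graphs and encodes each into a vector to compare against the question. A heavily simplified sketch of that matching step: represent the question and each query graph as bags of word vectors, average them, and pick the closest graph. The real model uses trained deep encoders over the full graph structure; the vocabulary, vector values, and graph labels here are toy assumptions.

```python
import math

# Toy word vectors shared by question tokens and query-graph labels (hypothetical values).
vocab = {
    "who":       [0.10, 0.20],
    "directed":  [0.90, 0.10],
    "director":  [0.85, 0.15],
    "starred":   [0.10, 0.90],
    "actor":     [0.15, 0.85],
    "inception": [0.50, 0.50],
}

def encode(tokens):
    """Average word vectors into one representation (stand-in for a neural encoder)."""
    vecs = [vocab[t] for t in tokens]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

question = encode(["who", "directed", "inception"])
candidate_graphs = {
    "Inception -director-> ?x": encode(["inception", "director"]),
    "Inception -actor-> ?x":    encode(["inception", "actor"]),
}
best = max(candidate_graphs, key=lambda g: cosine(question, candidate_graphs[g]))
assert best == "Inception -director-> ?x"
```

Encoding the whole candidate graph into one vector, rather than scoring each relation independently, is what allows the model to capture interactions between the semantic components of a complex question.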
Keywords/Search Tags: knowledge bases, natural language understanding, entity linking, knowledge base completion, question answering, deep learning