
Methods Of Phrase-based Question Understanding And Answer Generation

Posted on: 2021-05-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Wu
Full Text: PDF
GTID: 1368330647959131
Subject: Computational Mathematics
Abstract/Summary:
Intelligent question answering systems have evolved from relying on rules to understanding semantics, and their performance has improved steadily. With the advent of the Web 2.0 era, community question answering (CQA) systems have gradually become a primary way for people to acquire knowledge. In CQA, users play an important role in sharing and managing knowledge, and researchers have begun to study the importance of users in question answering systems. The most significant challenge in CQA is guaranteeing that questions are answered quickly, so researchers have tried to find relevant answers automatically when an asker posts a question. This task has three parts: understanding the semantics of natural language questions, finding the most relevant answer among existing answers, and generating new natural language answers from knowledge.

This thesis uses phrases to represent questions and their similarity relationships in order to address answer selection and answer generation in CQA. We introduce syntactic parsing into phrase mining, which improves its performance. We extend word embedding to phrase embedding, use a neural network to learn vector representations of phrases, and use vector distance to represent their semantic similarity. Building on phrase embedding, the thesis proposes two CQA methods: a community answer selection method based on a heterogeneous information network, and a community answer generation method based on a knowledge graph. The research contents are as follows:

(1) Parsing-based quality phrase mining. Most phrase mining methods use N-grams to obtain candidate phrases, which expands the search space and introduces meaningless candidates. To address this problem, the thesis applies syntactic parsing to phrase mining. First, we use parse trees to represent the structure of sentences. Then we traverse the trees to retrieve word sequences with independent semantics as candidate phrases, which improves the quality of the candidates. Next, we propose an index for evaluating the significance of phrases and combine it with existing indicators to assess the candidate phrases comprehensively. Finally, we further improve phrase quality through syntactic disambiguation. Experimental results show that, compared with the most advanced phrase mining methods, parsing-based quality phrase mining improves precision by 6% and the candidate phrase conversion rate by 7%.

(2) Distributed phrase embedding based on neural networks. Semantic similarity has long been an important topic in natural language understanding. The thesis proposes three Phrase2Vec methods that embed phrases into a vector space. They use vector distance to represent semantic similarity and improve the accuracy of semantic text representation. We also apply them to text classification and text clustering. Experiments show that Phrase2Vec outperforms baseline methods by about 1% on the word similarity task and about 5% on the phrase similarity task.

(3) Community answer selection based on a phrase-fused heterogeneous information network. Finding matching answers among complex entity relationships is a critical challenge in CQA. To address this problem, the thesis uses a heterogeneous information network to represent complex entity relationships and fuses in a phrase information network to capture question semantics. We then transform the community answer selection task into a shortest-path problem on the heterogeneous information network, and propose a type-constrained top-k similar entity query method to solve it. Experiments show that, compared with the best existing answer selection methods, our approach improves precision by 3% and reduces the mean absolute error by 1%.

(4) Community answer generation based on the knowledge graph. Most community answer generation algorithms ignore the importance of user background. To address this problem, the thesis uses a phrase-fused heterogeneous information network to represent entity relationships in CQA, then models each user's background knowledge from post records. Finally, we combine the question semantics and the user background to query related knowledge, and the extracted entities are converted into natural language answers. Experiments show that our approach outperforms strong baseline methods by 3% in precision on the answer generation task.
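As an illustration of the contrast drawn in item (1), the following minimal Python sketch compares N-gram candidate enumeration with parse-tree traversal. It is not the thesis's implementation: the toy tree is hand-written rather than produced by a parser, the choice of "NP" as the phrase label is an assumption, and the names `leaves`, `candidate_phrases`, and `ngrams` are hypothetical.

```python
from typing import List

# Toy constituency tree, hand-written for illustration (an assumption,
# not parser output). Internal node: (label, child, child, ...).
# Leaf: (part-of-speech tag, word).
tree = (
    "S",
    ("NP",
     ("NN", "community"), ("NN", "question"),
     ("NN", "answering"), ("NNS", "systems")),
    ("VP",
     ("VBP", "share"),
     ("NP", ("NN", "knowledge"))),
)


def is_leaf(node: tuple) -> bool:
    """A leaf is a (tag, word) pair whose second element is a string."""
    return len(node) == 2 and isinstance(node[1], str)


def leaves(node: tuple) -> List[str]:
    """Return the words under a node, left to right."""
    if is_leaf(node):
        return [node[1]]
    words: List[str] = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def candidate_phrases(node: tuple, phrase_labels=("NP",)) -> List[str]:
    """Collect multi-word spans rooted at a node labelled as a phrase."""
    if is_leaf(node):
        return []
    found: List[str] = []
    if node[0] in phrase_labels:
        words = leaves(node)
        if len(words) > 1:  # single words are not phrase candidates here
            found.append(" ".join(words))
    for child in node[1:]:
        found.extend(candidate_phrases(child, phrase_labels))
    return found


def ngrams(words: List[str], max_n: int) -> List[str]:
    """Enumerate every contiguous span of 2..max_n words."""
    return [
        " ".join(words[i:i + n])
        for n in range(2, max_n + 1)
        for i in range(len(words) - n + 1)
    ]


sentence = leaves(tree)
print("parse-tree candidates:", candidate_phrases(tree))
print("n-gram candidates:", len(ngrams(sentence, 4)))
```

On this toy sentence the traversal yields a single candidate, "community question answering systems", while enumerating all 2- to 4-grams yields twelve spans, most of which (for example "systems share") cross constituent boundaries.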
Keywords/Search Tags:Phrase mining, Phrase embedding, Community answer selection, Answer generation, Entity correlation, Knowledge graph