
Research on Answer Extraction for Web-based Question Answering

Posted on: 2015-07-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Sun
Full Text: PDF
GTID: 1108330485491668
Subject: Computer application technology
Abstract/Summary:
Web-based Question Answering answers questions using search snippets. It can leverage high-quality search engine results and avoids storing a huge number of documents. Answer Extraction generates answers from text and comprises candidate generation and ranking. It is affected by noise in search engine snippets. In this thesis, I explore how to alleviate this problem from the following aspects:

To solve the problem that many occurrences of the correct answer lack evident features, this thesis proposes generating candidate answers from a Passage Graph, in which useful information in other passages can propagate to the current passage and help improve the generation results. Experiments show that using the Passage Graph improves answer extraction results.

To deal with noise in search snippets, this thesis proposes a Pruned Rank Aggregation method to combine the lists produced by different extraction engines, and then employs a rank creation model to generate the final ranking. Experiments show that the ranking results are significantly better than those of state-of-the-art methods.

To address the problem that search snippets are worded differently from the question, this thesis proposes using word embeddings to compute similarities, including a phrase-based text similarity and a semantic similarity between a candidate and the answer type. Experiments show that employing these two similarities effectively improves the ranking results.

To further address the problem that search snippets are expressed differently from the question, this thesis discusses the use of paraphrase generation. It proposes a joint learning method for a dual machine translation system, along with a metric for evaluating paraphrases. Experiments show that employing the proposed paraphrase technique improves the ranking results.

In all, this thesis uses a Passage Graph model to generate candidates, and then employs a Learning to Rank framework to perform candidate pruning and ranking. Further, it employs word embeddings and paraphrasing to improve ranking. It reduces the impact of search snippet noise and enables Answer Extraction for Web-based Question Answering to achieve better results than state-of-the-art methods.
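
The idea of combining the candidate lists of several extraction engines and pruning the result can be illustrated with a minimal sketch. The abstract does not specify how Pruned Rank Aggregation works, so the code below swaps in a plain Borda count followed by top-k truncation as a generic stand-in. The function name `borda_aggregate`, the `keep_top` parameter, and the example candidate lists are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def borda_aggregate(ranked_lists, keep_top=3):
    """Merge several ranked candidate lists with a Borda count, then prune.

    This is a generic rank-aggregation stand-in, not the thesis's method.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # The top of a list of length n earns n points, the last earns 1.
            scores[candidate] += n - position
    # Sort by score (descending); break ties alphabetically for determinism.
    merged = sorted(scores, key=lambda c: (-scores[c], c))
    # Pruning step: keep only the best-scoring candidates.
    return merged[:keep_top]


# Hypothetical outputs of three extraction engines for one question.
engine_lists = [
    ["Paris", "Lyon", "France"],
    ["Paris", "France", "Marseille"],
    ["Lyon", "Paris", "Nice"],
]
top_candidates = borda_aggregate(engine_lists, keep_top=2)
print(top_candidates)
```

In a pipeline like the one the abstract describes, the pruned list would then be handed to a separate ranking model. That second stage is not shown here.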
Keywords/Search Tags: question answering, answer extraction, graphical model, learning to rank, word embedding, paraphrase