Research On Answer Summarization Algorithm In Non-factoid Community Question Answering

Posted on:2018-07-24

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Song

Full Text:PDF

GTID:2348330512990267

Subject:Computer Science and Technology

Abstract/Summary:

In recent years, the number of users of community question-answering (CQA) services has grown rapidly. These services give users a platform to post questions and find answers, and the rich data on CQA platforms has attracted researchers' interest. This thesis focuses on answer summarization in community question answering. While most previous work addresses factoid question answering, we focus on non-factoid question answering. Unlike factoid CQA, in which questions typically ask for an exact result and answers are typically short sentences, non-factoid questions usually ask for opinions and advice, so they require paragraphs and passages as answers. Compared with the traditional multi-document summarization task, which usually summarizes news articles, summarizing answers in non-factoid CQA faces specific challenges: the shortness and sparsity of answer sentences, and the diversity of topics across answers.

To tackle these challenges, we propose a sparse coding-based summarization strategy with three core ingredients: document expansion of short answer sentences, sentence vectorization, and a sparse-coding optimization framework. Specifically, we extend each answer sentence in a question-answering thread to a more comprehensive representation via entity linking and sentence ranking strategies, using Wikipedia as an external resource. From the answers extended in this manner, each sentence is represented as a feature vector trained by a short-text convolutional neural network model. We then use these sentence vector representations to estimate the saliency of candidate sentences via a sparse-coding framework that jointly considers candidate sentences and Wikipedia sentences as reconstruction items. Given the saliency vectors for all candidate sentences, we extract the top-ranked sentences to generate an answer summary using a maximal marginal relevance algorithm.

Our contributions in this thesis can be summarized as follows: we address the task of summarizing answers to non-factoid questions in community question answering by tackling the shortness, sparsity, and diversity challenges of answers. We also evaluate our proposed method for answer summarization in non-factoid CQA on a benchmark data collection, comparing it with several state-of-the-art baselines. The experimental results confirm the effectiveness of the proposed method and show a significant improvement over the state-of-the-art baselines in terms of ROUGE metrics. Deeper analysis of the experimental results also suggests the robustness and scalability of the proposed approach.
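
The final selection step mentioned in the abstract, picking top-ranked sentences with maximal marginal relevance (MMR), can be sketched as follows. This is only an illustrative sketch of the common greedy form of MMR, not the thesis's actual implementation: the function names `cosine` and `mmr_select`, the trade-off parameter `lam`, the use of cosine similarity for redundancy, and the assumption that each sentence already has a scalar saliency score are all hypothetical choices made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(vectors, saliency, k, lam=0.7):
    """Greedily pick up to k sentence indices.

    vectors  -- one feature vector per candidate sentence (assumed precomputed)
    saliency -- one scalar saliency score per candidate sentence (assumed precomputed)
    lam      -- hypothetical trade-off: 1.0 ranks by saliency only,
                smaller values penalize redundancy more strongly
    """
    selected = []
    remaining = list(range(len(vectors)))
    while remaining and len(selected) < k:

        def score(i):
            # Redundancy is the highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * saliency[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: sentences 0 and 1 are near-duplicates, sentence 2 covers another topic.
toy_vectors = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
toy_saliency = [0.9, 0.85, 0.5]
summary_indices = mmr_select(toy_vectors, toy_saliency, k=2, lam=0.5)
```

With the toy data, the second pick skips the near-duplicate sentence in favor of the less salient but non-redundant one, which is the behavior that makes MMR a natural fit for answers covering diverse topics.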

Keywords/Search Tags:

Community question-answering, Sparse coding, Short text processing, Document summarization

Related items

1	Research On The Key Technologies For Web Community Question Answering Retrieval
2	Automatic Summarization Of Multimedia Information And Related Technology Research
3	Research On Tag Generation For Chinese Short Text Based On Community Question Answering System
4	Research On Question-type Sensitive Answer Summarization In Community Question Answering
5	Design And Implementation Of Intelligent Course Question Answering System In MOOC Environment
6	Research On The Re-use Of Community Question Answering Knowledge
7	Research And Application Of Key Technologies Of Community Question Answering
8	Study On Best Answer Policies In Community-based Question Answering Services
9	Information Need Based Answer Summarization For Community Question Answering
10	A Study Of Question Retrieval Technology In The Chinese Community Question Answering Systems