Question Similarity Computation And Classification In Community Question Answer

Posted on:2014-02-24

Degree:Master

Type:Thesis

Country:China

Candidate:D P Xiong

Full Text:PDF

GTID:2248330395499948

Subject:Computer application technology

Abstract/Summary:

The development of Internet technology has brought convenience to more and more people’s daily lives, but it has also left them drowning in a sea of information, where it is difficult to find what they need in a timely manner. This is the phenomenon of information overload. With the rapid development of Web 2.0, people hope to exchange knowledge in natural language within a community to obtain the information they need. As a result, a large number of community question answering (CQA) systems have emerged to meet this demand. On the one hand, in a CQA system users can either ask a question and wait for other users to answer it, or directly retrieve previously asked questions and obtain their answers. This is the problem of question similarity calculation. On the other hand, over time a CQA system accumulates a large archive of QA pairs, which must be classified correctly to guarantee the robustness of the system. This is the problem of question classification. The main work of this thesis is as follows.

Firstly, traditional question answering (QA) systems, such as those built for the TREC QA task, only find answers to simple questions directly, involve no user interaction, and do not suffice for real-world questions, whereas community-based QA systems contain large archives of QA pairs that can be reused. We propose a new retrieval framework based on LDA topics to solve the similar-question matching problem, calculating the similarity between questions from three kinds of information: statistical, semantic, and topical. The statistical information is captured by a VSM-based retrieval model, the semantic information by a WordNet-based retrieval model, and the topic information by an LDA-based retrieval model. The overall similarity is a combination of the three similarities.

Secondly, in community-based question answering services, a user who submits a question does not yet know the answer and therefore cannot be sure of the suitable category. The user can post the question without choosing a category, and the question is then classified using its answer once it has been resolved. This prevents users from tagging questions with arbitrary categories, which would throw the classification system into disorder. In addition, when a CQA site becomes unwieldy because newly emerging topics make the existing category settings inappropriate, it needs to change its classification scheme. Because the questions have already been resolved, they can be reclassified using their answers. Question classification is therefore very important for CQA sites. We propose two methods to solve these problems. The first is a general classification model that combines a question classifier and an answer classifier based on surface text. The second uses a mapping function to enrich questions with semantic knowledge from their answers, which tackles data sparseness, and then applies SVM classification.

Finally, experiments are carried out on a real-world annotated data set sampled from Yahoo! Answers, and several evaluation metrics are used to assess the results. The experimental results demonstrate that the proposed methods improve over traditional methods and achieve good results.
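
The combination of the statistical, semantic, and topic similarities described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the three scores are combined, so the weighted linear sum here is an assumption, as are the function names `vsm_similarity` and `combined_similarity` and the default weights. Only the VSM part is computed (cosine similarity over raw term frequencies); the WordNet-based and LDA-based scores are assumed to come from separate models and are passed in as plain numbers.

```python
import math
from collections import Counter


def vsm_similarity(q1: str, q2: str) -> float:
    """Cosine similarity of raw term-frequency vectors (a simple VSM).

    Assumption: whitespace tokenization and no TF-IDF weighting,
    chosen only to keep the sketch short.
    """
    a = Counter(q1.lower().split())
    b = Counter(q2.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(stat: float, sem: float, topic: float,
                        weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted linear combination of three similarity scores.

    Assumption: the thesis may combine the scores differently;
    the equal default weights are hypothetical.
    """
    w_stat, w_sem, w_topic = weights
    return w_stat * stat + w_sem * sem + w_topic * topic


if __name__ == "__main__":
    stat = vsm_similarity("how do I cook rice", "how to cook brown rice")
    # Placeholder semantic and topic scores standing in for
    # WordNet-based and LDA-based models.
    print(combined_similarity(stat, sem=0.7, topic=0.6))
```

In practice the weights would be tuned on held-out question pairs, and each score should be normalized to the same range before being combined.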

Keywords/Search Tags:

Community Question Answer, Question Similarity, Question Classification, Machine Learning

Related items

1	Key Issues Of Question Understanding In Community Question Answering System
2	Question Analysis And Answer Extraction Of Chinese Question Answering System
3	Research On Question-type Sensitive Answer Summarization In Community Question Answering
4	Research And Application Of Question Classification And Answer Evaluation In Community Question Answering System
5	Research And Design Of The Question Answering System Of The Dean Mailbox
6	Research On Question Answering Technology For Answering History Subject Question
7	The Study On Question Retrieval Technology In Community Question Answer System
8	Mutual Promotion Of Question Retrieval And Answer Ranking In Community Question Answering
9	Research On The Re-use Of Community Question Answering Knowledge
10	Answer Selection For Non-factoid Question