Applications Of Short Text Similarity Assessment In User-interactive Question Answering

Posted on:2011-06-28

Degree:Doctor

Type:Dissertation

Country:China

Candidate:W P Song

GTID:1118360305966707

Subject:Computer system architecture

Abstract/Summary:

With the rapid development of the Internet and the emergence of Web 2.0, Question Answering (QA) has become a new Information Retrieval (IR) technology. Unlike search engines, which return a few relevant documents, QA systems give one or several exact answers to each user question, which is often preferable. However, traditional automatic QA systems suffer from poor answer quality because it is very difficult for a machine to understand a human's question well. To solve this problem, user-interactive QA systems have been developed and have become a very popular Web-based service. Unlike traditional automatic QA systems, which obtain answers entirely automatically, user-interactive QA systems serve as interactive platforms on which users help each other with human-provided answers, overcoming the poor quality of automatic answers.

Short text similarity assessment is very important in user-interactive QA systems because questions and answers are usually short texts. Question and answer processing depends on understanding the semantics of questions and answers and on measuring their similarity. The applications of short text similarity assessment in user-interactive QA systems mainly include frequently asked question (FAQ) answering, question categorization, and answer clustering. This dissertation focuses on these three applications. The research contents and contributions are as follows:

First, a novel question similarity calculation method based on a semantic space is proposed for FAQ answering. A semantic space is first constructed from accumulated questions. Questions are then mapped into the semantic space and represented as vectors, and finally question similarity is calculated from these vectors. Through this semantic mapping, the question representation is semantically enriched. Semantic feature clustering is also used to eliminate redundant information.

Second, an automatic method of question categorization in user-interactive QA systems is proposed. In this method, important words extracted from accumulated questions are selected as features to construct a feature space, and each category is represented as a vector in that space. Each user question is also mapped into the feature space, and the similarity between the question vector and each category vector is calculated. The similarity scores are sorted in descending order, and the top k ranked categories are recommended to the user. Semantic patterns are also used to identify and weight the topic-wise words in each question, which play a more important role than other words in question categorization.

Finally, an answer clustering method is proposed in which all the answers to the same question are grouped into clusters according to their content or meaning. A representative answer is selected for each cluster, so users can quickly get the information in each cluster by reading only the representative answer. The proposed method has two important parts: answer similarity calculation and the clustering algorithm. For answer similarity calculation, a method combining statistical similarity and semantic similarity is adopted. For the clustering algorithm, an incremental algorithm is designed to reduce the time complexity.
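
The top-k category recommendation and the incremental answer clustering described in the abstract can be illustrated with a minimal sketch. The abstract does not give the dissertation's feature weighting, similarity formula, threshold, or rule for choosing representative answers, so everything below is an assumption made for illustration: plain term counts stand in for the feature space, cosine similarity stands in for the combined statistical and semantic similarity, the first answer of each cluster serves as its representative, and the names `vectorize`, `cosine`, `top_k_categories`, and `incremental_cluster` are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Map a short text to a sparse term-count vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dict-like counters."""
    dot = sum(weight * v.get(term, 0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_categories(question, category_vectors, k=3):
    """Score every category against the question, sort descending, keep the top k."""
    q = vectorize(question)
    scored = [(cosine(q, vec), name) for name, vec in category_vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


def incremental_cluster(answers, threshold=0.3):
    """Single-pass clustering (assumed variant of an incremental algorithm).

    Each answer is compared only with the representative of each existing
    cluster. It joins the most similar cluster if the similarity reaches the
    threshold; otherwise it starts a new cluster and becomes its representative.
    """
    clusters = []  # each cluster: {"rep": vector, "members": [answer, ...]}
    for answer in answers:
        vec = vectorize(answer)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["rep"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(answer)
        else:
            clusters.append({"rep": vec, "members": [answer]})
    return clusters
```

In this sketch each new answer is compared with one representative per cluster instead of with every earlier answer, which is one way an incremental scheme can keep the cost below an all-pairs comparison. The threshold value of 0.3 is arbitrary and would need tuning on real data.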

Keywords/Search Tags:

User-interactive Question Answer System, Short Text Similarity, Automatic Question Answering, Question Categorization, Answer Clustering

Related items

1	Research And Implementation On Answer Acquisition For Question Answering Systems
2	Research On Question Text Classification And Answer Extraction Technology In Automatic Question Answering System
3	Question Analysis And Answer Extraction Of Chinese Question Answering System
4	The Research And Design Of Automatic Question Answering System
5	Research On Question-type Sensitive Answer Summarization In Community Question Answering
6	Research On Robot Question Answering System For Freshmen Register In Colleges And Universities
7	Research On Question Correlation And Answer Ranking Based In Question Answering Community
8	Research On Domain-Dependent Automatic Question Answering Method
9	Research On Visual Question Answering Method Based On Answer Mask
10	Study On Best Answer Policies In Community-based Question Answering Services