Study On Information Retrieval In Chinese QA System

Posted on: 2008-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Liu
Full Text: PDF
GTID: 2178360215990930
Subject: Computer system architecture
Abstract/Summary:
Question answering (QA) is an active research direction in natural language processing (NLP) that combines a wide variety of NLP technologies. Many research institutes are currently investigating English QA, and some mature English QA systems have been widely recognized. Few institutes, however, are doing research on Chinese QA systems, and no Chinese QA system has been proposed up to now. In this thesis, we investigate several technologies for Chinese QA systems.

An NLP-based QA system has five main parts: Question Analysis, Information Retrieval, Information Processing, Answer Extraction, and a Frequently Asked Questions module. Information Retrieval is one of the most important of these modules: its results strongly affect all subsequent processing, including whether the correct answer is found. It is also a central research topic for intelligent consulting systems, man-machine dialogue, and related applications.

In this thesis, we study Information Retrieval in depth, taking into account the characteristics of the Chinese language and techniques from computational linguistics. In practical use of a QA system, the quality of answers is uneven. Traditional information retrieval uses four mathematical models: the Boolean model, the fuzzy logic model, the vector-based model, and the probabilistic model, but none of them takes the quality of the answer into account. We therefore use perplexity, sequential patterns, and lexical collocations to predict document quality with the maximum entropy method. We also show that our quality measure can be successfully incorporated into a translation-model-based retrieval model.

Finally, we test our approach on a collection of question-answer pairs gathered from a community-based question answering service, where people ask and answer questions. Experimental results show that our quality measure yields a significant improvement over traditional information retrieval models.
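The quality-aware ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the feature names, the unigram perplexity, and the choice to add the log quality probability to the retrieval score as a document prior are all assumptions made for illustration. The binary maximum entropy model is written in its equivalent logistic-regression form.

```python
import math


def perplexity(tokens, unigram_probs, floor=1e-6):
    """Unigram perplexity of a token list (illustrative feature).

    Tokens missing from `unigram_probs` receive the `floor` probability.
    """
    if not tokens:
        return float("inf")
    log_sum = sum(math.log(unigram_probs.get(t, floor)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def quality_probability(features, weights, bias=0.0):
    """P(good quality | features) under a binary maximum entropy model.

    For two classes, the maximum entropy model reduces to logistic
    regression. `weights` are hypothetical, already-trained parameters.
    """
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def combined_score(log_relevance, quality_prob, eps=1e-12):
    """Assumed combination: log P(query | answer) + log P(good | answer).

    `log_relevance` stands in for the log score of any retrieval model,
    for example a translation-based one.
    """
    return log_relevance + math.log(max(quality_prob, eps))


def rank(candidates, weights, bias=0.0):
    """Sort (answer_id, log_relevance, features) triples by combined score."""
    scored = [
        (answer_id,
         combined_score(log_rel, quality_probability(feats, weights, bias)))
        for answer_id, log_rel, feats in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With equal retrieval scores, the answer whose features yield the higher quality probability ranks first. In practice the weights would be learned from answers labeled for quality rather than set by hand.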
Keywords/Search Tags: question answering system, information retrieval, document quality, language models, maximum entropy