
The Research And Implementation Of QA Techniques Based On Forum Data

Posted on: 2008-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: B Luo
Full Text: PDF
GTID: 2178360212968179
Subject: Signal and Information Processing
Abstract/Summary:
Traditional Information Retrieval (IR) techniques, such as Question Answering (QA) and Web search, provide plausible ways to find information of interest on the Web. Their performance is unsatisfactory, however, when people want to find answers to specific questions. In this thesis, we try a new approach: using already-answered questions in forums to answer new questions. By collecting and processing forum resources and extracting answers from the various replies to each question, we build a structured question/answer database. A newly posed question can then be answered by mapping it to a similar existing question and retrieving the corresponding extracted answers. The proposed QA technique uses a search engine as the platform for indexing a very large amount of forum data and mapping questions efficiently, and it has the advantage of avoiding the technical difficulties and low accuracy of traditional QA systems. One of the core issues of the forum-based QA technique is how to extract high-quality answers effectively from forum threads. To address this problem, we use a CRF model for multi-class classification and a ranking SVM model to rank all replies. After class definition, feature extraction, data annotation, model training, and experiments, the results show that answer extraction can reach acceptable performance. Finally, we implemented a real forum-based QA system, Fora. Fora is a well-designed and well-constructed system with a complete, extensible architecture and an easy-to-use user interface. We applied specialized optimizations to data crawling, data formatting, answer extraction, question mapping, and UI design, among other areas.
Keywords/Search Tags: Information Retrieval, search, Question Answering, forum, answer extraction
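
The question-mapping step described in the abstract (matching a new question to a similar answered question and returning its extracted answer) can be illustrated with a minimal sketch. This is not the thesis's implementation: the thesis relies on a search engine for indexing and mapping, whereas this sketch substitutes a simple bag-of-words cosine similarity. The database contents, the function names (`tokenize`, `cosine`, `answer`), and the `threshold` value are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy database: answered forum questions mapped to one extracted answer.
QA_DATABASE = {
    "How do I reset my router password?":
        "Hold the reset button for ten seconds, then log in with the default credentials.",
    "Why does my laptop battery drain so fast?":
        "Check for background processes and lower the screen brightness.",
    "How can I recover deleted files on Windows?":
        "Look in the Recycle Bin first, then try a file recovery tool.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def answer(new_question, database, threshold=0.3):
    """Map a new question to the most similar stored question.

    Returns (matched_question, answer, score), or None when no stored
    question is similar enough. The threshold is an assumed value.
    """
    query = Counter(tokenize(new_question))
    best_question, best_score = None, 0.0
    for stored_question in database:
        score = cosine(query, Counter(tokenize(stored_question)))
        if score > best_score:
            best_question, best_score = stored_question, score
    if best_question is None or best_score < threshold:
        return None
    return best_question, database[best_question], best_score


if __name__ == "__main__":
    print(answer("how to reset router password", QA_DATABASE))
    print(answer("What is the capital of France?", QA_DATABASE))
```

In a system of the kind the abstract describes, the linear scan over stored questions would be replaced by a search-engine index, and the single stored answer by the replies selected in the answer-extraction step.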