
The Design And Implementation Of Question Answering System Prototype Based On Hownet In The Restricted Domain

Posted on: 2011-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Yang
GTID: 2178330332471562
Subject: Software engineering
Abstract/Summary:
A question answering (QA) system is a new generation of information retrieval system that provides more accurate search results. It lets users ask questions in natural language and quickly returns accurate, concise answers. A restricted-domain QA system handles only questions within a specific domain or content scope, such as medicine, chemistry, or the business of a particular company.

Based on the characteristics of restricted-domain QA systems, this thesis studies the collection and organization of restricted-domain FAQs, the construction of a domain knowledge base, question processing (including stop-word filtering, candidate question extraction, and answer ranking), question similarity computation and answer extraction, FAQ database updating, and the related theory. It also implements a QA system for freshmen. The main work is as follows:

(1) Domain information is collected in several ways to build question-answer pairs. The Excel file that stores the pairs is converted to XML, and the pairs are indexed with Lucene. Using the knowledge structure of HowNet, its concept description method, and the KDML language, the extracted domain-specific words are described and merged into the HowNet knowledge base.

(2) The user's question is segmented with ICTCLAS and filtered for stop words to form a keyword set. A Naive Bayes classifier applied to this set determines the category of the question, and the Lucene inverted index is used to find the questions in the FAQ database that contain the keywords. For each such question whose category matches that of the user's question, Num j (the similarity between the user's question and the j-th question in the FAQ database) is computed. Finally, the questions with the largest Num j values, the top 50% by default (adjustable), form the candidate question set.

(3) Several question similarity algorithms are introduced and compared. The similarity between the user's question and each question in the candidate set is then computed one by one, and the answer to the most similar question is returned to the user. Finally, user questions that are not covered by the FAQ database are classified with the Naive Bayes algorithm to make updating the FAQ database easier.

(4) Based on the results above, a freshman QA system is implemented.
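The candidate-set selection in step (2) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not give the formula for Num j, so the sketch assumes Num j is the number of keywords shared by the user's question and the j-th FAQ question. A plain dictionary stands in for the Lucene inverted index, the questions are assumed to be already segmented (the role ICTCLAS plays in the thesis), and the category is passed in rather than predicted by Naive Bayes. The stop-word list, the sample FAQ entries, and all function names are hypothetical.

```python
from collections import defaultdict
import math

# Hypothetical stop-word list; the thesis does not list its stop words.
STOP_WORDS = {"the", "a", "is", "to", "do", "i", "how", "where", "what"}


def keywords(tokens):
    """Drop stop words from an already segmented question."""
    return {t.lower() for t in tokens if t.lower() not in STOP_WORDS}


def build_inverted_index(faq):
    """Map each keyword to the ids of the FAQ questions that contain it."""
    index = defaultdict(set)
    for qid, entry in faq.items():
        for kw in keywords(entry["tokens"]):
            index[kw].add(qid)
    return index


def candidate_set(user_tokens, user_category, faq, index, keep_ratio=0.5):
    """Rank same-category FAQ questions by shared keywords; keep the top share."""
    num = defaultdict(int)  # num[j]: assumed definition of Num j (shared keywords)
    for kw in keywords(user_tokens):
        for qid in index.get(kw, ()):
            if faq[qid]["category"] == user_category:
                num[qid] += 1
    # Highest Num j first; ties are broken by question id for determinism.
    ranked = sorted(num, key=lambda qid: (-num[qid], qid))
    return ranked[:math.ceil(len(ranked) * keep_ratio)]


# Hypothetical FAQ entries, already segmented and labelled with a category.
faq = {
    1: {"tokens": ["where", "is", "the", "library"], "category": "campus"},
    2: {"tokens": ["how", "to", "borrow", "library", "books"], "category": "campus"},
    3: {"tokens": ["library", "opening", "hours"], "category": "campus"},
    4: {"tokens": ["how", "to", "pay", "tuition"], "category": "finance"},
    5: {"tokens": ["library", "fees"], "category": "finance"},
}
index = build_inverted_index(faq)
user_tokens = ["how", "do", "i", "borrow", "books", "from", "the", "library"]
candidates = candidate_set(user_tokens, "campus", faq, index)
```

With this sample data, questions 1, 2, and 3 share keywords with the user's question and belong to the same category, and the default ratio of 0.5 keeps the two with the highest counts. The surviving candidates would then be passed to the more expensive similarity computation described in step (3).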
Keywords/Search Tags: QA system, restricted domain, question processing, candidate set, question similarity