
Research On Automatic Question Answering System In Restricted Domain Based On Chinese Weighted Keywords Tree

Posted on: 2012-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: C J Li
GTID: 2178330332490193
Subject: Computer application technology
Abstract/Summary:
With the growth of the Internet, more and more information is available to people, who want to find what they need quickly and accurately. The question answering system (QAS) emerged against this background. Question answering (QA) is a new information service technology that combines natural language processing (NLP), information retrieval (IR), semantic analysis, and artificial intelligence (AI). Unlike a traditional search engine, a QAS accepts questions posed as natural language sentences. The system analyzes and understands the question and then returns the answer the user wants.

QAS has been widely studied both in China and abroad, and mature systems already exist abroad. In recent years, progress in Chinese information processing and management technology has greatly advanced the development of Chinese automatic question answering systems. However, because of the complexity of the Chinese language and the limits of current processing technology, it is very difficult to make computers fully understand human language, so there is still no mature Chinese QAS. At present, research on and applications of Chinese QAS focus mainly on restricted domains.

Against this background, this thesis proposes a new restricted-domain QAS based on a Chinese weighted keywords tree. The main research content covers keyword extraction and weighting, and the construction of and inference over the keywords tree. The main innovations are as follows:

(1) The thesis analyzes keyword attributes in detail and divides them into two kinds: word main body attributes and word phrase attributes. Seven main attributes are extracted from these two kinds and evaluated, and they serve as the basis for keyword weighting. The weighting indicators are rebuilt using principal component analysis (PCA), and a final weighting score is computed for each keyword. This score is the criterion for evaluating a keyword's importance to a sentence.

(2) The thesis creates its own semantic tree, called the "keywords tree", and uses it to manage domain knowledge. It combines the concept of a class with a tree structure and proposes the keywords tree as a new knowledge storage concept. Based on the features of the domain knowledge, it describes in detail the operations for constructing, storing, and updating the keywords tree.

(3) The thesis places the keywords tree at the center of its sentence similarity algorithm. By computing keyword weights and classes, the system computes sentence similarity, narrows the knowledge search range, and ranks the similar questions.

In short, the thesis combines semantic analysis with statistical analysis and proposes a Chinese QAS based on a keywords tree for restricted domains. Experiments show that the approach effectively improves the efficiency and accuracy of the automatic question answering system. The findings have theoretical significance and practical value for information consulting, e-government, and Coupe.
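The abstract does not give the tree layout or the similarity formula, so the following is only a minimal sketch of how a class-organized keywords tree could narrow the search range and rank stored questions by weighted keyword overlap. Every name in it (KeywordNode, similarity, rank, the class path, the sample weights) is a hypothetical illustration, and the weighted Jaccard measure is a stand-in, not the thesis's algorithm. Keyword weights are assumed to be already computed.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordNode:
    """Hypothetical tree node: a class label, child classes, and stored questions."""
    name: str
    children: dict = field(default_factory=dict)
    # Each entry is (keyword -> weight mapping, answer text).
    questions: list = field(default_factory=list)

    def insert(self, path, keywords, answer):
        """Store a question under the class path, creating nodes as needed."""
        node = self
        for label in path:
            node = node.children.setdefault(label, KeywordNode(label))
        node.questions.append((keywords, answer))

    def find(self, path):
        """Return the node at the class path, or None if the path is unknown."""
        node = self
        for label in path:
            node = node.children.get(label)
            if node is None:
                return None
        return node


def similarity(query, stored):
    """Weighted Jaccard overlap of two keyword -> weight mappings (an assumed measure)."""
    keys = set(query) | set(stored)
    shared = sum(min(query.get(k, 0.0), stored.get(k, 0.0)) for k in keys)
    total = sum(max(query.get(k, 0.0), stored.get(k, 0.0)) for k in keys)
    return shared / total if total else 0.0


def rank(root, path, query):
    """Score only the questions stored under one class, most similar first."""
    node = root.find(path)
    if node is None:
        return []
    scored = [(similarity(query, kw), answer) for kw, answer in node.questions]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Usage with made-up data for an e-government style domain.
root = KeywordNode("root")
root.insert(["tax", "filing"], {"deadline": 0.9, "tax": 0.4}, "A1")
root.insert(["tax", "filing"], {"form": 0.8, "tax": 0.4}, "A2")
results = rank(root, ["tax", "filing"], {"deadline": 0.7, "tax": 0.5})
```

Restricting the comparison to one class node is what shrinks the search range in this sketch: questions filed under other classes are never scored.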
Keywords/Search Tags: QAS, Keywords tree, Multistage knowledge library, Keywords weight