Study On Restricted-domain Question Answering System Based On Ontology

Posted on: 2011-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Lu
Full Text: PDF
GTID: 2178360308955609
Subject: Computer Science and Technology
Abstract/Summary:
Since the advent of the Internet, more and more users have turned to it for information. As Internet technology has spread, the volume of available information has grown rapidly, and people rely increasingly on search engines. Search engines have changed everyday life: to find information, a user only needs to enter a few keywords. However, current search engines have several disadvantages: (1) They return a large amount of relevant but imprecise information, so users must spend considerable time finding the answer they need. (2) Retrieval depends on keyword matching, which cannot express what the user really means; there is therefore a real need for users to be able to pose questions in natural language. (3) Retrieval by general search engines does not involve semantic retrieval.
Compared with search engines, question answering (Q&A) systems have two advantages: (1) they allow users to ask questions in natural language; (2) the system returns an accurate answer to the question rather than a list of web pages. There are three types of Q&A systems: chatter bots, Q&A systems based on a knowledge base, and Q&A systems based on the web. Chatter bots adopt a pattern-matching method, whose disadvantage is that it is hard to design such a system for a large-scale knowledge base. Web-based Q&A systems organize web information through web crawling and other technologies, and they are so complex that it is hard for them to focus on semantic understanding. For these reasons, the author studies a restricted-domain Q&A system based on a knowledge base and proposes a method that combines the FAQ method with a Q&A method based on an ontology knowledge base.
The main tasks of the Q&A system described in this thesis are lexical analysis, syntactic analysis, semantic reasoning and related techniques, as follows:
(1) CHMM-based lexical analysis. Building on the ICTCLAS segmentation system, the author completes the following tasks: (a) primary segmentation based on the N-shortest-paths method; (b) unknown word identification based on HMM; (c) POS tagging based on HMM; (d) keyword extraction, taking nouns, verbs, adjectives and adverbs as keywords.
(2) Dependency grammar analysis based on LTP. The author completes the following tasks in this part: (a) dependency grammar analysis based on the GParser method; (b) transformation of natural language questions into the SPARQL query language by pattern matching.
(3) FAQ-based Q&A strategy using semantic similarity with keyword expansion. The author proposes a method that combines word-form similarity, sentence-length similarity, and semantic similarity based on the HIT synonyms library and HowNet to compute the semantic similarity between sentences. For this module, the author builds a FAQ knowledge base for the travel domain.
(4) Q&A strategy based on an ontology knowledge base. Building on previous work, the author constructs an ontology knowledge base for the travel domain and queries it with SPARQL.
The contributions of this thesis are as follows:
(1) Building on existing technologies, the author proposes a strategy that combines the FAQ method with a Q&A method based on an ontology knowledge base, and designs and implements a restricted-domain Q&A system.
(2) For semantic similarity computation, the author proposes a method that combines word-form similarity, sentence-length similarity, and semantic similarity based on the HIT synonyms library and HowNet.
(3) The author proposes a method that uses dependency grammar to transform Chinese natural language questions into the SPARQL query language.
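The combined sentence similarity used for FAQ matching can be illustrated with a minimal sketch. The abstract does not give the formulas or weights, so everything below is an assumption: the word-form and length measures are common textbook definitions, the tiny `SYNONYM_GROUPS` table is a hypothetical stand-in for the HIT synonyms library and HowNet, and the weights in `sentence_similarity` are illustrative only. Sentences are assumed to be already segmented into keyword lists.

```python
from collections import Counter

# Toy synonym groups standing in for a real lexical resource such as the
# HIT synonyms library or HowNet (assumption, for illustration only).
SYNONYM_GROUPS = [
    {"ticket", "fare"},
    {"price", "cost"},
    {"hotel", "inn"},
]


def word_form_similarity(a, b):
    """Share of words the two sentences have in common (multiset overlap)."""
    if not a or not b:
        return 0.0
    common = sum((Counter(a) & Counter(b)).values())
    return 2.0 * common / (len(a) + len(b))


def length_similarity(a, b):
    """1.0 for equal lengths, decreasing as the lengths diverge."""
    if not a and not b:
        return 1.0
    return 1.0 - abs(len(a) - len(b)) / (len(a) + len(b))


def word_similarity(w1, w2):
    """Hypothetical word-level score: 1.0 identical, 0.8 synonyms, else 0.0."""
    if w1 == w2:
        return 1.0
    if any(w1 in group and w2 in group for group in SYNONYM_GROUPS):
        return 0.8
    return 0.0


def semantic_similarity(a, b):
    """Average best-match word similarity, computed in both directions."""
    if not a or not b:
        return 0.0

    def directed(src, dst):
        return sum(max(word_similarity(s, d) for d in dst) for s in src) / len(src)

    return (directed(a, b) + directed(b, a)) / 2.0


def sentence_similarity(a, b, weights=(0.3, 0.1, 0.6)):
    """Weighted sum of the three measures; the weights are assumed values."""
    w_form, w_len, w_sem = weights
    return (w_form * word_form_similarity(a, b)
            + w_len * length_similarity(a, b)
            + w_sem * semantic_similarity(a, b))


if __name__ == "__main__":
    question = ["hotel", "price"]
    faq_entries = [["inn", "cost"], ["train", "schedule"]]
    best = max(faq_entries, key=lambda entry: sentence_similarity(question, entry))
    print(best)
```

In a FAQ module of this kind, the user's question would be scored against every stored question and the answer of the best match returned if its score exceeds a threshold; the threshold and the fallback to the ontology-based strategy are design choices the abstract does not specify.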
Keywords/Search Tags: Information retrieval, Q&A system, Ontology, Restricted Domain, HMM, Semantic Similarity