Study On Sentence Type Theory And Its Applications In Chinese Question Answer System

Posted on: 2011-07-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C T Liu
Full Text: PDF
GTID: 1118330338982732
Subject: Control theory and control engineering
Abstract/Summary:
A question answering (QA) system is a kind of information retrieval system that can be queried in natural language and returns knowledge directly. Most QA systems are composed of three main parts: question understanding, information retrieval, and answer extraction.

QA has become a very popular research domain in recent years, covering both English and Chinese QA systems. There are many studies on English QA systems. Because of the characteristics of the Chinese language, Chinese QA systems have properties of their own, which sometimes differ completely from those of QA systems for English and other languages.

Natural language processing (NLP) is the key technology in QA systems. To improve QA systems, NLP technology, especially the semantic analysis of questions and answers, needs to be improved. At present, however, semantic analysis of natural language is still at an early stage, so most QA systems either involve no semantic analysis or rely only on shallow semantic analysis. Raising the level of semantic understanding should therefore be the key to improving QA systems.

The meaning of a phrase or sentence consists mainly of two parts: the meaning of its components and the meaning of its structure. Research on sentence types is very important to linguistics at the level of syntactic structure. The sentences of a language are infinite, but its sentence types are finite. The main goal of sentence type research is to grasp the infinite sentences through the study of the finite sentence types. Since most sentences of the same sentence type usually share the same syntactic structure and the same semantic interpretation, analyzing sentence semantics through sentence types should be a feasible method for most sentences.

For QA systems, the sentence types of interrogative sentences are closely related to their interrogative meanings. By analyzing the sentence types of interrogative sentences, questions can be understood accurately. Once the sentence types of questions and answers have been worked out according to the question classification, answers can be acquired more conveniently and accurately.

Based on the study of the sentence type system, this dissertation studies a Chinese open-domain question answering system named Virtual Information Consultant (VIC). VIC can be queried in Chinese, automatically collects relevant information from the Internet or from documents, and returns answers to users. The major work includes:

1. Put forward a theoretical system for the formal definition of sentence types and a semantic calculation method based on sentence types. Syntactic isomorphism is adopted as the criterion for dividing sentence types, and sentence types are formally defined in the manner of generative grammar. Three modes of describing a sentence type are put forward: a production description, which takes the form of a sentence type tree; a string description, which resembles the sentence pattern descriptions used for natural language; and a vector description. Because of the relationship between the syntactic and semantic structures of isomorphic sentences, most isomorphic sentences are the same in their semantics. By constructing the sentence type system and sentence-type-based rules for semantic calculation, the case roles of a sentence can be calculated from its sentence type.

2. Proposed a sentence type identification method based on the vector space model (VSM). The method identifies the type of a sentence by calculating the similarity between the sentence and the sentence type models, using the characteristic words, the parts of speech, and the word order of the sentence. The first step is to reconstruct sentence type structures from the sentence and the sentence types related to it. The similarity between the sentence types and the sentence type structures is then calculated, and the results are sorted to recognize the sentence type. Tests of the method indicate that it achieves high accuracy when the syntactic analysis is correct and still produces some correct recognitions when the syntactic analysis is wrong. That is, the method recognizes sentence types well and is stable.

3. Proposed a question understanding method based on the sentence type system, and constructed the framework of the Chinese question answering system. The sentence type is a classification of syntactic structures, whereas the question classification is a semantic classification of interrogative sentences. The main idea of question understanding in this dissertation is to use the sentence type to link the semantic classification with the syntactic structure classification. By classifying interrogative phrases and constructing question classification standards and rules, the semantic meaning of a question is computed from its sentence type, and the normative question form is obtained. Chinese question understanding is then implemented using the sentence type recognition technique. Non-interrogative use of WH-phrases is a common phenomenon in Chinese grammar. By studying this usage and the sentence types in which non-interrogative WH-phrases occur, the level of question understanding is improved.

4. Put forward an improved vector space model (VSM) retrieval algorithm that combines the natural document structure of paragraphs, sentences, and words, and designed an information retrieval module suitable for VIC.

5. Implemented the question understanding subsystem and the information retrieval subsystem of VIC, and studied methods and strategies for answer extraction based on sentence types.
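
The sentence type identification in item 2 can be illustrated with a minimal sketch. It is not the dissertation's implementation: the feature scheme (word, part-of-speech, and part-of-speech bigram features), the toy part-of-speech tags, the two example sentence type models, and all function names are assumptions made only for illustration. It shows the general VSM idea of scoring a sentence against each sentence type model by cosine similarity and sorting the results.

```python
import math
from collections import Counter


def features(tokens):
    """Map a list of (word, pos) pairs to a sparse feature vector.

    Hypothetical feature scheme: one feature per word, one per
    part-of-speech tag, and one per adjacent tag pair (word order).
    """
    feats = Counter()
    for word, pos in tokens:
        feats["w:" + word] += 1
        feats["p:" + pos] += 1
    for (_, p1), (_, p2) in zip(tokens, tokens[1:]):
        feats["o:" + p1 + ">" + p2] += 1
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentence_types(tokens, type_models):
    """Return (type name, similarity) pairs, most similar first."""
    query = features(tokens)
    scored = [(name, cosine(query, model)) for name, model in type_models.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy sentence type models, each built from one example sentence.
# Tags are illustrative: r = pronoun, v = verb, n = noun, p = preposition.
TYPE_MODELS = {
    # "谁 是 作者" (who is the author)
    "who_question": features([("谁", "r"), ("是", "v"), ("作者", "n")]),
    # "首都 在 哪里" (where is the capital)
    "where_question": features([("首都", "n"), ("在", "p"), ("哪里", "r")]),
}

if __name__ == "__main__":
    # "谁 是 校长" (who is the principal)
    sentence = [("谁", "r"), ("是", "v"), ("校长", "n")]
    for name, score in rank_sentence_types(sentence, TYPE_MODELS):
        print(name, round(score, 3))
```

In this toy setup the input sentence shares its interrogative word, its tags, and its tag order with the first model, so that model ranks highest. A real system would presumably build the models from the formally defined sentence types rather than from single example sentences.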
Keywords/Search Tags: Chinese Question Answering System, Sentence Type Semantics, Sentence Type Recognition, Normative Question Form, Information Retrieval