
Study On Question Classification In Chinese Question Answering System

Posted on: 2012-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Niu
Full Text: PDF
GTID: 2178330332990765
Subject: Computer application technology
Abstract/Summary:
With the rapid development of computer technology and the Internet, people expect to retrieve accurate target information from massive amounts of data efficiently. Compared with traditional keyword-based search engines, a question answering system can better satisfy people's retrieval requirements. As a high-level form of information retrieval, a question answering system accepts Chinese natural language as the query and returns the answer itself directly to the user, which greatly improves users' satisfaction with retrieval and reduces the time cost. Generally, a question answering system contains three parts: question analysis, information retrieval, and answer extraction. Question analysis is the basis of the entire system, and within it the question classification module is the foundation. Classifying questions effectively reduces the space of candidate answers and the time needed to find the accurate answer. At the same time, the type of information contained in a question directly determines the answer extraction strategy, so the results of question classification directly affect the performance and quality of the entire question answering system. Research on question classification is therefore important for improving the performance of question answering systems.

Drawing on a comprehensive review of the theory of and research on question classification, and targeting real Chinese questions in the open domain, this dissertation focuses on question classification. The main research contents are as follows:

1. For question classification based on machine learning, a question must first be represented as structured data that a computer can process. This dissertation adopts the vector space model. To better express Chinese semantic categories, and based on an analysis of Chinese questions, it proposes a new feature extraction method for question classification that integrates several kinds of semantic information: the question word of the question sentence, the main sememes of the core keyword in HowNet, named entities, and singular/plural number are extracted as classification features, and each kind of feature has a corresponding extraction method.

2. Because Chinese natural language is complex and diverse, determining the correct sense of a word becomes especially important when extracting the main sememes of the core keyword from HowNet. This dissertation proposes a word sense disambiguation method based on sememes. The method finds the context of an ambiguous word through dependency relationships and, from the sememe relationships between the senses of the context words and the senses of the ambiguous word, acquires the knowledge that guides disambiguation and completes it. Introducing this method into question classification determines the correct sense of a word well and, to a certain extent, reduces the negative influence of ambiguous words on the classification results.

3. This dissertation designs a number of experiments that demonstrate the validity of the feature extraction strategy and the necessity of introducing the sememe-based word sense disambiguation method. The experiments show that the SVM algorithm is more suitable for the classification features used here: classification precision reaches 92.82% for coarse classes and 84.45% for fine classes, which is better than other similar classification methods. Finally, this dissertation designs and implements a Chinese question classification system...
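The pipeline described above (sememe-based disambiguation of the core keyword, followed by mapping the combined features into a vector space) can be illustrated with a minimal sketch. This is not the dissertation's implementation. `TOY_LEXICON` is a hypothetical stand-in for HowNet, with English tokens used in place of Chinese words. The context words are passed in directly rather than found through dependency parsing, sense selection is simplified to plain sememe overlap, and all function names are assumptions made for illustration. No SVM is trained here; the resulting binary vectors are the kind of input such a classifier would take.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon standing in for HowNet: each word maps to its
# candidate senses, and each sense is a set of sememe labels.
TOY_LEXICON: Dict[str, List[Set[str]]] = {
    "apple": [{"fruit", "food"}, {"company", "institution"}],
    "company": [{"company", "institution"}],
    "headquarters": [{"place", "institution"}],
}


def disambiguate(word: str, context_words: List[str],
                 lexicon: Dict[str, List[Set[str]]]) -> Set[str]:
    """Pick the sense of `word` whose sememes overlap most with the context.

    Assumption: overlap counting replaces the sememe relationships used in
    the dissertation, and context words are supplied by the caller.
    """
    senses = lexicon.get(word, [])
    if not senses:
        return set()
    context_sememes: Set[str] = set()
    for ctx in context_words:
        for sense in lexicon.get(ctx, []):
            context_sememes |= sense
    # On ties, max() keeps the first listed sense.
    return max(senses, key=lambda s: len(s & context_sememes))


def extract_features(question_word: str, core_sememes: Set[str],
                     entity_type: str, number: str) -> Set[str]:
    """Combine the four feature kinds named in the abstract into one set."""
    feats = {
        f"qword={question_word}",
        f"entity={entity_type}",
        f"number={number}",
    }
    feats |= {f"sememe={s}" for s in core_sememes}
    return feats


def vectorize(feature_sets: List[Set[str]]) -> Tuple[List[str], List[List[int]]]:
    """Map feature sets to binary vectors over a shared, sorted vocabulary."""
    vocab = sorted(set().union(*feature_sets))
    vectors = [[1 if f in fs else 0 for f in vocab] for fs in feature_sets]
    return vocab, vectors


if __name__ == "__main__":
    sense = disambiguate("apple", ["company", "headquarters"], TOY_LEXICON)
    feats = extract_features("where", sense, "LOC", "singular")
    vocab, vectors = vectorize([feats])
    print(vocab)
    print(vectors)
```

In this toy example the context words "company" and "headquarters" share sememes only with the second sense of "apple", so that sense supplies the sememe features. A real system would replace the lexicon lookup with HowNet queries and obtain the context from a dependency parser.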
Keywords/Search Tags: question classification, support vector machine, HowNet, dependency relationship