Research On Question Feature Model Combining With Ontology In Chinese Question Classification

Posted on: 2011-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z A Pan
GTID: 2178360305971751
Subject: Computer software and theory
Abstract/Summary:
Question answering is a very active branch of natural language processing. Its research goal is to understand users' questions posed in natural language and to automatically return exact answers from large-scale information sources. A question answering system generally includes three modules: question processing, information retrieval, and answer extraction. Within the question processing module, question classification is the core task. Question classification not only effectively reduces the number of candidate answers and improves the accuracy of the returned answer, but also provides the answer type information that determines the answer extraction strategy. The primary task of question classification is the mathematical representation of question data, that is, selecting a suitable question feature model to express question features; a high-performance feature model plays a very important role in the subsequent classification. This paper therefore focuses on a detailed discussion of the question feature model. On the basis of a comprehensive study of the relevant technologies, our work emphasizes the following aspects.
Firstly, ontology is the basis for solving information sharing and exchange at the semantic level in the Semantic Web. Building on intensive research into ontology technology, we use Protégé to build an experimental university domain ontology knowledge base (University Domain Ontology, UDO) as a domain ontology applied to question processing in a Chinese question answering system, in order to explore the role of a domain ontology in such a system. We then use Jena to parse the ontology knowledge base and store it in a MySQL database according to its tree structure, achieving a mapping from OWL to a relational database and laying the foundation for the subsequent research on question processing.
Secondly, after intensive research on information entropy and related algorithms in information theory, we present a new weight computing strategy for the question feature vector. On the basis of the university domain ontology knowledge base, we introduce the concept of information entropy, use an information entropy algorithm to compute feature word weights in the vector space model, and combine it with the ontology conceptual model to build the question feature model.
Finally, the Support Vector Machine (SVM), a machine learning method based on statistical learning theory, has attracted widespread attention for its good classification performance and has produced fruitful research results. It has particular advantages in problems involving small samples, nonlinearity, and high dimensionality. Based on a summary and analysis of the principle of SVM and of its multi-class classification methods, we apply it to Chinese question classification to verify the validity of the proposed question feature model. We then compare experimentally the performance of a variety of multi-class classification algorithms.
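The abstract does not state the entropy weighting formula, so the following is only a minimal sketch of one common choice, not the thesis's actual method. It assumes a feature word's weight is 1 minus its normalized class entropy, so that a word concentrated in one question class weighs close to 1 and a word spread evenly over all classes weighs close to 0. The function names, the `concept_of` dictionary (a stand-in for a lookup in the UDO ontology), and the English toy tokens are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def to_concepts(tokens, concept_of):
    """Replace each token by its ontology concept when one is known (hypothetical lookup)."""
    return [concept_of.get(tok, tok) for tok in tokens]


def entropy_weights(samples, concept_of):
    """Return {term: weight}, where weight = 1 - H(class | term) / log2(number of classes).

    `samples` is a list of (tokens, class_label) pairs. This formula is an
    assumption made for illustration; the abstract does not give one.
    """
    term_class_counts = defaultdict(Counter)
    classes = set()
    for tokens, label in samples:
        classes.add(label)
        for term in to_concepts(tokens, concept_of):
            term_class_counts[term][label] += 1

    # Normalizing by the maximum possible entropy keeps weights in [0, 1].
    max_entropy = math.log2(len(classes)) if len(classes) > 1 else 1.0
    weights = {}
    for term, counts in term_class_counts.items():
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        weights[term] = 1.0 - entropy / max_entropy
    return weights


def question_vector(tokens, weights, concept_of):
    """Build a sparse feature vector: term frequency times entropy weight."""
    tf = Counter(to_concepts(tokens, concept_of))
    return {term: freq * weights.get(term, 0.0) for term, freq in tf.items()}
```

In this sketch, mapping words such as "library" and "dormitory" to a shared concept before counting lets the concept accumulate evidence across questions. The resulting sparse vectors could then be passed to any multi-class SVM implementation.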
Keywords/Search Tags: question answering system, question classification, information entropy, UDO, support vector machine