A Study of Chinese Question Classification Based on Transfer Learning

Posted on: 2013-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: H B Lin
GTID: 2218330374965340
Subject: Computer application technology
Abstract/Summary:
Question answering (Q&A) systems are regarded as a new generation of search engine: they can better satisfy a user's query and retrieve the answer the user actually wants. Question classification is a key component of a Q&A system, and its results directly affect the accuracy of answer extraction. A question classification model is generally built from a large labeled training corpus. When models are built for different domains, however, each domain needs its own large labeled corpus, and labeling a corpus is very expensive. Because different domains may be related to one another, this thesis applies the idea of transfer learning to question classification across domains. It studies feature selection for question classification in different domains and a transfer learning method for the question classification model. The main contributions are as follows:

1. Based on the correlation between domains, a question feature space for the different domains is built from the mutual information of features across those domains. First, high-frequency words, interrogative words, subject-verb-object (SVO) components and similar items are selected from the question samples of each domain as candidate feature words, and the candidates with high correlation are kept as the features of the different domains. Finally, the feature values for each question domain are computed with a word semantic similarity method.

2. A question classification method based on feature-mapping transfer learning is proposed. First, the features common to the source domain and the target domain are counted, and similar features across the two domains are mined by word similarity calculation. Second, each feature in the feature vector of every source-domain question sample is replaced by the corresponding common or similar feature of the target domain. Then, a clustering algorithm maps the source-domain question samples to the categories of the target domain. Finally, the classification model is trained with a support vector machine. In an experiment that uses the travel domain as the source domain and the finance domain as the target domain, the classification accuracy in the target domain is greatly improved with the help of the labeled source-domain corpus.

3. A Chinese question classification model based on feature-mapping transfer learning is designed and implemented.
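The feature-mapping step in item 2 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say which word similarity measure or threshold is used, so `toy_similarity`, the `TOY_SIMILARITY` table, the `threshold` value, and the English tokens (standing in for segmented Chinese words) are all assumptions made for illustration. The sketch keeps a source-domain feature if it is also a target-domain feature, replaces it with the most similar target-domain feature if one is similar enough, and drops it otherwise.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Set

# Hypothetical similarity scores; a real system would use a word semantic
# similarity measure, which the abstract does not specify.
TOY_SIMILARITY: Dict[FrozenSet[str], float] = {
    frozenset({"ticket", "fee"}): 0.8,
    frozenset({"hotel", "bank"}): 0.3,
}


def toy_similarity(a: str, b: str) -> float:
    """Return an assumed similarity in [0, 1] for two words."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset({a, b}), 0.0)


def map_feature(
    word: str,
    target_features: Set[str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the thesis
) -> Optional[str]:
    """Map one source-domain feature word into the target feature space."""
    # Common feature: shared by both domains, so keep it unchanged.
    if word in target_features:
        return word
    # Similar feature: pick the closest target word (sorted for determinism).
    best = max(sorted(target_features),
               key=lambda t: similarity(word, t),
               default=None)
    if best is not None and similarity(word, best) >= threshold:
        return best
    # No common or sufficiently similar feature: drop the word.
    return None


def map_question(
    words: List[str],
    target_features: Set[str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Rewrite a source-domain question as a list of target-domain features."""
    mapped = []
    for w in words:
        m = map_feature(w, target_features, similarity, threshold)
        if m is not None:
            mapped.append(m)
    return mapped


# Travel-domain question tokens mapped into a finance-domain feature set.
finance_features = {"how", "much", "fee", "bank"}
travel_question = ["how", "much", "ticket", "hotel"]
mapped_question = map_question(travel_question, finance_features, toy_similarity)
```

In this toy run, "how" and "much" are common features, "ticket" is replaced by "fee", and "hotel" is dropped because its best match falls below the assumed threshold. The mapped samples would then go on to the clustering and support vector machine steps described above, which this sketch does not cover.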
Keywords/Search Tags: Q&A system, Chinese question classification, transfer learning, related feature selection, feature mapping algorithm