Research on Transfer Learning in Text Classification

Posted on: 2015-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xia
Full Text: PDF
GTID: 2348330518970434
Subject: Computer software and theory
Abstract/Summary:
With the globalization of information technology, the scale of data has grown explosively, and it has become increasingly difficult to handle complex, unstructured mass data. More than eighty percent of all information on the Internet consists of text documents, because documents occupy fewer resources than other kinds of data and are easy to upload and download. How to manage and use these text resources efficiently has become a major problem, and text classification technology arose in response. Text classification is the process of automatically determining the category of a text from its content, according to a specified classification scheme. Traditional text classification algorithms, such as k-nearest neighbor and support vector machines, cannot apply previously learned knowledge to new problems on new data sets: they are based on statistical learning theory and require the training data and the test data to follow the same distribution. For this reason, we apply transfer learning to text classification. Transfer learning is a fundamental concept in cognitive theory: the influence of one learning experience on another. It allows prior learning experience to be used to help learn a new task. Instance-based transfer learning is well suited to text classification systems.
In this paper, we present a new algorithm based on two-stage extraction to address the shortcomings of traditional feature extraction algorithms when they are used in transfer learning; the new algorithm effectively improves the accuracy of feature extraction in transfer learning. We then combine Boosting with a BP (back-propagation) neural network to support instance-level transfer, and build a classification system that combines a large amount of auxiliary data with a small amount of target data. This system addresses two weaknesses of traditional classification systems: the lack of training data and the dependence on a particular data set. Experiments gave good results.
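As an illustration of how boosting can drive instance-level transfer, the sketch below shows one round of instance reweighting in the style of TrAdaBoost, a well-known instance-based transfer method. The abstract does not say which boosting scheme the thesis uses, so this is an assumption and not the author's algorithm. The function name `tradaboost_reweight` and its parameters are hypothetical, and the weak learner (a BP network in the thesis) is left out: the function only takes a list marking which instances the learner misclassified.

```python
import math


def tradaboost_reweight(weights, mistakes, n_source, n_rounds):
    """One round of TrAdaBoost-style reweighting (illustrative assumption).

    weights  -- current instance weights, auxiliary (source) instances first,
                then target instances
    mistakes -- 1 if the weak learner misclassified instance i, else 0
    n_source -- number of auxiliary (source) instances
    n_rounds -- total number of boosting rounds planned
    """
    target_w = weights[n_source:]
    target_m = mistakes[n_source:]

    # Weighted error of the weak learner, measured on target data only.
    eps = sum(w * m for w, m in zip(target_w, target_m)) / sum(target_w)
    # Clamp so the target factor stays finite and below 1.
    eps = min(max(eps, 1e-10), 0.499)

    beta_t = eps / (1.0 - eps)  # target factor, changes every round
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))

    updated = []
    for i, (w, m) in enumerate(zip(weights, mistakes)):
        if i < n_source:
            # Misclassified auxiliary instances look unlike the target
            # distribution, so their weight shrinks.
            updated.append(w * beta ** m)
        else:
            # Misclassified target instances are hard cases, so their
            # weight grows, as in ordinary AdaBoost.
            updated.append(w * beta_t ** (-m))

    total = sum(updated)
    return [w / total for w in updated]


# Two auxiliary instances followed by three target instances.
new_weights = tradaboost_reweight(
    weights=[0.2, 0.2, 0.2, 0.2, 0.2],
    mistakes=[1, 0, 1, 0, 0],
    n_source=2,
    n_rounds=10,
)
print(new_weights)
```

Repeating this update over several rounds gradually suppresses auxiliary instances that disagree with the target data while keeping the ones that help, which is one way a large auxiliary set can compensate for a small target set.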
Keywords/Search Tags: transfer learning, text classification, feature extraction, boosting