Research on Knowledge Transfer in Machine Learning

Posted on: 2011-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: W Liu
GTID: 2178360308465575
Subject: Computer software and theory
Abstract/Summary:
Traditional machine learning is grounded in statistical learning: the task is to learn a classification model from sufficient training data and then use that model to classify test data. This rests on a basic assumption that the training data and test data are drawn from the same distribution and share the same feature space. In practical problems this assumption often does not hold, so the learned model does not apply well to the test data and traditional machine learning techniques become ineffective. The usual remedy is to label a large amount of new training data, but annotating new data is expensive and requires considerable manpower and resources. Conversely, when a large amount of labeled training data drawn from a different distribution is available, discarding it entirely is wasteful. Making reasonable use of such data is the main task of transfer learning.

At present, based on differences in the labeled data of the source and target data sets, transfer learning can be divided into three categories: inductive transfer learning, transductive transfer learning, and unsupervised transfer learning. The first two are currently the most active research areas.
According to what is transferred, transfer learning techniques can be divided into four categories. Instance transfer selects the source instances that help training on the target data set and re-weights them so that they serve as auxiliary training data for the target domain. Feature-representation transfer seeks a "good" feature representation that reduces both the difference between the source and target domains and the error of classification and regression models. Parameter transfer discovers parameters or priors shared between the source-domain and target-domain models, which benefits transfer. Relational-knowledge transfer builds a mapping of relational knowledge between the source and target domains; both are relational domains, and the i.i.d. assumption is relaxed in each.

This thesis focuses on inductive transfer learning. After summarizing the mainstream transfer learning techniques, we propose three algorithms.

Integrated transfer learning based on dynamic dataset regrouping. The large set of old labeled data is first randomly split into equal-sized blocks, and each block is combined with the small set of new labeled data to form a number of reconstructed data sets. A classifier is trained on each reconstructed data set to obtain an integrated classifier, which is then used to update the weight of each instance, finally yielding the final integrated classifier.

Inductive transfer through neural network error and dataset regrouping. A neural network classifier is first trained on the labeled target-domain data set. Each instance of the source-domain data set is then fed to this network, and the weight of each instance is initialized according to its output error.
A classifier is then trained for each new training data set, yielding the final integrated model.

Transfer learning based on vector translation and fuzzy clustering. To make the source data and target data overlap as much as possible in the feature space, we apply a vector translation so that the two coincide as closely as possible. The class centers of the target data set are then used as the cluster centers for fuzzy clustering of the translated source data set, which gives each instance a fuzzy membership in each class to use as its weight. Finally, a classifier is trained.
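The weighting step of the third algorithm can be illustrated with a minimal sketch. The abstract does not say how the translation vector is computed or which membership formula is used, so the code below makes three assumptions: the source data is shifted so that its mean matches the target mean, memberships follow the standard fuzzy c-means formula with fuzzifier `m`, and each source instance is weighted by its membership in its own labeled class. The function names `translate_source`, `fuzzy_memberships`, and `source_weights` are hypothetical and are not taken from the thesis.

```python
import numpy as np

def translate_source(X_src, X_tgt):
    """Shift the source data so its mean coincides with the target mean.
    Assumption: the thesis does not specify the translation rule."""
    return X_src + (X_tgt.mean(axis=0) - X_src.mean(axis=0))

def fuzzy_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships for fixed cluster centers.
    Returns an (n_samples, n_centers) matrix whose rows sum to 1."""
    # Pairwise Euclidean distances between samples and centers.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, eps)  # avoid division by zero at a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def source_weights(X_src, y_src, X_tgt, y_tgt, m=2.0):
    """Weight each source instance by its fuzzy membership in its own class.
    Assumption: every source label also occurs in the target labels."""
    classes = np.unique(y_tgt)
    # Class centers of the labeled target data act as fixed cluster centers.
    centers = np.stack([X_tgt[y_tgt == c].mean(axis=0) for c in classes])
    X_shifted = translate_source(X_src, X_tgt)
    U = fuzzy_memberships(X_shifted, centers, m)
    # Column index of each source instance's own class.
    col = np.searchsorted(classes, y_src)
    return X_shifted, U[np.arange(len(y_src)), col]
```

The returned weights could then be passed as per-sample weights to any classifier that supports them, trained on the translated source data together with the target data. Source instances that lie far from the target center of their own class receive a low weight and so contribute less to training.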
Keywords/Search Tags: distribution, inductive transfer, dataset regroup, neural network error, vector translation