Transfer Learning Across Heterogeneous Feature Spaces

Posted on: 2020-06-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y G Yan
Full Text: PDF
GTID: 1368330590961715
Subject: Software engineering
Abstract/Summary:
With the rapid development of information technology, machine learning techniques have achieved great success in many real-world applications. Traditional machine learning assumes that training data and testing data follow the same distribution, and requires sufficient training data to obtain an effective classifier. However, because data collection and annotation are expensive, only a small amount of labeled data may be available for training. To address data scarcity in the domain of interest (i.e., the target domain), transfer learning has been proposed to transfer knowledge extracted from data in a related domain (i.e., the source domain) to the target domain. Since the data distributions of the source and target domains differ, it is non-trivial to leverage training data in the source domain to improve learning performance in the target domain. Knowledge transfer becomes even more challenging when the source and target domains have different feature spaces, a setting referred to as heterogeneous transfer learning. To leverage knowledge in a heterogeneous source domain to boost classification performance in the target domain, this thesis performs a systematic study of heterogeneous transfer learning. The main contributions of the thesis are summarized as follows:

1) We propose an online ensemble transfer learning algorithm, which employs multi-view co-occurrence data to measure the similarity between two heterogeneous data instances, and combines two predictive functions, one from the source domain and one from the target domain. We also theoretically analyze the mistake bounds of the proposed method.

2) We propose an online multi-view augmentation and learning algorithm, which employs multi-view co-occurrence data to augment the representations of data in the two domains, and trains an online multi-view classifier to predict data in both domains. We also theoretically analyze the mistake bounds of the proposed method.

3) We propose a joint learning model based on canonical correlation analysis that finds a domain-invariant feature subspace maximizing the correlation between source data and target data, while simultaneously training a classifier to reduce the empirical loss.

4) We propose a semi-supervised optimal transport model that exploits the geometric information in source data and target data and transports source data into the target domain, so that the target data and the transported source data follow a similar distribution.

5) For the data imbalance problem, we propose an over-sampling method based on optimal transport theory that exploits the geometric information in the distribution of minority-class data. In this way, the synthetic minority data and the real minority data follow a similar distribution.
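
The optimal transport idea behind contributions 4) and 5) can be illustrated with a minimal sketch. This is not the thesis's semi-supervised model: it is plain entropy-regularized optimal transport (Sinkhorn iterations) followed by a barycentric mapping, and it assumes, for illustration only, that source and target points already share one feature space so that a squared Euclidean cost is defined. The function names, the toy points, and the values of `reg` and `n_iters` are all hypothetical choices.

```python
import math

def sinkhorn_plan(cost, a, b, reg=0.1, n_iters=500):
    """Entropy-regularized transport plan between weight vectors a and b."""
    n, m = len(a), len(b)
    # Gibbs kernel: cheap moves get large entries, costly moves small ones.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def barycentric_map(plan, targets):
    """Send each source point to the plan-weighted mean of the target points."""
    dim = len(targets[0])
    mapped = []
    for row in plan:
        mass = sum(row)
        mapped.append([
            sum(row[j] * targets[j][d] for j in range(len(targets))) / mass
            for d in range(dim)
        ])
    return mapped

def sq_dist(x, y):
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

# Toy data (assumption): three source points and three shifted target points.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
target = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]

cost = [[sq_dist(s, t) for t in target] for s in source]
max_cost = max(max(row) for row in cost)
cost = [[c / max_cost for c in row] for row in cost]  # normalize for stability

a = [1.0 / len(source)] * len(source)  # uniform source weights
b = [1.0 / len(target)] * len(target)  # uniform target weights

plan = sinkhorn_plan(cost, a, b)
mapped = barycentric_map(plan, target)
```

After the mapping, every transported source point is a weighted average of target points, so it lies inside the region the target data occupies. In the heterogeneous setting the thesis studies, the two domains do not share a feature space, so a directly computed Euclidean cost like the one above would not be available as written.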
Keywords/Search Tags:Transfer Learning, Heterogeneous Transfer Learning, Online Transfer Learning, Multi-modal Data, Imbalanced Learning