| The rise of the mobile internet has led to a huge amount of visual content, such as videos and images, being generated every moment. Understanding this visual content through machine learning is crucial. Traditional machine learning relies on large amounts of labeled data for supervised training. However, in practical applications, manually labeling large amounts of data is time-consuming and labor-intensive, so label-scarce scenarios are more common. In such scenarios, traditional machine learning methods perform poorly or even fail to work because labeled data are lacking. To address this challenge, this dissertation uses labeled data that differ from but are related to the target task as auxiliary data from which knowledge is transferred to compensate for label scarcity. These auxiliary training data differ from the target data either in distribution or in category; the former case corresponds to the domain adaptation task and the latter to a typical zero-shot learning task. This dissertation is dedicated to research on domain adaptation and zero-shot learning for visual content in label-scarce scenarios. The research content of this dissertation mainly includes the following parts: Firstly, this dissertation proposes a suite of domain adaptation methods for visual content from two perspectives: subspace learning and adversarial learning. Subspace learning maps samples from the original space into a subspace shared by the source and target domains, reducing data dimensionality and task-irrelevant information and thereby making the domain distributions easier to align. Based on subspace learning, this dissertation proposes a novel graph structure to preserve the manifold structure of the data and improve the discriminative ability between categories. It also investigates the role of sample reweighting and of aligning first- and second-order statistics in domain adaptation. On the other hand, adversarial learning-based methods train a pair of generators and
discriminators to learn domain-invariant representations. Based on this, the dissertation proposes to learn explicitly transferable representations that fill the gaps between domains, while fixing the original network parameters to avoid degrading the domain adaptability of the representations. In addition, this dissertation improves the existing adversarial domain adaptation paradigm by proposing a novel adversarial domain adaptation method based on the mixup rate of cross-domain mixed representations. The novelty of this method is that previous adversarial methods train the generator so that the discriminator cannot distinguish between the source domain and the target domain, whereas this method trains the generator so that the mixup rate estimator cannot accurately predict the mixup rate of cross-domain mixed representations. Secondly, real-world environments are complex, and traditional domain adaptation methods may fail to adapt to them. Therefore, this dissertation also discusses three special domain adaptation settings and proposes a corresponding method for each. First, in open set domain adaptation, the target domain contains categories that are not present in the source domain, and traditional domain adaptation methods may be biased towards predicting known classes. To address this imbalance, this dissertation proposes a domain adaptation method based on centroid alignment and extreme value theory, which achieves balanced classification performance on known and unknown classes. Second, for the source-free domain adaptation problem, this dissertation proposes a variational model perturbation method, which fixes the parameters of the source domain model and applies slight perturbations to them to adapt to the target domain. These perturbations are learned through variational inference. Finally, this dissertation proposes a fast domain adaptation protocol to deal with the slow inference speed of the domain adaptation
methods on edge devices. Thirdly, this dissertation proposes two zero-shot learning methods based on generative models. Existing generative zero-shot learning methods have two problems. One is that they cannot guarantee generation diversity for similar attributes, and the other is that they cannot ensure that the generated samples are highly related to the real samples and the corresponding semantic descriptions. This dissertation proposes two solutions to address these issues, one based on a semantic invariance constraint and the other introducing visual-semantic bilateral connections. Fourthly, this dissertation extends zero-shot learning to the field of cold-start recommendation. Zero-shot learning and cold-start recommendation originally belong to different fields and have been studied independently. This dissertation reveals that zero-shot learning and cold-start recommendation are actually two different extensions of the same intension. For example, both attempt to predict unseen categories, and both involve two spaces, one for direct feature representation and the other for supplementary semantic description. This dissertation formulates the cold-start recommendation problem as a zero-shot learning problem for the first time, and proposes a low-rank linear autoencoder to solve both problems simultaneously. Finally, this dissertation summarizes the above research content and discusses future research directions. |
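The abstract mentions aligning first- and second-order statistics across domains without giving a formula. As a rough illustration only, the sketch below measures how far two sets of feature vectors are apart in their means (first order) and covariances (second order). The function names and the unweighted sum of the two terms are assumptions made for this example; they are not taken from the dissertation, whose actual objective may differ.

```python
from statistics import fmean


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def covariance(rows):
    """Unbiased sample covariance matrix (requires at least two rows)."""
    mu = mean_vector(rows)
    centered = [[x - m for x, m in zip(row, mu)] for row in rows]
    d, n = len(mu), len(rows)
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def moment_discrepancy(source, target):
    """Hypothetical helper: squared distance between the domain means plus
    the squared Frobenius distance between the domain covariances.
    The equal weighting of the two terms is an assumption of this sketch."""
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    first_order = sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))
    cov_s, cov_t = covariance(source), covariance(target)
    second_order = sum(
        (a - b) ** 2
        for row_s, row_t in zip(cov_s, cov_t)
        for a, b in zip(row_s, row_t)
    )
    return first_order + second_order
```

A domain adaptation method that aligns these statistics would learn a feature mapping that drives such a discrepancy towards zero; the value is exactly zero when the two domains share the same mean and covariance.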