
Heterogeneous Multi-source Domain Adaptation For Event Recognition In Videos

Posted on: 2019-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Yao
Full Text: PDF
GTID: 2518306473454094
Subject: Computer Science and Technology
Abstract/Summary:
Event recognition in videos has attracted much attention from researchers and plays an important role in computer vision. Consumer videos are usually captured under unconstrained conditions, and their contents are complex and diverse owing to camera motion, background clutter, and other factors, which makes event recognition in videos a very challenging task. Annotating a large number of videos as training samples is time-consuming and labor-intensive, and traditional recognition methods cannot train a robust classifier from insufficient labeled training samples. In recent years, transfer learning has been proposed to overcome the lack of labeled training samples; its core idea is to leverage labeled samples from related domains to help learn a classifier for the consumer video domain.

In this thesis, we take a large number of loosely labeled Web images and videos as heterogeneous source domains to conduct event recognition in consumer videos, which are regarded as the target domain. We propose a heterogeneous multi-source domain adaptation method that partitions the source domains into several semantic groups and weights each group according to the relevance between its distribution and that of the target domain, in order to reduce the negative effect of uncorrelated groups when learning the target classifier. To learn a robust target classifier, a manifold regularization term is introduced into the objective function to enforce smoothness of the classifier on the target videos. The objective function is solved using standard quadratic programming and support vector regression solvers. The proposed method achieves high event recognition accuracy on both the CCV (Columbia Consumer Video) and MED (Multimedia Event Detection) datasets.

To improve the transferability of heterogeneous features across the source and target domains, we further propose a deep adaptation residual network that learns a common feature space and obtains domain-invariant feature representations for both domains. The method exploits the transferability of deep neural networks and reduces the distribution discrepancy between source and target domains by adding MK-MMD (Multi-Kernel Maximum Mean Discrepancy) losses to the higher layers of the network. To further improve the transferability of those higher layers and ease the training process, a residual structure is adopted in the network. Experiments show that the network improves the transferability of features across the source and target domains and achieves good event recognition results in videos.
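Both contributions rest on the same statistical primitive: the (multi-kernel) Maximum Mean Discrepancy between a source distribution and the target distribution, used once as a relevance score for weighting semantic groups and once as a training loss. The sketch below illustrates that primitive on NumPy arrays of feature vectors; the function names, the softmax-style weighting scheme, and the kernel bandwidths are illustrative assumptions for exposition, not the thesis's actual implementation.

```python
# Illustrative sketch (not the thesis's code): multi-kernel MMD and
# MMD-based weighting of source groups by relevance to the target domain.
import numpy as np


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (np.sum(x ** 2, axis=1)[:, None]
                + np.sum(y ** 2, axis=1)[None, :]
                - 2.0 * x @ y.T)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def mk_mmd2(source, target, sigmas=(0.5, 1.0, 2.0)):
    """Squared multi-kernel MMD: average the (biased) squared MMD estimate
    over several Gaussian kernels with different bandwidths."""
    total = 0.0
    for sigma in sigmas:
        k_ss = gaussian_kernel(source, source, sigma).mean()
        k_tt = gaussian_kernel(target, target, sigma).mean()
        k_st = gaussian_kernel(source, target, sigma).mean()
        total += k_ss + k_tt - 2.0 * k_st
    return total / len(sigmas)


def group_weights(groups, target, beta=1.0):
    """Assign each source group a weight that decreases with its MK-MMD
    distance to the target, so distribution-relevant groups dominate and
    uncorrelated groups are down-weighted (a hypothetical weighting rule)."""
    dists = np.array([mk_mmd2(g, target) for g in groups])
    w = np.exp(-beta * dists)
    return w / w.sum()
```

As a quick sanity check: if one source group is drawn from the same distribution as the target and another is mean-shifted away from it, the first group should receive the larger weight, mirroring how the method suppresses uncorrelated groups when learning the target classifier. In the deep network, the same `mk_mmd2` quantity (computed on mini-batch activations of the higher layers, with a differentiable framework) would serve as the added adaptation loss.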
Keywords/Search Tags: event recognition, transfer learning, deep learning