
Transfer Learning for Video Annotation

Posted on: 2016-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Feng
Full Text: PDF
GTID: 2308330476954968
Subject: Computer Science and Technology
Abstract/Summary:
Videos are often used to record people's daily lives nowadays, and automatically annotating videos captured by ordinary users has become an important topic in computer vision. Video annotation helps users manage and retrieve their videos more easily, and it can also be applied in intelligent video surveillance.

The content of videos captured by ordinary users varies greatly, and such videos usually contain many occlusions and camera motions. Videos belonging to the same class can therefore differ substantially from one another, which makes content-based video annotation a very challenging problem. To achieve good performance, traditional video annotation methods require a large number of manually labeled training samples. However, collecting and labeling training samples is time-consuming and labor-intensive, and with only a few training samples, traditional annotation methods are unlikely to be robust or to generalize well. Recently, transfer learning methods have been proposed to handle the problem of obtaining a robust model when training samples are insufficient: they leverage data from a related domain to train a classifier for the domain of interest.

In this paper, we mainly discuss how to leverage Web videos (source domain) to annotate consumer videos (target domain). As search engines have become increasingly mature, a large number of videos related to a search keyword can be obtained from the Internet. However, the retrieved videos are often of poor quality: they are either heavily compressed by the Web server or not closely related to the search keyword. As a result, Web videos differ considerably from consumer videos, and a classifier trained directly on Web videos performs poorly on consumer videos. We therefore propose a Multi-Group Adaptation method, which divides the Web videos into semantic groups and assigns a different weight to each group, reducing the negative effect of uncorrelated Web videos during training. When there are multiple classes, the one-vs-the-rest strategy leads to unbalanced training data and inconsistent output scales, so we also propose a Multi-class Domain Adaptation method that overcomes these problems. Comprehensive experiments demonstrate the effectiveness of leveraging Web videos to annotate events in consumer videos and show that the proposed methods outperform existing methods on this problem.
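The abstract does not give the formal model, but the group-weighting idea behind Multi-Group Adaptation can be illustrated with a minimal sketch. Everything below is an illustrative assumption rather than the thesis's actual formulation: the semantic groups are approximated by k-means clustering of video features, and each group is weighted by how close its mean feature is to the target (consumer-video) domain.

    # Illustrative sketch only -- not the thesis's actual method.
    # Groups of Web (source) videos are formed by clustering, then
    # down-weighted if they look unlike the consumer (target) videos.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import LinearSVC

    def group_weights(source_groups, target_feats):
        # Weight each source group by the distance between its mean
        # feature and the target-domain mean (a crude relatedness proxy).
        target_mean = target_feats.mean(axis=0)
        means = np.stack([g.mean(axis=0) for g in source_groups])
        dists = np.linalg.norm(means - target_mean, axis=1)
        w = np.exp(-dists)          # closer groups get higher weight
        return w / w.sum()

    def train_group_adapted(source_feats, source_labels, target_feats,
                            n_groups=5):
        # 1. Split Web videos into (pseudo-)semantic groups by clustering.
        km = KMeans(n_clusters=n_groups, n_init=10).fit(source_feats)
        groups = [source_feats[km.labels_ == g] for g in range(n_groups)]
        # 2. Compute one weight per group, then map it to each sample.
        w = group_weights(groups, target_feats)
        sample_w = w[km.labels_]
        # 3. Train a classifier with per-sample weights, so uncorrelated
        #    groups contribute less to the decision boundary.
        return LinearSVC().fit(source_feats, source_labels,
                               sample_weight=sample_w)

Down-weighting groups far from the target domain is one simple way to suppress uncorrelated Web videos; the thesis's actual grouping and weighting scheme, and its Multi-class Domain Adaptation formulation, may differ.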
Keywords/Search Tags: event recognition, transfer learning, video annotation, domain adaptation