A general framework for classifier adaptation and its applications in multimedia

Posted on: 2010-01-30
Degree: Ph.D.
Type: Thesis
University: Carnegie Mellon University
Candidate: Yang, Jun
Full Text: PDF
GTID: 2448390002982745
Subject: Artificial Intelligence
Abstract/Summary:
For the analysis and retrieval of multimedia data, machine learning techniques have been extensively applied to build models that map feature vectors of the data to semantic labels. Because multimedia data come from a wide variety of domains (e.g., genres, sources), each with distinctive data characteristics, models trained on one domain usually do not generalize well to other domains. For example, the performance of semantic concept detectors trained on news video drops 60-70% when they are applied to documentary video. Meanwhile, it is prohibitively expensive to build new models for each and every domain because of the high cost of labeling training examples. Therefore, techniques for adapting models across different domains are desirable for better performance and reduced human cost.

In this thesis, we investigate a generic adaptation problem in multimedia and other areas: adapting supervised classifiers trained on one or more source domains into a new classifier for a target domain that has only limited labeled examples. The foundation of our work is a general framework for function-level classifier adaptation based on the regularized loss minimization principle. Fundamentally different from existing adaptation techniques, this framework adapts a classifier by directly modifying its decision function rather than re-training over the data in the source domains, making it highly efficient and applicable to any type of classifier. Under this framework, one can derive concrete adaptation algorithms by plugging in any loss and regularization functions, among which we elaborate on adaptive support vector machines (a-SVM) and adaptive kernel logistic regression (a-KLR). We further extend this framework to multi-classifier adaptation, namely adapting multiple existing classifiers into a classifier for the target domain in such a way that the contributions of these existing classifiers are automatically determined.
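The idea of function-level adaptation can be illustrated with a minimal sketch. It is not the thesis's a-SVM or a-KLR formulation: it assumes a linear perturbation added to a black-box source decision function, a hinge loss with an L2 regularizer, and plain subgradient descent. The names `adapt_classifier` and `source_fn`, the hyperparameters, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def adapt_classifier(source_fn, X, y, reg=0.1, lr=0.1, epochs=200):
    """Return source_fn(x) + w.x + b, with (w, b) fit on target examples.

    Assumed objective (illustrative only):
        reg/2 * ||w||^2 + mean(max(0, 1 - y * (source_fn(x) + w.x + b)))
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    # Source scores are computed once; the source training data is never used.
    src = source_fn(X)
    for _ in range(epochs):
        margins = y * (src + X @ w + b)
        active = margins < 1.0  # target examples inside the hinge margin
        grad_w = reg * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b

    def adapted(X_new):
        return source_fn(X_new) + X_new @ w + b

    return adapted


def accuracy(scores, y):
    """Fraction of examples whose score sign matches the +1/-1 label."""
    return float(np.mean(np.where(scores > 0, 1.0, -1.0) == y))


# Toy setup (synthetic, not from the thesis): the source classifier puts its
# boundary at x0 = 0, while the target domain's true boundary is at x0 = 1.
def source_fn(X):
    return X[:, 0]


rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 2))  # limited labeled target examples
y_train = np.where(X_train[:, 0] > 1.0, 1.0, -1.0)
X_test = rng.normal(size=(1000, 2))
y_test = np.where(X_test[:, 0] > 1.0, 1.0, -1.0)

adapted_fn = adapt_classifier(source_fn, X_train, y_train)
source_acc = accuracy(source_fn(X_test), y_test)
adapted_acc = accuracy(adapted_fn(X_test), y_test)
print(f"source accuracy on target:  {source_acc:.3f}")
print(f"adapted accuracy on target: {adapted_acc:.3f}")
```

Because `source_fn` is only ever called, never inspected or re-trained, the same sketch would apply to any source classifier that exposes a real-valued decision score. Swapping the hinge loss for a logistic loss, or the linear perturbation for a kernel expansion, changes the gradient step but not the overall structure.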
We evaluate the proposed approaches in cross-domain semantic concept detection based on TRECVID corpora. The results show that our approaches outperform existing (adaptation and non-adaptation) methods in terms of accuracy and/or efficiency, and that adaptation from multiple classifiers offers further benefits. We also demonstrate the effectiveness of our approaches in adapting classifiers of text documents and of EEG data.

We then focus on improving the cost-efficiency of adaptation by selecting and prioritizing adaptation tasks involving multiple classifiers. We approach this problem by first conducting a comprehensive analysis of the generalizability of concept classifiers, which is related to the cost-efficiency of adapting a classifier. This analysis reveals strong correlations between generalizability and various meta-features of a classifier, ranging from model structure to the distribution of its output. We show that generalizability can be predicted quantitatively from these model meta-features using regression models. Based on the predictions of generalizability, we propose several selective adaptation methods for selecting the classifiers to be adapted and allocating their training examples such that they achieve higher overall post-adaptation performance than equally adapting every classifier.
Keywords/Search Tags: Adaptation, Classifier, Multimedia, Framework, Data, Models, Adapting