Text-aided Image Classification: Using Labeled Text From Web To Help Image Classification

Posted on:2011-04-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y Lin

GTID:2178360308452441

Subject:Computer application technology

Abstract/Summary:

As more and more multimedia data become available on the World Wide Web, mining those data is playing an increasingly important role in web applications. Because a large amount of labeled text data exists on the web, and because it is much easier to represent and mine knowledge from text than from multimedia data, researchers want to investigate the interplay between multimedia data and text data, in the hope that it will help us understand multimedia data better. Maximizing the benefit gained from text information has therefore become a crucial problem in multimedia data mining.

In this paper, we address the image classification problem as a gateway to mining across the multimedia data space and the text data space. We solve the problem of image classification with a very limited amount of labeled training images, using an approach we call the text-aided image classifier (TAIC). This problem is important in practice because labeled text data on the web are currently far more plentiful than image data. To solve it, building on the bag-of-words view and the Naive Bayes (NB) classification model, we focus on estimating the image feature distribution of a target concept with the help of abundant labeled text data and image-text co-occurrence data on the web.

Specifically, we extend the traditional NB algorithm with a mapping, which we call "feature mapping", that maps the most discriminative text features found in the labeled text training data into the image feature space. This procedure relies on the abundant image-text co-occurrence data on the web, which act as a bridge connecting text and image knowledge. The essence of our algorithm is to use a text feature distribution, estimated from sufficient labeled text data, to estimate the image feature distribution under the same target concept.

Our empirical results on real-world data sets show that our method makes a good approximation of the image feature distribution when trained with abundant labeled images. When labeled images are very limited, classification performance is greatly improved by using auxiliary labeled text data. Finally, our mixed classification model, which accepts both labeled images and labeled text as training data, achieves better classification performance across training image sets of various sizes. This shows that our method can indeed integrate text and image knowledge and improve image classification performance in a simple yet effective way.
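
The abstract does not give the formulas behind TAIC, so the following is only a minimal sketch of the general idea under stated assumptions, not the thesis's actual implementation. It assumes the feature mapping can be approximated as P(v|c) ≈ Σ_w P(v|w) P(w|c), where P(w|c) comes from labeled text, P(v|w) comes from image-text co-occurrence counts, and the mixed model is a simple linear interpolation with weight `lam`. All function names, the smoothing constant, and the toy counts are hypothetical.

```python
import math

def normalize_rows(counts, alpha=1.0):
    """Turn each row of counts into a Laplace-smoothed probability distribution."""
    rows = []
    for row in counts:
        total = sum(row) + alpha * len(row)
        rows.append([(x + alpha) / total for x in row])
    return rows

def map_text_to_image(p_word_given_class, p_visual_given_word):
    """Assumed feature mapping: P(v|c) ~= sum over words w of P(v|w) * P(w|c)."""
    n_visual = len(p_visual_given_word[0])
    mapped = []
    for p_w in p_word_given_class:
        mapped.append([
            sum(p_w[w] * p_visual_given_word[w][v] for w in range(len(p_w)))
            for v in range(n_visual)
        ])
    return mapped

def mix(mapped, image_counts, lam=0.5):
    """Assumed mixed model: interpolate the mapped estimate with the estimate
    obtained directly from the (few) labeled images."""
    direct = normalize_rows(image_counts)
    return [
        [lam * m + (1.0 - lam) * d for m, d in zip(m_row, d_row)]
        for m_row, d_row in zip(mapped, direct)
    ]

def classify(image_hist, p_visual_given_class, priors):
    """Naive Bayes decision over a bag-of-visual-words histogram."""
    scores = [
        math.log(prior) + sum(n * math.log(p) for n, p in zip(image_hist, p_v))
        for prior, p_v in zip(priors, p_visual_given_class)
    ]
    return scores.index(max(scores))

# Hypothetical toy data: 2 classes, 3 text words, 4 visual words.
text_counts = [[8, 1, 1], [1, 1, 8]]                      # word counts per class
cooc_counts = [[9, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 9]]  # word x visual-word counts
image_counts = [[1, 0, 0, 0], [0, 0, 0, 1]]               # one labeled image per class

p_visual_given_class = mix(
    map_text_to_image(normalize_rows(text_counts), normalize_rows(cooc_counts)),
    image_counts,
)
print(classify([5, 1, 0, 0], p_visual_given_class, [0.5, 0.5]))
```

In this sketch, setting `lam=1.0` would rely on text knowledge alone, while `lam=0.0` would reduce to ordinary Naive Bayes trained only on the labeled images.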

Keywords/Search Tags:

Image classification, co-occurrence, feature mapping

Related items

1	Feature extraction and dimension reduction with applications to classification and the analysis of co-occurrence data
2	SAR Image Surface Features Based On SVM Classification
3	Study Of Improving The Target/Image Classification Performance
4	Research On Bag Of Visual Words Based Image Classification
5	The Ceramic Tile Image Classification Based On Texture Feature
6	Reconfigurable Computing Technology Based Research On Image Recognition And Classification System
7	Research On The Method Of Image Texture Feature Extraction And Classification
8	Image Semantic Representation Based Scene Classification Research
9	Comparison Of Classification Method And Classification Accuracy Assisted With Texture
10	The Research Of Image Retrieval Based On Salient Region And Motif Co-occurrence Matrix Feature