Large Scale Image Content Analysis, Retrieval, And Automatic Annotation In Web Environment

Posted on: 2010-10-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C H Wang
Full Text: PDF
GTID: 1118360275455551
Subject: Signal and Information Processing
Abstract/Summary:
With the prevalence of the Internet and digital cameras, there are more and more digital images on the Web. On the one hand, the increasing number of images attracts more and more users; on the other hand, it is not easy for ordinary users to find what they really need in the sea of images. Effective and efficient image retrieval techniques have therefore become an important research direction in both commercial and academic circles. Currently, there are two main image retrieval frameworks: text-based image retrieval (TBIR), which is widely used in commercial image search engines, and content-based image retrieval (CBIR), which has become an active research topic in the academic community. In text-based systems, images are indexed and retrieved using the textual information of Web images, so the quality of the image annotations is one of the most important issues. In content-based image retrieval, images are indexed by their visual content, and one key problem is the semantic gap between low-level visual features and high-level semantic concepts. In this dissertation, we try to fully utilize the rich textual and visual information of Web images to solve the above-mentioned problems in Web image retrieval.
The following key techniques of Web image retrieval are discussed: automatic image annotation, image annotation refinement, reducing the semantic gap in Web image retrieval, and object-based image retrieval. Moreover, to better handle and utilize the large amount of data on the Web, and to make online retrieval more convenient for users, we pay particular attention to the scalability and efficiency of the proposed algorithms and the developed systems. The main contributions of the dissertation are as follows:
1. We present a multi-label sparse coding framework for feature extraction and classification in the context of automatic image annotation. We argue that the semantic similarity of two images with overlapping labels should be measured in a reconstruction-based way rather than in a one-to-one way. Going beyond one-to-one similarity, the semantic similarities of both label vectors and image features are measured by one-to-all ℓ1 sparse reconstruction/coding.
2. We study the problem of large scale automatic image annotation and propose a search-based image annotation framework. Under this framework, an online image annotation service has been deployed to annotate arbitrary images submitted by users in real time. A Web-scale image database is crawled from the Web and used as the training set to annotate an arbitrary image. The combination of efficient search technologies and a Web-scale image set ensures the scalability of the proposed algorithm.
3. We study the problem of image annotation refinement. We formulate the annotation refinement process as a Markov process, and on this basis we explain some existing annotation refinement algorithms. To solve the problems in existing algorithms, we propose a content-based image annotation algorithm. Owing to the effectiveness of the Markov process formulation and the use of the content information of both the query image and the training images, the proposed algorithm resolves the problems in existing algorithms to a large extent.
4. We study the problem of bridging the semantic gap in content-based image retrieval on the Web and propose a ranking-based distance metric learning algorithm. Guided by the rich textual information of Web images, the proposed framework tries to learn a new distance measure in the visual space, which can be used to retrieve more semantically relevant images for any unseen query image. Based on the ranking-based distance metric learning algorithm, we propose a novel framework for large scale content-based image retrieval (CBIR). We also implement a real-time CBIR system on a dataset of 2.4 million Web images.
5. We study the use of multiple-instance semi-supervised learning (MISSL) to solve the object-based image retrieval problem. A novel regularization framework for MISSL is presented. Based on this framework, a graph-based multiple-instance learning (GMIL) algorithm is proposed to solve the MISSL problem. Under the proposed framework, GMIL can be reduced to a novel standard MIL algorithm (GMIL-M) and a standard SSL algorithm (GMIL-S). We theoretically prove the existence of a closed-form solution for GMIL-S and the convergence of the iterative solutions for GMIL and GMIL-M. We apply the GMIL algorithm to the object-based image retrieval problem. Experimental results show the superiority of the proposed method.
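The search-based annotation idea in contribution 2 can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the cosine similarity, the similarity-weighted tag voting, the toy three-dimensional features, and the names `cosine_similarity`, `annotate`, and `TOY_DATABASE` are all assumptions made for illustration. A Web-scale system would replace the exhaustive scan with an efficient index.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def annotate(query_feature, database, k=3, num_tags=2):
    """Annotate a query by letting its k most similar images vote for tags.

    database is a list of (feature_vector, tags) pairs standing in for the
    crawled, already-tagged Web images (a hypothetical in-memory stand-in).
    """
    # Score every database image once against the query.
    scored = [
        (cosine_similarity(query_feature, feature), tags)
        for feature, tags in database
    ]
    # Exhaustive search for the k nearest neighbours (illustration only).
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Each neighbour votes for its tags, weighted by its similarity.
    votes = Counter()
    for similarity, tags in scored[:k]:
        for tag in tags:
            votes[tag] += similarity
    return [tag for tag, _ in votes.most_common(num_tags)]


# Toy data: made-up features and tags, not taken from the dissertation.
TOY_DATABASE = [
    ([0.9, 0.1, 0.0], ["beach", "sea"]),
    ([0.8, 0.2, 0.1], ["sea", "sky"]),
    ([0.1, 0.9, 0.2], ["forest", "tree"]),
    ([0.0, 0.8, 0.3], ["tree", "park"]),
]

if __name__ == "__main__":
    print(annotate([0.85, 0.15, 0.05], TOY_DATABASE, k=2, num_tags=1))
```

In this toy run the two nearest neighbours both carry the tag "sea", so it collects the largest vote. The raw candidate tags produced this way are the kind of noisy input that an annotation refinement step, such as the one in contribution 3, would then re-rank.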
Keywords/Search Tags: Image Retrieval, Image Annotation, Image Annotation Refinement, Semantic Gap, Sparse Coding, Distance Metric Learning, Multiple-Instance Learning