
Research On Large-scale Image Retrieval

Posted on: 2019-07-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y C Guo
Full Text: PDF
GTID: 1368330590451540
Subject: Software engineering
Abstract/Summary:
Large-scale image retrieval is widely used in real-world applications such as web image retrieval, surveillance image retrieval, and large-scale face recognition. Its key problems are active research topics in the artificial intelligence, machine learning, computer vision, and information retrieval communities. It has attracted considerable attention from both academia and industry and is a research frontier of both theoretical and practical value.

There are two fundamental tasks in large-scale image retrieval: image feature retrieval, which searches images based on the feature similarity between images, and image semantic retrieval, which searches images based on the semantic relatedness between images. They are different views of image retrieval and are complementary to each other.

Computing the similarity between image feature vectors is the most important part of image feature retrieval. However, images are represented by multiple kinds of features, so specific algorithms are required for cross-modality feature matching. Moreover, in real-world settings the retrieval system must handle large-scale databases with considerable noise across complicated and varied scenarios, which calls for an efficient, robust, and general similarity computing framework.

Image semantic retrieval, on the other hand, needs semantic recognition models, such as classifiers, to extract the semantic information in images for retrieval. In real-world applications, however, there are a large number of semantic labels whose data densities differ greatly. As a result, many labels have only a few labeled images, or none at all, for training the recognition models; these are the few-shot and zero-shot learning challenges.

To address these issues, this dissertation makes the following novel contributions:

1. To address the cross-modality retrieval problem caused by the multi-feature property of images, a Collective Matrix Factorization Hashing framework is proposed. Using the known correspondence between features, a common latent space shared by the different feature spaces is constructed. The different features are then mapped into this common space for similarity measurement. Compared with existing frameworks, the proposed framework builds an effective connection between modalities for accurate cross-modality retrieval. It is extended to both unsupervised and supervised settings, and its theoretical properties are given. It achieves state-of-the-art cross-modality retrieval performance.

2. To handle large-scale databases with massive noise under different scenarios, a general and robust vector quantization framework is proposed. It uses vector quantization for fast similarity computation. A novel ℓp,q-norm quantization loss is proposed that is more robust to data noise and generalizes well across different similarity measures. To solve the challenging orthogonality-constrained ℓp,q-norm minimization problem, an efficient iterative algorithm is proposed and a rigorous theoretical proof of its convergence is given. The framework is applied to several well-known vector quantization approaches. Experiments on benchmarks verify its improved robustness and generality.

3. For the zero-shot learning problem that arises when building semantic recognition models for image semantic retrieval, a novel zero-shot learning framework based on sample transfer is proposed. It differs fundamentally from existing work. By exploiting missing labels, label ambiguity, and the relationships among labels, pseudo-labeled samples for the target classes are collected at zero cost. The recognition model for each target class is then constructed from the pseudo-labeled samples. This turns the zero-shot learning problem into a supervised learning problem, which is more flexible. The proposed framework can be applied to all zero-shot settings and achieves superior recognition accuracy.

4. Building on zero-shot learning, a novel learning concept termed cross-class transfer active learning is put forward. It simultaneously transfers knowledge among different classes and selects the most informative samples for expert annotation. This learning strategy reduces labeling effort by more than 70% while achieving the same recognition accuracy, which lays the foundation for large-scale image semantic retrieval.
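The common-latent-space idea in contribution 1 can be illustrated with a minimal sketch. This is not the dissertation's algorithm: it assumes two paired feature matrices, a plain ridge-regularized alternating least squares factorization, linear projections into the latent space, and mean thresholding to obtain binary codes. The function names (`fit_shared_latent`, `encode`, `hamming`) and the parameters `k`, `lam`, and `reg` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def fit_shared_latent(X1, X2, k=8, lam=0.5, reg=1e-2, iters=30, seed=0):
    """Factorize X1 ~ U1 @ V and X2 ~ U2 @ V with one shared latent matrix V.

    X1 has shape (d1, n) and X2 has shape (d2, n); column i of each matrix
    describes the same item in a different feature space (assumed pairing).
    Returns linear projections P1, P2 into the latent space and the
    per-dimension latent mean used as the binarization threshold.
    """
    rng = np.random.default_rng(seed)
    n = X1.shape[1]
    V = rng.standard_normal((k, n))
    I_k = np.eye(k)
    for _ in range(iters):
        # Update both bases with V fixed (ridge-regularized least squares).
        G = np.linalg.inv(V @ V.T + reg * I_k)
        U1 = X1 @ V.T @ G
        U2 = X2 @ V.T @ G
        # Update the shared representation V with both bases fixed.
        A = lam * U1.T @ U1 + (1 - lam) * U2.T @ U2 + reg * I_k
        B = lam * U1.T @ X1 + (1 - lam) * U2.T @ X2
        V = np.linalg.solve(A, B)
    # Ridge regressions that map each feature space into the latent space,
    # so that unseen items from either modality can be encoded.
    P1 = V @ X1.T @ np.linalg.inv(X1 @ X1.T + reg * np.eye(X1.shape[0]))
    P2 = V @ X2.T @ np.linalg.inv(X2 @ X2.T + reg * np.eye(X2.shape[0]))
    mean = V.mean(axis=1, keepdims=True)
    return P1, P2, mean

def encode(P, X, mean):
    """Project features into the latent space and threshold to 0/1 codes."""
    return (P @ X - mean > 0).astype(np.uint8)

def hamming(code, codes):
    """Hamming distance from one code of shape (k,) to each column of codes."""
    return (code[:, None] != codes).sum(axis=0)
```

Under these assumptions, a query from one modality would be encoded with its own projection, for example `encode(P1, x, mean)`, and compared by `hamming` against database codes produced from the other modality with `P2`.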
Keywords/Search Tags: Image Retrieval, Efficiency, Robustness, Cross-modality Retrieval, Zero-shot and Few-shot Learning