
Research On Perception Oriented Image Retrieval And Automatic Image Annotation

Posted on: 2010-07-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S H Feng
Full Text: PDF
GTID: 1118360275963223
Subject: Computer application technology
Abstract/Summary:
With the development of multimedia technology and computer networks, content-based image retrieval (CBIR) systems have become increasingly important for organizing, indexing and retrieving massive amounts of image data in many application domains, and CBIR has emerged as a hot research topic in recent years. The main difficulty of CBIR lies in making computers understand the semantic content of images from the viewpoint of human perception, thereby narrowing the well-known semantic gap between low-level visual features and high-level semantic concepts. The first part of this dissertation focuses on human-perception-oriented image retrieval algorithms, in particular on how to extract semantic information from an image and how to integrate the user's high-level semantics effectively to improve retrieval performance. The second part focuses on automatic image annotation, in particular on how to build an effective machine learning model for the annotation problem and how to make better use of the training samples in order to refine annotation performance.

For region-based image retrieval, the author argues that in most cases the user is interested in only a portion of the image and the rest of the image is irrelevant. To resolve this ambiguity, an image retrieval algorithm based on a fully data-driven selective visual attention model is proposed. First, a saliency map is generated by the attention model, and both salient edges and salient regions are extracted automatically by fusing the edge map and the segmented image with the corresponding saliency map; these salient edges and regions are taken to represent the user's retrieval intention. Then effective feature descriptors are proposed and fused for the final semantic image retrieval.

For image retrieval that combines machine learning with a relevance feedback mechanism, the dissertation focuses on graph-based semi-supervised learning with application to
region-based image retrieval. Different schemes, both of which incorporate region saliency into the graph-based semi-supervised learning framework, are applied to handle two types of feedback. First, when the user's feedback provides no samples or only positive samples, retrieval is treated as a transductive learning problem: a hierarchical graph model that incorporates region saliency information is constructed, and the manifold-ranking algorithm is then used to propagate the positive labels. Second, when the user provides both positive and negative samples, a region-level adjacency matrix is constructed from the feedback samples, and the manifold-ranking algorithm is again used to choose the instances that truly represent the user's query semantics. The selected instances are then used to retrieve the relevant samples.

For automatic image annotation, observing that the annotation problem is ambiguous in both the input space and the output space, the dissertation presents a novel semi-supervised multi-instance multi-label (SSMIML) learning framework, which aims to take full advantage of both labeled and unlabeled data. Specifically, a reinforced diverse density algorithm is first applied to select the instance prototypes (IPs) for a given keyword from both positive and unlabeled bags. The selected IPs are then modeled with a Gaussian mixture model (GMM) in order to reflect the density distribution of the semantic class.
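
The manifold-ranking step used for label propagation can be illustrated with a minimal sketch of the standard iteration f <- alpha * S * f + (1 - alpha) * y, where S is the symmetrically normalized affinity matrix and y marks the positive samples. This is the generic formulation only; it does not reproduce the dissertation's hierarchical, saliency-weighted graph. The function name `manifold_ranking`, the Gaussian affinity, and the parameters `sigma`, `alpha` and `iters` are assumptions made for illustration.

```python
import math


def manifold_ranking(points, positive_idx, sigma=1.0, alpha=0.9, iters=200):
    """Rank all points by relevance to the positive ones (illustrative sketch).

    points       : list of equal-length feature tuples (hypothetical features)
    positive_idx : set of indices marked as positive by the user
    """
    n = len(points)

    # Gaussian affinity between every pair of points, with a zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))

    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in W]
    S = [
        [
            W[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]

    # Initial label vector: 1 for positive samples, 0 for everything else.
    y = [1.0 if i in positive_idx else 0.0 for i in range(n)]

    # Spread the scores along the graph for a fixed number of iterations.
    f = y[:]
    for _ in range(iters):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1.0 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Two well-separated groups; only point 0 is marked positive.
features = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.2)]
scores = manifold_ranking(features, {0})
ranking = sorted(range(len(features)), key=lambda i: -scores[i])
```

In this toy example the points that lie near the positive sample receive higher scores than the distant group, which is the behavior the retrieval schemes rely on when propagating positive labels.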
Furthermore, based on the class distribution for a keyword, both positive and unlabeled bags are re-represented using a novel feature mapping strategy. Each bag is thus represented by one fixed-length feature vector, so that the manifold-ranking algorithm can subsequently propagate the corresponding label directly from positive bags to unlabeled bags.

For image annotation refinement, existing algorithms rarely take into account the fact that the samples relevant to a certain keyword generally differ in their typicality, or relevancy score, with respect to that keyword. Inspired by kernel density estimation, the dissertation proposes a confidence weight computation algorithm that uses a real number in [0, 1] to represent a sample's relevancy score for a certain keyword. Moreover, an improved Citation-kNN multiple-instance learning algorithm is proposed to solve the annotation problem. In contrast with existing annotation algorithms, which try to learn an explicit correspondence between keywords and target concepts, the proposed method annotates unlabeled images with keywords directly, following a lazy learning approach.
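
The idea of a density-inspired confidence weight can be sketched as follows. The abstract does not give the actual formula, so this is only one plausible reading: each sample labeled with a keyword is scored by a Gaussian kernel density estimate over the other samples of that keyword, and the scores are divided by the largest density so that they fall in [0, 1]. The function name `confidence_weights`, the Gaussian kernel, the `bandwidth` parameter and the max-normalization are all assumptions, not the dissertation's method.

```python
import math


def confidence_weights(samples, bandwidth=1.0):
    """Return one weight in [0, 1] per sample (illustrative sketch).

    samples : list of equal-length feature tuples that share one keyword
    A sample in a dense region (a "typical" sample) gets a weight near 1;
    an isolated sample gets a lower weight.
    """
    n = len(samples)
    if n == 0:
        return []

    densities = []
    for x in samples:
        total = 0.0
        for z in samples:
            d2 = sum((a - b) ** 2 for a, b in zip(x, z))
            # Unnormalized Gaussian kernel; the constant factor cancels
            # when the densities are divided by their maximum below.
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        densities.append(total / n)

    # Each density includes the sample's own kernel value, so peak > 0.
    peak = max(densities)
    return [d / peak for d in densities]


# Three similar samples and one outlier, all labeled with the same keyword.
keyword_samples = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (4.0, 4.0)]
weights = confidence_weights(keyword_samples)
```

Here the three clustered samples receive weights close to 1, while the outlier receives a clearly smaller weight, which mirrors the abstract's point that samples relevant to the same keyword differ in typicality.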
Keywords/Search Tags: image retrieval, saliency analysis, relevance feedback, manifold-ranking, automatic image annotation, semi-supervised learning, multi-instance multi-label learning, typicality analysis