Research On Object Categorization Based On Bag Of Words Model

Posted on: 2013-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2248330377458662
Subject: Signal and Information Processing
Abstract/Summary:
Image categorization is one of the fundamental problems in image analysis and understanding. With the rapid development of the Internet, more and more digital images have appeared in daily life, and how to categorize this huge volume of image data quickly and accurately, so that useful information can be retrieved, has become a research focus. The Bag of Words (BoW) model was originally used extensively in document categorization because of its simplicity and effectiveness. Its main idea is to represent a document as a histogram of unordered keywords. Researchers in computer vision transplanted the same idea to image processing and recognition, which led to a transition from document processing to image processing. In this work, the Bag of Words model was applied to image categorization and, based on a study of the technique, improvements were made to overcome the disadvantages of the model.

First, based on a study of image feature extraction, an improved multi-scale descriptor named DF-SIFT (Dense Fast SIFT) was proposed to overcome the defects of the traditional Scale-Invariant Feature Transform (SIFT) descriptor, such as a limited number of interest points, high complexity, and limitations when used in the BoW model. The DF-SIFT descriptor extracts dense features at uniformly spaced pixels. Every feature is described at multiple scales, so that the image information is used fully while scale invariance is retained. Moreover, unlike the traditional SIFT descriptor, a rectangular window is used instead of a Gaussian window to smooth the images. The scales are assigned in advance to avoid complicated calculation and thereby improve efficiency.
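The dense sampling and rectangular-window smoothing described for DF-SIFT can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the half-step grid offset, and the use of a summed-area table to make the box filter cheap are all assumptions made for illustration, and the SIFT gradient histogram itself is left out.

```python
def integral_image(img):
    """Summed-area table with a zero border: sat[y][x] = sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_smooth(img, radius):
    """Mean over a (2*radius+1)^2 rectangular window, clipped at the borders."""
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            # Four table lookups give the window sum regardless of radius.
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            out[y][x] = total / ((y1 - y0) * (x1 - x0))
    return out


def dense_keypoints(height, width, step, scales):
    """Grid locations at a fixed pixel interval, each paired with every preset scale.

    The half-step offset is an assumption; the abstract only says "uniform interval".
    """
    return [(y, x, s)
            for s in scales
            for y in range(step // 2, height, step)
            for x in range(step // 2, width, step)]
```

With a summed-area table, the cost of smoothing each pixel does not grow with the window size, which is one plausible reason a rectangular window is faster than a Gaussian one. Because the scales are fixed up front, no scale-space extremum search is needed.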
Optimal parameters were selected through experiments, which improves categorization accuracy while preserving efficiency.

Second, based on an analysis of codebook generation methods, a k-means clustering method with a stable initial center distribution was used to generate the codebook. The triangle inequality was used to simplify the calculation. The proposed method overcomes the excessive reliance on initial center selection that exists in the traditional method, and it can avoid the effect of local optima on algorithm performance. The number of iterations is also reduced, which improves efficiency. Moreover, a method based on weight distribution for representing codeword histograms was proposed. It assigns different weights according to the distance between features and codewords, then sums all the weights to form the image histogram representation. The results show that the proposed method can improve categorization accuracy. Finally, the effect of codebook size on performance was analyzed.

Third, a method combining Region of Interest (ROI) selection with the Pyramid Matching Scheme was proposed. This method first extracts ROIs from the training images and generates the codebook from the features extracted from the ROIs. The codebook is therefore more representative and can describe the images more accurately. It is also robust to variations in object position and to the background. The Pyramid Matching Scheme is then applied to represent the images, using the spatial information of the image to improve matching accuracy. Experiments were carried out to analyze performance under different pyramid divisions. The results show that the proposed method, which combines ROI extraction with the Pyramid Matching Scheme, performs better than the traditional BoW model.
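The codebook step above (k-means with a stable initialization and triangle-inequality pruning) can be sketched as follows. The abstract does not say how the initial centers are chosen, so deterministic farthest-point initialization is swapped in here as an assumption. The pruning test is the standard bound: if d(c_a, c_b) >= 2 d(x, c_a), then c_b cannot be closer to x than c_a. All function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_point_init(points, k):
    """Deterministic initialization (an assumption, not the thesis method):
    start from the first point, then repeatedly add the point farthest
    from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(nxt)
    return [list(c) for c in centers]


def assign(points, centers):
    """Nearest-center labels, skipping distance computations via the triangle inequality."""
    cc = [[dist(a, b) for b in centers] for a in centers]
    labels = []
    for p in points:
        best, best_d = 0, dist(p, centers[0])
        for j in range(1, len(centers)):
            if cc[best][j] >= 2.0 * best_d:
                continue  # center j is provably no closer than the current best
            d = dist(p, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels


def kmeans(points, k, max_iter=50):
    """Lloyd iterations until the labels stop changing or max_iter is reached."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels
```

Because the initialization is deterministic, repeated runs on the same features produce the same codebook, which is one way to read "stable initial center distribution".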
Finally, the complete method was compared with the state of the art. The results demonstrate the advantage of the proposed method. Its strengths and weaknesses were analyzed in detail at the end.
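The distance-weighted histogram and the pyramid representation described in the abstract can be combined in a short sketch. The abstract does not give the weighting function, so the Gaussian kernel, the per-feature normalization, and the `sigma` parameter below are assumptions. The per-level weights of the original pyramid matching formulation are also omitted, and the cell histograms are simply concatenated.

```python
import math


def soft_weights(feature, codebook, sigma=1.0):
    """Weights of one feature over all codewords; closer codewords get more weight.

    Gaussian kernel on squared distance (an assumed choice), normalized to sum to 1.
    """
    d2 = [sum((f - c) ** 2 for f, c in zip(feature, word)) for word in codebook]
    d_min = min(d2)  # shift so the nearest codeword has weight 1 before normalizing
    w = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in d2]
    total = sum(w)
    return [v / total for v in w]


def soft_histogram(features, codebook, sigma=1.0):
    """Sum the per-feature weights into one histogram over the codebook."""
    hist = [0.0] * len(codebook)
    for feat in features:
        for k, w in enumerate(soft_weights(feat, codebook, sigma)):
            hist[k] += w
    return hist


def pyramid_histogram(features, positions, size, codebook, levels=2, sigma=1.0):
    """Concatenate soft histograms over 1x1, 2x2, ... grids of the image.

    positions holds the (y, x) location of each feature; size is (height, width).
    """
    height, width = size
    out = []
    for level in range(levels):
        cells = 2 ** level
        buckets = [[] for _ in range(cells * cells)]
        for feat, (y, x) in zip(features, positions):
            row = min(cells - 1, y * cells // height)
            col = min(cells - 1, x * cells // width)
            buckets[row * cells + col].append(feat)
        for bucket in buckets:
            out.extend(soft_histogram(bucket, codebook, sigma))
    return out
```

Each feature contributes a total weight of 1 per pyramid level, so the resulting vector has length `len(codebook) * (1 + 4 + ...)` and can be passed to a classifier in place of a hard-assignment BoW histogram.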
Keywords/Search Tags: Image categorization, BoW model, SIFT descriptor, k-means clustering, ROI extraction