
Research Of Optimized Visual Bag Of Words Model Based On Spatial Structure And Quantitative Relation

Posted on: 2018-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: K Ding
Full Text: PDF
GTID: 2348330542492635
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, the number of images has exploded, and image classification, which assigns an image to a specific semantic category, has become an active research topic in computer vision. In recent years, the visual bag-of-words (BoW) model has been widely used for image classification because it is simple and efficient, and it has achieved considerable success. However, some problems remain, mainly in how visual word features are constructed and how the spatial location of image regions is represented. Building on BoW-based image classification, this thesis improves the key steps of the model in order to raise classification accuracy. The main contents are as follows:

1. The research background and current state of image classification technology are reviewed. The theory and key techniques of image classification based on the visual bag-of-words model are studied in depth. Feature extraction, visual dictionary construction, feature coding, and classifier algorithms are described in detail, together with the basic principles and characteristics of several classical algorithms.

2. The traditional bag-of-words model cannot distinguish the foreground of an image from its background and ignores the relationships between image features. In addition, visual words are usually generated by unsupervised clustering, so they are difficult to associate with specific semantic content and they introduce quantization error. To address these issues, the thesis presents a new image classification method based on the region of interest and the quantitative relation. First, the image is divided into blocks using a grid partitioning strategy, and the region of interest is extracted using Shi-Tomasi corner detection together with a salient region extraction method. SIFT features are then extracted from the region of interest to eliminate background interference. In the local feature quantization and coding stage, features are selected using a soft assignment strategy and a fitting test method, which reduces the influence of visual word ambiguity and lowers the dimension of the visual feature. Experimental results show that the proposed algorithm achieves good classification accuracy.

3. In the traditional bag-of-words model, the statistics of visual words ignore spatial information and object shape information, which weakens the ability to discriminate between image features. The thesis proposes an improved bag-of-words method that combines salient region extraction with the topological structure of visual words. To some extent this produces more representative visual words, and it also reduces the disturbance caused by complex background information and position changes. First, the salient regions of the training images are extracted and the bag-of-visual-words model is built on those regions. Second, to describe image characteristics more accurately and to resist position changes and background interference, the topological structure of visual words and Delaunay triangulation are used to integrate global and local information. Simulation experiments comparing the method with the traditional bag-of-words model and other models show that the proposed method achieves higher classification accuracy.
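The soft assignment step mentioned in the coding stage can be illustrated with a minimal sketch. The abstract does not say which weighting function the thesis uses, so the Gaussian kernel, the `sigma` parameter, and the function name `soft_assign_histogram` below are assumptions made for illustration only. Each local descriptor spreads its vote over all visual words according to distance, instead of voting only for the nearest word as in hard assignment.

```python
import math


def soft_assign_histogram(descriptors, codebook, sigma=1.0):
    """Build a soft-assignment bag-of-words histogram.

    descriptors: list of feature vectors (e.g. SIFT descriptors).
    codebook:    list of visual words (cluster centres), same dimension.
    sigma:       kernel bandwidth (hypothetical parameter, not from the thesis).
    """
    hist = [0.0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every visual word.
        sq_dists = [sum((a - b) ** 2 for a, b in zip(d, w)) for w in codebook]
        # Subtract the minimum before exponentiating for numerical stability;
        # the shift cancels out after normalisation.
        nearest = min(sq_dists)
        weights = [math.exp(-(s - nearest) / (2.0 * sigma ** 2)) for s in sq_dists]
        total = sum(weights)
        # Each descriptor contributes a total vote of 1, split across the words.
        for k, w in enumerate(weights):
            hist[k] += w / total
    n = len(descriptors)
    # Normalise by the number of descriptors so the histogram sums to 1.
    return [h / n for h in hist] if n else hist


if __name__ == "__main__":
    words = [[0.0, 0.0], [1.0, 0.0]]
    features = [[0.5, 0.0], [0.1, 0.0]]
    print(soft_assign_histogram(features, words, sigma=0.5))
```

A descriptor that lies exactly between two visual words contributes half a vote to each, which is the behaviour that reduces the quantization error of hard assignment. A smaller `sigma` moves the result toward hard assignment.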
Keywords/Search Tags: Image Classification, BoW, Region of Interest, Locality-constrained Linear Coding, Delaunay Triangulation