
Multi-feature Image Classification Based On Visual Bag Of Words Pyramid

Posted on: 2021-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Zhang
GTID: 2428330611971409
Subject: Control engineering
Abstract/Summary:
In image classification with the visual bag-of-words pyramid model, the final representation of an image is a visual word frequency histogram. This representation does not fully consider the spatial information of the image, so the similarity between images of the same category and the differences between images of different categories are not fully expressed in the model. In addition, after building the visual dictionary, the algorithm encodes the local feature descriptors directly without optimizing the dictionary, which becomes a bottleneck for the classification accuracy of the model. This thesis proposes a multi-feature image classification method based on the visual bag-of-words pyramid model, which improves both feature fusion and feature coding in order to raise classification accuracy. The main research is as follows.

To address the spatial information ignored by the visual bag-of-words pyramid, this thesis presents descriptor direction features based on LSC (Localized Soft-assignment Coding). We first determine the visual words used to encode a descriptor, then look for another local feature descriptor encoded with the same visual word, and finally extract the positions of the two local feature descriptors to form directional distribution features. The purpose of this arrangement is to strengthen the spatial characteristics of the visual bag-of-words pyramid model.

To further address the spatial relationships ignored by the visual bag-of-words pyramid model, this thesis proposes an edge direction feature. We first extract edge features from the image and then represent them in the form of a directional distribution. Fusing these features enhances the discrimination between images of different categories and improves the global information captured by the model.

To address the relevance of visual words, which the feature encoding of the visual bag-of-words pyramid model ignores, this thesis studies visual word screening and proposes a correlation coefficient between a visual word and the visual dictionary. The lower the correlation coefficient, the higher the discrimination. We select the visual words with high discrimination to encode the local feature descriptors, and then extract the descriptor positions and the visual word histogram, which improves image classification performance.

To address the randomness in how visual words are formed in the visual bag-of-words pyramid model, this thesis proposes a dual visual word screening model. Since the visual words are produced by a clustering algorithm, we use two different sets of initial cluster centers and iteration counts to reduce the effect of the randomness and uncertainty of the visual words. The dual visual dictionary completes the visual bag-of-words pyramid model together with the visual word histogram, the descriptor direction features, and the local position features, all with the aim of improving image classification.

We evaluated the method on three commonly used data sets: MSRC, Caltech101, and 15 Scene. It achieved image classification accuracy improvements of 3.6%, 1.5%, and 1.3%, respectively. The experiments also show that each of the proposed components (descriptor direction features, edge direction features, visual dictionary filtering, and dual visual word filtering) improves accuracy. This further verifies the feasibility and effectiveness of the proposed multi-feature image classification method in improving both feature fusion and feature coding.
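
As an illustration only, the localized soft-assignment coding step and the visual word screening step described above might be sketched as follows in Python with NumPy. This is not the thesis implementation. The function names `lsc_encode` and `screen_words`, the parameters `k`, `beta`, and `keep`, and the sum pooling are hypothetical, and the screening score used here (mean absolute Pearson correlation of a word with the other words in the dictionary) is an assumption, since the abstract does not define the correlation coefficient.

```python
import numpy as np


def lsc_encode(descriptors, dictionary, k=5, beta=10.0):
    """Localized soft-assignment coding (sketch).

    Each descriptor is softly assigned to its k nearest visual words only;
    all other words get a weight of zero. k and beta are assumed values.
    """
    # Squared Euclidean distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    # Column indices of the k nearest words for every descriptor.
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
    rows = np.arange(d2.shape[0])[:, None]
    local = d2[rows, nearest]
    # Subtracting the row minimum keeps exp() numerically stable and
    # does not change the normalized weights.
    weights = np.exp(-beta * (local - local.min(axis=1, keepdims=True)))
    weights /= weights.sum(axis=1, keepdims=True)
    codes = np.zeros_like(d2)
    codes[rows, nearest] = weights
    return codes


def screen_words(dictionary, keep):
    """Keep the `keep` least-correlated visual words (assumed criterion).

    A lower score is treated as higher discrimination, following the
    abstract; the exact score definition here is an assumption.
    """
    corr = np.abs(np.corrcoef(dictionary))  # shape (n_words, n_words)
    np.fill_diagonal(corr, 0.0)
    score = corr.sum(axis=1) / (len(dictionary) - 1)
    return np.sort(np.argsort(score)[:keep])


# Usage with random stand-ins for local descriptors and a learned dictionary.
rng = np.random.default_rng(0)
dictionary = rng.normal(size=(16, 8))
descriptors = rng.normal(size=(100, 8))

selected = screen_words(dictionary, keep=12)
codes = lsc_encode(descriptors, dictionary[selected], k=5)
histogram = codes.sum(axis=0)  # soft visual word frequency histogram
```

Restricting the soft assignment to the `k` nearest words is what makes the coding "localized". In the dual-dictionary setting described above, the same two functions could be applied to two dictionaries obtained from different clustering initializations, with the resulting histograms concatenated.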
Keywords/Search Tags:Image classification, Bag-of-visual-words, Descriptor direction features, Edge direction features, Visual dictionary filtering