Fine-grained Image Recognition Method Based On Improved Bilinear Pooling

Posted on: 2021-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: G J Wang
Full Text: PDF
GTID: 2428330605482489
Subject: Computer Science and Technology
Abstract/Summary:
Unlike traditional image recognition tasks, fine-grained image recognition aims to distinguish different sub-categories of the same species. It is a challenging task in the computer vision community because inter-class variations are subtle while intra-class variations are large. Owing to its wide application in practice, such as ecological protection, product identification in unmanned supermarkets, and vehicle identification, fine-grained image recognition has gradually become a key technology in many fields and has considerable practical value. It has also been a research hotspot in computer vision and pattern recognition in recent years. Bilinear pooling models, which are trained end to end using only category labels and no additional annotations, have gradually become the mainstream approach to fine-grained image recognition. Inspired by these models, we design two end-to-end fine-grained image recognition models from the perspectives of reducing background noise and improving feature interaction. The proposed models outperform many state-of-the-art methods on fine-grained image recognition tasks. The main research contents are as follows:

(1) We devise a Hierarchical Bilinear Pooling with Aggregated Slack Mask (HBPASM) model to reduce the interference caused by background noise in image recognition. First, ROI (region of interest) features are extracted from a single convolution layer using a mask model to suppress background noise. At the same time, to improve fault tolerance when distinguishing foreground from background, we introduce a slack variable to generate a slack mask for each layer, and obtain an improved unified image mask by aggregating the multiple slack masks. The aggregated slack mask not only extracts detailed ROI features, but also ensures a hierarchical weight distribution for the feature vectors at different positions. This structure can be embedded into most convolutional neural networks and trained end to end. Subsequently, the extracted multi-layer ROI features are integrated using a hierarchical bilinear pooling operator to generate the image representation for classification. Extensive experimental results demonstrate the effectiveness of the aggregated slack mask and show that the proposed HBPASM model achieves excellent performance on fine-grained image recognition tasks.

(2) We propose a Selective Hierarchical Biquadratic Pooling with Multi-scale features (SHQPM) model, which aims to discover an effective manner of feature interaction automatically. We first extract coarse-to-fine, multi-level, multi-scale features from the convolutional neural network to capture different levels of semantic information, and then propose a novel hierarchical biquadratic pooling to integrate these multi-scale features effectively and obtain complementary inter-layer and intra-layer information. Finally, a sparse weight model is designed to discover the optimal feature subset for a specific dataset. Each component of the proposed model, i.e., hierarchical biquadratic pooling, multi-scale feature interaction, and adaptive feature selection, is verified in the experiments. Compared with other fine-grained image recognition approaches, the proposed SHQPM achieves state-of-the-art performance.
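
To make the pooling and masking ideas in the abstract more concrete, the following is a minimal NumPy sketch, not the thesis's implementation. It assumes two feature maps with the same spatial size, uses the common bilinear pooling recipe (averaged outer products, signed square root, L2 normalization), and treats the mask rule as a guess: `slack_mask` and `aggregate_masks` are hypothetical helpers in which a position counts as foreground when its channel-summed activation is at least the map's mean minus a slack value, and the per-layer masks are simply averaged. The actual HBPASM mask definition and aggregation may differ.

```python
import numpy as np


def slack_mask(features, slack=0.0):
    """Hypothetical slack mask (an assumption, not the thesis's formula).

    features: array of shape (C, H, W).
    Returns an (H, W) array with 1.0 where the channel-summed activation
    is at least (mean - slack), else 0.0. A larger slack keeps more positions.
    """
    activation = features.sum(axis=0)
    return (activation >= activation.mean() - slack).astype(features.dtype)


def aggregate_masks(masks):
    """Hypothetical aggregation: average the per-layer binary masks.

    Positions marked as foreground by more layers receive larger weights,
    giving graded values in [0, 1].
    """
    return np.mean(np.stack(masks), axis=0)


def bilinear_pool(x, y, mask=None):
    """Bilinear pooling of two feature maps with equal spatial size.

    x: (C1, H, W), y: (C2, H, W), mask: optional (H, W) position weights.
    Returns an L2-normalized vector of length C1 * C2.
    """
    c1, h, w = x.shape
    c2 = y.shape[0]
    xf = x.reshape(c1, h * w)
    yf = y.reshape(c2, h * w)
    if mask is not None:
        # Weight both feature maps at every spatial position.
        m = mask.reshape(1, h * w)
        xf = xf * m
        yf = yf * m
    # Average over positions of the outer product of the two feature vectors.
    z = (xf @ yf.T).reshape(-1) / (h * w)
    # Signed square root followed by L2 normalization.
    z = np.sign(z) * np.sqrt(np.abs(z))
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_a = rng.random((4, 3, 3))  # stand-in for one conv layer's output
    layer_b = rng.random((5, 3, 3))  # stand-in for another layer's output
    mask = aggregate_masks([slack_mask(layer_a, 0.1), slack_mask(layer_b, 0.1)])
    descriptor = bilinear_pool(layer_a, layer_b, mask=mask)
    print(descriptor.shape)
```

In this sketch the averaged mask yields graded weights rather than a hard foreground/background split, which is one simple way to obtain the kind of position-dependent weighting the abstract describes.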
Keywords/Search Tags: Deep Learning, Fine-Grained Image Recognition, ROI Feature, Feature Fusion, Sparse Weight