Fine-grained Food Image Recognition Based on Multi-level Convolutional Feature Pyramids and Its Mobile Application

Posted on: 2019-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: H D Li
Full Text: PDF
GTID: 2428330563996019
Subject: Control theory and control engineering
Abstract/Summary:
Food is closely related to human life and is the main source of human energy. Using computer vision to recognize food can make daily life considerably more convenient. Food image recognition is challenging because inter-class differences are subtle and local, intra-class variation is large, and food items have no well-defined parts. Many works have approached food recognition from different perspectives, but existing methods still suffer from low recognition accuracy and poor generalization. This paper proposes a fine-grained food image recognition model based on multi-level convolutional feature pyramids. It extracts features step by step from the whole image down to local regions, avoiding the weakness of previous methods that attend only to the overall characteristics of the food picture. It retains global information while also integrating local information, discarding the background and extracting features from the food target area. The model consists of three parts: a food feature extraction network, an attention region location network, and a feature fusion network, responsible for feature extraction, local region localization, and feature fusion, respectively. Because a single-level food feature extraction network cannot capture the global and local features of a food picture at the same time, three feature extraction networks are cascaded so that the features move from global to local. To handle the large variation in scale across food images, a feature pyramid network is constructed between the feature maps of each food feature extraction network to improve the network's feature representation, which yields a relative accuracy gain of 2.1%. An attention region location network is designed to locate the fine-grained region automatically and to shrink the region from global to local. The fine-grained region of the original picture is then cropped and enlarged to serve as the input of the next-level feature extraction network. Finally, the features extracted by each feature extraction network are combined in the feature fusion network, so that the fused representation includes both the global features of the food picture and the detailed features of the food object. The experimental results show the best performance, with Top-1 accuracy of 91.3%, 82.6%, and 90.1% on Food-101, ChineseFoodNet, and Food-172, respectively. This work also establishes a large-scale food image dataset.
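The cascade described in the abstract (extract features, locate an attention region, crop and enlarge it, repeat, then fuse) can be sketched roughly as follows. This is only an illustrative toy in plain Python, not the thesis implementation: `toy_features` and `toy_attention_box` are hypothetical stand-ins for the convolutional feature extraction network and the attention region location network, fusion is assumed to be simple concatenation, the image is assumed to be square and grayscale, and each level crops from the previous level's input rather than from the original picture.

```python
from typing import List, Tuple

Image = List[List[float]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]    # (top, left, height, width)


def crop_and_enlarge(image: Image, box: Box, out_size: int) -> Image:
    """Crop `box` and resize it to out_size x out_size (nearest neighbour)."""
    top, left, height, width = box
    return [
        [image[top + (r * height) // out_size][left + (c * width) // out_size]
         for c in range(out_size)]
        for r in range(out_size)
    ]


def toy_features(image: Image) -> List[float]:
    """Hypothetical stand-in for a CNN: mean and max pixel value."""
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def toy_attention_box(image: Image) -> Box:
    """Hypothetical stand-in for the attention region location network:
    a half-size box centred on the brightest pixel, clamped to the image."""
    h, w = len(image), len(image[0])
    _, r, c = max((image[r][c], r, c) for r in range(h) for c in range(w))
    bh, bw = max(1, h // 2), max(1, w // 2)
    top = min(max(r - bh // 2, 0), h - bh)
    left = min(max(c - bw // 2, 0), w - bw)
    return top, left, bh, bw


def cascade_features(image: Image, levels: int = 3) -> List[float]:
    """Run the global-to-local cascade and fuse per-level features
    by concatenation (an assumed fusion rule)."""
    fused: List[float] = []
    size = len(image)              # assumes a square input image
    current = image
    for level in range(levels):
        fused.extend(toy_features(current))
        if level < levels - 1:
            box = toy_attention_box(current)
            current = crop_and_enlarge(current, box, size)
    return fused
```

Calling `cascade_features(image)` on a square image returns one concatenated vector holding two toy features per level, so the first entries describe the whole picture and the later ones describe progressively tighter regions.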
Keywords/Search Tags:Food Recognition, Convolutional Neural Network, Attention Network, Fine-grained Recognition, Feature Pyramid