Part Based Method For Fine Grained Image Classification

Posted on: 2019-06-15
Degree: Master
Type: Thesis
Country: China
Candidate: B Yu
Full Text: PDF
GTID: 2428330566996871
Subject: Computer technology
Abstract/Summary:
Image classification is a fundamental research problem in computer vision and has long received considerable attention. With the rapid advances in deep learning in recent years, basic-level image classification has been largely solved. However, as the demand for retrieval and classification grows, fine-grained image categorization is gradually replacing basic-level classification as the next research focus. Fine-grained recognition tasks such as vegetable classification, clothing classification, and animal sub-category classification have broad application prospects. Because fine-grained categories show low inter-class variation but high intra-class variation, traditional categorization algorithms depend on a large amount of annotation, which is costly to produce and itself prone to labeling errors.
To address these problems, this thesis proposes a part-detection-based method for fine-grained image classification. The method first improves a general image classification model and obtains better classification accuracy on the CUB-200-2011 dataset. Training then proceeds in stages. First, the original image is used as input and the manually labeled parts or landmarks serve as ground truth; the localization network is trained together with a spatial transformer network to detect the positions of the parts or landmarks. Second, the manually annotated parts are used as input and the image labels serve as ground truth to train the classification network. The localization network is then connected to the classification network, and the classification network is trained on its own while the localization network is held fixed. At the same time, a new loss function is introduced into the localization network to constrain the overlap ratio between the predicted parts and the ground truth, so that the part positions are adjusted adaptively by both the classification result and the constructed loss. The adjusted localization network is then fixed and the classification network is fine-tuned. Finally, the features of the last two convolutional layers are extracted from the classification network for each part as a feature descriptor, and the descriptors of the different parts are concatenated to form the final feature representation of the image. A classifier trained on this representation achieves relatively high classification accuracy.
Building on this algorithm, increasing the number of parts and using a different classification network model for each part further improves classification accuracy. The algorithm requires annotation only during training, which reduces manual effort to some extent, while dynamically adjusting the localization network, increasing the number of parts, and using part-specific classification models make the classification results more accurate. Finally, the proposed model is embedded into a fine-grained image classification system as an extended application.
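The abstract does not give the form of the overlap loss or the descriptor layout, so the following is only a minimal sketch under stated assumptions: the overlap ratio is taken to be intersection-over-union (IoU) between axis-aligned boxes, the constraint is modeled as a hinge penalty below a hypothetical threshold `min_iou`, and each part descriptor is a flat list of floats. The names `iou`, `overlap_penalty`, `joint_loss`, and `concat_part_descriptors` are illustrative and do not come from the thesis.

```python
from typing import List, Sequence, Tuple

# A box is (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlap_penalty(pred: Box, gt: Box, min_iou: float = 0.5) -> float:
    """Hinge penalty (assumed form): zero once the predicted part
    overlaps its ground-truth box by at least min_iou."""
    return max(0.0, min_iou - iou(pred, gt))


def joint_loss(cls_loss: float, pred_boxes: Sequence[Box],
               gt_boxes: Sequence[Box], weight: float = 1.0,
               min_iou: float = 0.5) -> float:
    """Classification loss plus a weighted overlap penalty summed over parts."""
    penalty = sum(overlap_penalty(p, g, min_iou)
                  for p, g in zip(pred_boxes, gt_boxes))
    return cls_loss + weight * penalty


def concat_part_descriptors(parts: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-part descriptors into one image-level feature vector."""
    feature: List[float] = []
    for descriptor in parts:
        feature.extend(descriptor)
    return feature
```

With a penalty of this kind, the classification signal can move a predicted part freely as long as it stays close enough to the annotated location; once the overlap drops below the threshold, the penalty pulls it back. In a real network these quantities would be computed on tensors with a differentiable overlap measure rather than on Python floats.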
Keywords/Search Tags:general image classification, spatial transformer network, landmark detection, part detection, joint model