
Research on Large-Scale Multi-Label Image Classification Based on Deep Learning

Posted on: 2020-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: W J Yu
GTID: 2428330572983641
Subject: Computer Science and Technology
Abstract/Summary:
Humans have long striven to create artificial intelligence capable of self-awareness and independent thought. Currently, one of the most important goals of artificial intelligence is to enable computers to recognize and understand image content. This capability lets computers interpret one of the most common forms of information in nature and understand users more precisely. Images are a basis for how human beings understand and change the world. With computer vision technology, we can clearly understand and express the semantic information that images convey. This not only helps people understand their own behavior better, but can also improve daily life and promote social progress and development.

Image classification is one of the most meaningful problems in computer vision and is the basis of other computer vision tasks, such as object localization, detection and segmentation. According to the number of labels associated with each image, the classification problem can be divided into two categories: single-label and multi-label classification. The former, in which only one label is associated with each image, has been extensively studied over the past several decades. Recently, with the success of deep convolutional neural networks, deep single-label methods have demonstrated significant performance gains over traditional methods that use hand-crafted features. The latter, i.e., multi-label classification, addresses scenarios in which one image is associated with more than one label. The multi-label problem is widespread in real-world applications and is more complicated than the single-label one. It is therefore both more practical and more challenging, and it has attracted increasing attention over the last decade.

This paper studies large-scale multi-label image classification based on deep learning. Recently, deep learning based multi-label classification methods have demonstrated promising performance; however, several issues still need further exploration. First, in an image with multiple labels, the objects usually appear at various positions with different scales and poses, so directly using the global features extracted by pretrained deep neural networks may lead to sub-optimal performance on multi-label images. Second, some labels are associated with the entire image rather than a small region, so methods that rely only on deep local features may ignore global information. Third, some methods try to combine local and global features; however, they usually fail to extract sufficient local and global information and cannot combine these features effectively.

To extract and make full use of this information, in this paper we present a novel deep Dual-stream nEtwork for the muLTi-lAbel image classification task, DELTA for short. As its name indicates, it is composed of two streams, i.e., the Multi-Instance network and the Global Priors network. More specifically, in the Multi-Instance network, a Spatial Pyramid Convolutional transfer layer is proposed to extract multi-scale instances, followed by a sub-concept layer that learns a scoring function for the matching scores between instances and sub-concepts; thereafter, a multi-instance pooling layer containing average pooling and Log-Sum-Exp pooling makes the label predictions at the bag level. The Global Priors network captures global feature priors from the entire image and makes the global prediction. Finally, the fusion layer combines the two streams to obtain more comprehensive scores for the final prediction.

Extensive experiments on three benchmark datasets, i.e., PASCAL VOC 2007, PASCAL VOC 2012 and Microsoft COCO, demonstrate that DELTA significantly outperforms several state-of-the-art methods. Moreover, DELTA can automatically locate the key image patterns that trigger the labels.
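
To make the bag-level pooling and fusion steps concrete, the following is a minimal sketch in plain Python, not the thesis implementation. It assumes per-instance label scores are already available as a list of rows (one row per instance, one column per label). The Log-Sum-Exp form used here, (1/r) * log(mean_i exp(r * s_i)), is the commonly used smooth approximation of max pooling; the sharpness parameter r, the equal-weight combination of the two pooled scores, and the weighted-sum fusion with a `weight` parameter are all illustrative assumptions, since the abstract does not specify them. The function names are hypothetical.

```python
import math
from typing import List

Scores = List[List[float]]  # rows = instances, columns = labels


def average_pool(instance_scores: Scores) -> List[float]:
    """Bag-level score per label as the mean over all instances."""
    n = len(instance_scores)
    num_labels = len(instance_scores[0])
    return [sum(row[k] for row in instance_scores) / n for k in range(num_labels)]


def lse_pool(instance_scores: Scores, r: float = 5.0) -> List[float]:
    """Log-Sum-Exp pooling: (1/r) * log(mean_i exp(r * s_i)).

    For r > 0 the result lies between the mean and the max of the
    instance scores. r is an assumed hyperparameter.
    """
    n = len(instance_scores)
    num_labels = len(instance_scores[0])
    pooled = []
    for k in range(num_labels):
        scaled = [r * row[k] for row in instance_scores]
        m = max(scaled)  # subtract the max for numerical stability
        log_mean_exp = m + math.log(sum(math.exp(v - m) for v in scaled) / n)
        pooled.append(log_mean_exp / r)
    return pooled


def multi_instance_pool(instance_scores: Scores, r: float = 5.0) -> List[float]:
    """Assumed combination: equal-weight mean of average and LSE pooling."""
    avg = average_pool(instance_scores)
    lse = lse_pool(instance_scores, r)
    return [(a + b) / 2.0 for a, b in zip(avg, lse)]


def fuse(mi_scores: List[float], global_scores: List[float],
         weight: float = 0.5) -> List[float]:
    """Assumed fusion: weighted sum of the two streams' label scores."""
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(mi_scores, global_scores)]


if __name__ == "__main__":
    # Three instances, two labels (made-up numbers for illustration only).
    instances = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
    global_stream = [1.5, 0.5]
    print(fuse(multi_instance_pool(instances), global_stream))
```

Average pooling rewards labels supported by many instances, while Log-Sum-Exp pooling lets a single strongly matching instance dominate, which is one plausible reason to use both at the bag level.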
Keywords/Search Tags: Deep Neural Network, Dual-stream Network, Multi-label Image Classification, Multi-instance Learning