
Study Of Weakly Supervised Object Detection Based On Convolution Neural Networks

Posted on: 2019-09-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R G Fu
GTID: 1368330611492989
Subject: Information and Communication Engineering
Abstract/Summary:
Object detection is one of the fundamental problems in computer vision. With the development of deep learning, many excellent object detection methods have been proposed, but most of them are based on supervised learning. A major disadvantage of such methods is that manually annotating the objects in large image sets is generally expensive and sometimes unreliable. As a result, supervised object detection methods are often limited in practice by the high cost of labeling. Weakly supervised learning aims to guide machine learning with relatively weak labels. This thesis focuses on the weakly supervised object detection (WSOD) problem, in which only binary image-level labels indicating the presence or absence of an object category are available for training.

We study WSOD based on convolutional neural networks (CNNs), adopting a “classification before detection” framework. The research is divided into two parts: classification and detection. In the classification part, the thesis explains how CNNs work and proposes a novel network structure, the coarse-to-fine CNN, which improves classification accuracy. In the detection part, we discuss the feasibility of performing detection with classification CNNs alone, without retraining on any other annotation. The main achievements are outlined as follows:

1. For imbalanced data, an approach that fine-tunes CNNs to improve classification accuracy is proposed. Imbalanced data refers to classification problems in which the classes are not represented equally. When the original dataset cannot be expanded, we create a large-scale synthetic dataset and improve classification accuracy by fine-tuning CNNs.

2. A novel hierarchical CNN architecture, called the coarse-to-fine CNN, is proposed. By introducing a label tree and Bayesian theory, the classification accuracy of the network is improved. Most traditional CNN-based classification models implicitly assume that all classes are equally difficult to distinguish. However, visual separability between different object categories is highly uneven in the real world. The coarse-to-fine CNN is simple: it adds a proposed coarse-to-fine layer on top of a generic CNN. This layer is inspired by the Bayesian equation, so that the coarse prediction can affect the fine prediction directly. Experimental results on the benchmark datasets MNIST, CIFAR-10, and CIFAR-100 show clear advantages over the compared baselines.

3. A WSOD method based on sGrad-CAM~+ is proposed, which detects objects directly with a classification CNN. Related work has shown that CNNs designed for image classification can automatically learn object activation maps. Unlike previous approaches, we provide a mathematical derivation based on Taylor's theorem to generate sGrad-CAM. Building on this derivation, we further propose Grad-CAM~+ and sGrad-CAM~+. sGrad-CAM~+ is more stable than the others because it makes full use of the gradient information.

4. A WSOD method based on gradient-based saliency maps is proposed, which detects small targets effectively. The gradient saliency map is derived from a linear approximation of a highly nonlinear function, so its effectiveness needs to be verified. By introducing the receptive field, we carry out this verification. Experiments on the Nexar traffic light dataset show that CNNs can successfully classify images even when the objects occupy only a few pixels in the training images, and that gradient-based saliency maps provide strong resistance to interference when detecting small targets.
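The Bayesian idea behind the coarse-to-fine layer in item 2 can be illustrated with a small sketch. This is not the thesis's actual layer, whose exact form the abstract does not give. It only assumes a two-level label tree and combines the two predictions with the chain rule p(fine) = p(coarse) · p(fine | coarse). The function names, the `parent` mapping, and the example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_to_fine(coarse_logits, fine_logits, parent):
    """Combine coarse and fine scores into fine-class probabilities.

    parent[i] is the coarse class of fine class i (a hypothetical
    two-level label tree). For each coarse class c, the fine logits of
    its children are normalized into p(fine | coarse = c) and then
    weighted by p(coarse = c).
    """
    p_coarse = softmax(coarse_logits)
    p_fine = [0.0] * len(fine_logits)
    for c in range(len(coarse_logits)):
        children = [i for i, p in enumerate(parent) if p == c]
        if not children:
            continue  # a coarse class without fine children contributes nothing
        conditional = softmax([fine_logits[i] for i in children])
        for i, q in zip(children, conditional):
            p_fine[i] = p_coarse[c] * q
    return p_fine


# Assumed toy setup: four fine classes grouped under two coarse classes.
parent = [0, 0, 1, 1]
probs = coarse_to_fine([5.0, 0.0], [0.0, 1.0, 3.0, 0.0], parent)
best = max(range(len(probs)), key=probs.__getitem__)
```

In this toy setup the largest raw fine logit belongs to class 2, but the confident coarse prediction for group 0 moves the final decision to class 1, which is the sense in which a coarse prediction can affect the fine prediction directly.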
Keywords/Search Tags:weakly supervised learning, convolution neural networks, image classification, object detection, hierarchical classification, CNN visualization