Research On Object Detection Method Based On Deep Learning

Posted on:2021-03-05

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Q S Lu

Full Text:PDF

GTID:1368330605981254

Subject:Information and Communication Engineering

Abstract/Summary:

Object detection is a fundamental and challenging topic in computer vision. Its main purpose is to identify all objects in an image and locate them. Building on a broad survey of prior research in China and abroad, this thesis explores the difficulties and challenges faced by current deep-learning-based object detection techniques and proposes solutions from three perspectives: the translation invariance of convolutional neural networks, the receptive field of convolutional filters, and high-resolution feature maps with feature fusion. The contributions are summarized as follows.

First, this thesis proposes a position-sensitive grid convolutional neural network. State-of-the-art object detection networks rely on convolutional neural networks (CNNs) pre-trained on a large auxiliary dataset designed for image-level classification; the pre-trained CNN is then fine-tuned on the object detection dataset. Image-level classification favors translation invariance: shifting an object within an image should not change the prediction. Object detection, by contrast, needs localization representations that are translation variant to some extent: translating an object inside a candidate box should produce a discriminative response that indicates how well the candidate box overlaps the object. The position-sensitive grid convolutional neural network (G-CNN) includes a grid convolutional layer and a grid pooling layer. The grid convolutional layer outputs feature maps that are sensitive to specific positions of the object. The output cells of the grid pooling layer are drawn alternately from different feature maps. The G-CNN controls its sensitivity to object translation through the choice of grid type, which addresses the problem that the translation invariance of a CNN designed for image classification is too strong for detection. The experimental results show that the G-CNN improves object detection accuracy.

Second, this thesis proposes a new module, named receptive field adaptive convolution, that adaptively determines the receptive field size of a convolutional filter. The receptive field size of a convolutional filter in a deep convolutional network is crucial for object detection, because the output must respond to a suitably sized area of the image to capture the right information. In a standard CNN the receptive field size of a convolutional filter is fixed by the network's inherently fixed geometric structure. However, objects of interest vary significantly in size within detection images, and high-level convolutional filters encode semantic features over spatial positions. Adaptive determination of the receptive field size is therefore desirable for object detection. Receptive field adaptive convolution dilates the convolutional filter with multiple dilation values, computes the convolution separately for each, and selects the maximum value as the output. The experimental results show that receptive field adaptive convolution changes the receptive field according to the object scale, extracts better feature maps, and improves object detection accuracy.

Third, this thesis presents an object detection architecture that uses convolutional networks with high-resolution feature map fusion. Large scale variation across objects and the detection of small objects are the main challenges for object detection. State-of-the-art CNNs have large strides that lead to a very coarse representation of the input image, which makes small object detection difficult. The high-resolution feature map fusion module increases the resolution of the top feature map by a factor of 4 and fuses multi-level feature maps while keeping the input image size unchanged. In addition, the method adaptively recalibrates channel-wise feature responses by explicitly modelling the interdependencies between channels. The experimental results show that the high-resolution feature maps extracted by this method improve object detection accuracy, especially for small-scale objects.
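The receptive field adaptive convolution described in the abstract (dilate one filter with several dilation values, convolve separately, keep the maximum) can be illustrated with a minimal 1-D sketch in plain Python. The function names `dilated_conv1d` and `rfa_conv1d`, the zero padding, and the default dilation set `(1, 2, 3)` are assumptions made for illustration; the thesis itself operates on 2-D feature maps inside a CNN, and its exact formulation may differ.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1-D correlation with zero padding and the given dilation.

    Assumes an odd-length kernel centred on each output position.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            pos = i + (j - center) * dilation  # taps spread out by `dilation`
            if 0 <= pos < len(signal):         # positions outside count as zero
                acc += w * signal[pos]
        out.append(acc)
    return out


def rfa_conv1d(signal, kernel, dilations=(1, 2, 3)):
    """Hypothetical sketch: per position, keep the maximum response
    over all dilation values, and report which dilation produced it."""
    responses = [dilated_conv1d(signal, kernel, d) for d in dilations]
    output, chosen = [], []
    for vals in zip(*responses):
        best = max(range(len(dilations)), key=lambda n: vals[n])
        output.append(vals[best])
        chosen.append(dilations[best])
    return output, chosen


# Two impulses four samples apart: only dilation 2 lets both outer taps
# of the 3-tap kernel land on them when centred at index 4.
signal = [0, 0, 1, 0, 0, 0, 1, 0, 0]
kernel = [1, 0, 1]
output, chosen = rfa_conv1d(signal, kernel)
```

In this toy example the same three weights are reused at every dilation, so the selection step changes only the spatial extent the filter covers, not the number of parameters.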
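The channel-wise recalibration mentioned in the abstract is not specified in detail here. One common way to model interdependencies between channels is a squeeze-and-excitation-style gate: average each channel globally, pass the averages through a small two-layer network, and rescale each channel by the resulting gate. The sketch below assumes that design; the function name `channel_recalibrate` and the weight matrices `w1` and `w2` are hypothetical, and in a real network the weights would be learned.

```python
import math


def channel_recalibrate(features, w1, w2):
    """Squeeze-and-excitation-style gating (assumed design, not confirmed
    by the source).

    features: list of C channels, each an H x W nested list of floats.
    w1: hidden x C weight rows; w2: C x hidden weight rows (hypothetical).
    """
    # Squeeze: one global average per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]
    # Excitation: ReLU layer, then a sigmoid gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
             for row in w2]
    # Recalibrate: scale every value in a channel by that channel's gate.
    scaled = [[[g * x for x in row] for row in ch]
              for g, ch in zip(gates, features)]
    return scaled, gates


# Two 2x2 channels; the gate for each channel depends on both averages.
features = [[[1.0, 1.0], [1.0, 1.0]],
            [[3.0, 3.0], [3.0, 3.0]]]
w1 = [[1.0, 1.0]]    # one hidden unit that mixes both channels
w2 = [[0.0], [1.0]]  # channel 0 ignores the hidden unit, channel 1 uses it
scaled, gates = channel_recalibrate(features, w1, w2)
```

Because the hidden unit sees the averages of all channels, each gate can depend on the other channels' responses, which is the sense in which interdependencies are modelled explicitly.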

Keywords/Search Tags:

Grid convolution, Receptive field adaptive convolution, High-resolution feature map, Deep learning, Object detection

Related items

1	Research On Multi-oriented Scene Text Localization And Detection Based On Multi-scale And Big Receptive Field Deep Learning Features
2	Research On Object Detection Algorithm Of Multi-Receptive Field Branch Network Based On Cascade Structure Improvement
3	Research And Implementation Of Small Objection Flaw Surface Detection System Based On Deep Learning
4	Research On Object Detection In Complex Scenes
5	Research Of Face Detection Methods Based On Deep Learning
6	Deep Convolution Neural Network And Its Application In Ground Image Target Recognition
7	The Research On Key Technologies Of Object Detection Based On Deep Convolutional Neural Networks
8	Research On Driving Behavior Recognition Algorithm Based On Deep Learning
9	Research On Crowd Density Estimation Algorithm Based On Deep Learning
10	Research On The Theories And Methods Of Object Detection In Complex Scenes