
Research And Improvement Of Object Detection Based On Region Based Convolutional Neural Networks

Posted on: 2018-09-12   Degree: Master   Type: Thesis
Country: China   Candidate: Y Wang   Full Text: PDF
GTID: 2348330533469245   Subject: Computer Science and Technology
Abstract/Summary:
With the popularity of smartphones and the development of the Internet, images are being produced at a tremendous rate. To analyze and understand these images intelligently, it is necessary to detect the objects they contain. Object detection has long been a hot topic in computer vision. In recent years, progress in deep learning, especially in convolutional neural networks (CNNs), has greatly advanced object detection. At present, a feasible and effective method for object detection is the Region-based Convolutional Neural Network (R-CNN). First, R-CNN extracts region proposals, that is, image regions that are likely to contain objects. Then a CNN extracts features from these region proposals. Finally, a class-specific classifier classifies each region. Faster R-CNN is the result of continuous improvement of R-CNN and turns the whole object detection pipeline into an end-to-end process. However, considerable room remains to improve its average precision.

This thesis studies the basic flow and principles of Faster R-CNN. Considering the importance of feature extraction, we replace the VGG16 network of Faster R-CNN with a Residual Network (ResNet) and use the result as our new baseline. By visualizing and analyzing the features of different convolutional layers of the CNN, we find that low-level and high-level features are complementary. On this basis, we propose a multi-level feature merging strategy: the low-level features are down-sampled by a convolution layer to the same size as the high-level features, and the two are then merged by a concat layer. This yields a more effective feature representation of the original image. Because Faster R-CNN uses only regional features for classification and localization, we implement a context learning structure: context features are extracted from the convolutional feature maps of the entire image using an ROI pooling layer and then merged with the regional features for classification and localization. Because Faster R-CNN keeps only the highest-confidence bounding box in each class's results during post-processing, we introduce a bounding-box refinement strategy: the discarded bounding boxes still carry location information, so the final bounding-box coordinates are refined using these boxes, weighted by their scores.

We evaluate our improved Faster R-CNN object detection method on the public Pascal VOC dataset. Experimental results show that our method effectively improves the average precision of object detection. On VOC2007 and VOC2012 we achieve a mean average precision (mAP) of 78.4% and 75.4%, respectively. Finally, we evaluate our method on the ImageNet dataset and achieve an mAP of 53.8%.
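The bounding-box refinement step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes that the post-processing step is greedy non-maximum suppression (NMS) and that each kept box is replaced by the score-weighted average of all boxes overlapping it. The function names (`iou`, `nms_with_voting`) and the thresholds `nms_thresh=0.3` and `vote_thresh=0.5` are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_with_voting(
    boxes: List[Box],
    scores: List[float],
    nms_thresh: float = 0.3,   # assumed value, not taken from the thesis
    vote_thresh: float = 0.5,  # assumed value, not taken from the thesis
) -> List[Tuple[Box, float]]:
    """Greedy NMS for one class, then score-weighted box voting."""
    # Indices sorted by descending confidence.
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    results = []
    while remaining:
        best = remaining.pop(0)
        # Standard NMS: drop lower-scoring boxes that overlap the kept box.
        remaining = [j for j in remaining
                     if iou(boxes[best], boxes[j]) <= nms_thresh]
        # Voting: every original box (kept or suppressed) that overlaps the
        # kept box enough contributes its coordinates, weighted by its score.
        voters = [j for j in range(len(boxes))
                  if iou(boxes[best], boxes[j]) >= vote_thresh]
        total = sum(scores[j] for j in voters)
        if total > 0:
            refined = tuple(
                sum(scores[j] * boxes[j][k] for j in voters) / total
                for k in range(4)
            )
        else:
            refined = boxes[best]  # fall back to the unrefined box
        results.append((refined, scores[best]))
    return results


# Hypothetical detections for a single class.
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
example_scores = [0.9, 0.6, 0.8]
example_result = nms_with_voting(example_boxes, example_scores)
```

In this example the second box is suppressed by the first but still votes, so the first detection shifts to approximately (0.4, 0.4, 10.4, 10.4), while the isolated third box is returned unchanged. The kept box retains its original score; whether the thesis also adjusts scores is not stated in the abstract.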
Keywords/Search Tags: object detection, convolutional neural networks, multi-level feature, context learning, bounding-box voting