| Object detection comprises object classification and localization. It is widely used in real-world applications such as medical image processing, face recognition, image retrieval, and autonomous driving, and is becoming a focus of pattern recognition and computer vision. Object detection algorithms based on deep learning are currently the most popular and effective. Compared with traditional algorithms, they extract features more effectively and achieve higher accuracy in complex scenes. This paper studies two-stage detectors based on candidate region extraction. First, a two-stage detector learns from positive and negative samples randomly selected at a fixed ratio, so training is dominated by the many easily classified samples, which leads to a large number of missed positives. Second, deep learning detectors use deep networks with a large down-sampling stride. High-level feature maps are rich in semantic information, which benefits classification, but contain little edge information, which hinders bounding box regression; the large down-sampling stride also discards feature information for small-scale objects, which hinders their detection. Based on this analysis, we study sample mining and deep network optimization within the Faster R-CNN framework. The main contents and innovations are summarized as follows: (1) To address the imbalance of training samples in two-stage detectors, we propose E-OHEM (Enhanced Online Hard Example Mining), an improvement of OHEM (Online Hard Example Mining). OHEM attends only to hard samples and completely ignores easily classified ones; E-OHEM learns from both, which addresses sample imbalance and improves the recognition of small-scale objects. Its recognition accuracy is much better than that of learning from positive and negative samples at a fixed ratio. With a VGG16 feature extraction network, E-OHEM achieves an mAP of 71.3% on the PASCAL VOC2007 dataset, 0.6% higher than OHEM. In addition, to address inaccurate classification in the RPN stage of Faster R-CNN, we propose a model fine-tuning strategy that, combined with E-OHEM, further improves the mAP to 71.6%. (2) In deep detection networks, an overly large down-sampling stride causes the loss of small-scale object features, and the scarcity of edge information in high-level feature maps degrades bounding box regression. To address both problems, we use dilated convolution to optimize the deep network. We first adopt Inception-ResNet v2, a stronger classification network, to improve recognition accuracy, and then apply dilated convolution to the ResNet101 and Inception-ResNet v2 models respectively. The Inception-ResNet v2 network optimized with dilated convolution achieves an mAP of 77.0% on the PASCAL VOC2007 dataset, an improvement of 1.1%. Finally, adding Soft-NMS and multi-scale testing to this model further increases the mAP to 79.4%. |
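The sample-mining idea in contribution (1) can be illustrated with a minimal sketch. The abstract does not give the E-OHEM selection rule, so this is not the authors' algorithm: the function name `select_training_rois` and the `hard_fraction` parameter are hypothetical, and the rule shown (take the highest-loss RoIs, then fill the rest of the batch with randomly sampled lower-loss RoIs) is only one plausible way to learn from both hard and easy samples.

```python
import random


def select_training_rois(losses, batch_size, hard_fraction=0.75, seed=0):
    """Pick RoI indices for the backward pass (illustrative only).

    losses: per-RoI loss values from a forward pass.
    hard_fraction: hypothetical share of the batch reserved for the
        highest-loss RoIs; 1.0 reduces to plain OHEM-style selection.
    Returns (hard_indices, easy_indices).
    """
    if not 0.0 <= hard_fraction <= 1.0:
        raise ValueError("hard_fraction must be in [0, 1]")
    batch_size = min(batch_size, len(losses))

    # Rank RoIs from highest to lowest loss (hardest first).
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)

    num_hard = int(batch_size * hard_fraction)
    hard = ranked[:num_hard]

    # Assumption: the remaining slots are filled with randomly sampled
    # lower-loss ("easy") RoIs so that they are not ignored entirely.
    rng = random.Random(seed)
    easy = rng.sample(ranked[num_hard:], batch_size - num_hard)
    return hard, easy


losses = [0.1, 2.0, 0.05, 1.5, 0.3, 0.9, 0.02, 0.7]
hard, easy = select_training_rois(losses, batch_size=4, hard_fraction=0.75)
print("hard RoIs:", hard)
print("easy RoIs:", easy)
```

Setting `hard_fraction=1.0` recovers selection by loss alone, which is the behaviour the abstract attributes to OHEM.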
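The effect of dilated convolution on the down-sampling stride in contribution (2) can be checked with the standard convolution output-size formula. The abstract does not say which layers are modified, so the feature-map size of 38 and the layer settings below are assumptions chosen only to illustrate the arithmetic.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


feature_map = 38  # assumed input size, for illustration only

# Ordinary 3x3 convolution with stride 2: the resolution is halved.
print(conv_output_size(feature_map, kernel=3, stride=2, padding=1))  # 19

# 3x3 convolution with stride 1 and dilation 2: the kernel spans 5 input
# positions, and the resolution is preserved.
print(conv_output_size(feature_map, kernel=3, stride=1, padding=2, dilation=2))  # 38
```

Replacing a strided layer with a dilated one in this way keeps a wide receptive field without further shrinking the feature map, which is why it helps retain small-object and edge information.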