
Research On Region Based Two-Stage Deep Object Detection Models

Posted on: 2022-11-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Wu
Full Text: PDF
GTID: 1528306839479534
Subject: Computer application technology
Abstract/Summary:
With the development of deep convolutional networks in computer vision, deep-network-based models have become the mainstream approach to object detection. In particular, the region-based two-stage deep object detection framework recasts object detection as classification and regression over candidate proposals, and achieves strong detection performance. Such a framework normally contains two parts: the candidate proposal module (the first stage) and the ROI processing module (the second stage). Given an input image, the candidate proposal module first generates hundreds of candidate proposals. These proposals are then fed into the ROI processing module, which classifies them to determine whether they contain target objects and applies bounding box regression to improve their IoU (Intersection over Union) with the ground truth.

Although the two-stage framework achieves strong results, problems remain in both the candidate proposal module and the ROI processing module. They can be summarized as follows: 1) the candidate proposal module receives no back-propagated information about the positions of the candidate proposals from the ROI processing module; 2) the feature extraction process for candidate proposals in the ROI processing module cannot effectively model object deformation; 3) the loss function of the ROI processing module treats all candidate proposal samples equally, which limits training effectiveness. This dissertation concentrates on these problems. Specifically, its main contributions are as follows.

(1) To address the problem that the candidate proposal module cannot obtain any back-propagated information from the ROI processing module, we propose the back position interaction module for candidate proposals. This module records the entire process from the anchors to the candidate proposals, and introduces a novel interactive ROI pooling layer that establishes a backward path from the ROI processing module to the candidate proposal module. The backward path computes the partial derivative of the ROI processing module's output with respect to the position of a candidate proposal. Computing this partial derivative draws on a large amount of image information, so the candidate proposals can learn directly from the image content during training, which substantially improves training efficiency. Experimental results on the PASCAL VOC and MS-COCO benchmarks demonstrate that the back position interaction module effectively improves the detection performance of various two-stage object detection models.

(2) To address the problem that ROI pooling cannot effectively handle object deformation, we propose the deformable subnetwork to model object deformation. The deformable subnetwork brings the traditional DPM method into a deep network. It contains two key parts: the deformation coefficient part and the deformation pooling part. The deformation coefficient part generates the deformation coefficients. The deformation pooling part divides the candidate proposal into several bins and then performs part alignment in each bin according to the deformation coefficients. By default, the center positions of the bins are used as the anchor positions. To further improve performance, we design an improved version of the deformable subnetwork in which the anchor positions of the bins are generated by the network. Experimental results demonstrate that the deformable subnetwork effectively handles object deformation and helps different two-stage object detection models improve their detection performance.

(3) To address the problem that deep networks have difficulty modeling object deformation effectively, we propose the deformable template network, which uses a template to model an object within the deep network. The template represents an object as a set of parts arranged in a deformable way, which handles object deformation effectively. Compared with the deformable subnetwork, the deformable template network is more flexible and effective. It has two key modules: the template generating module and the part matching module. The template generating module generates a template for a target object, and the part matching module performs part alignment based on the generated template. The matching process takes both the detection score and the deformation cost of each object part into account. Experimental results on the PASCAL VOC and MS-COCO benchmarks demonstrate that the deformable template network achieves state-of-the-art detection performance.

(4) To address the problem that the loss function of the ROI processing module treats all candidate samples equally, we propose the interval normalization weighting strategy. We first take the number of false positive detection boxes and the number of true positive detection boxes on the MS-COCO validation set as evaluation metrics. We then analyze how different weighting strategies influence these metrics. Based on this analysis, we design an interval normalization weighting strategy consisting of two sub-strategies: the IoU interval score normalization strategy and the score interval IoU normalization strategy. These two sub-strategies apply to the negative samples and the positive samples, respectively. Experimental results on the MS-COCO benchmark demonstrate that the IoU interval score normalization strategy effectively decreases the number of false positive detection boxes, and the score interval IoU normalization strategy effectively increases the number of true positive detection boxes. Moreover, because the interval normalization weighting strategy is applied mainly in the training stage, it does not affect the detection efficiency of the base models.

In summary, we propose several object detection models that address problems in different phases of the region-based two-stage deep detection framework. Experimental analysis from multiple angles on PASCAL VOC and MS-COCO, the two mainstream object detection benchmarks, demonstrates the effectiveness of the proposed models. Object detection is a fundamental applied research topic in computer vision, and many computer vision tasks build on it. Our research improves object detection performance, which is of significance for the development of the computer vision field.
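For readers unfamiliar with the IoU (Intersection over Union) measure that the abstract uses for bounding box regression and for the interval weighting strategy, the following is a minimal sketch of the standard definition. The function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration; the dissertation does not specify a box format.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2
    (an assumed format, chosen for illustration).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # so that disjoint boxes give an empty intersection.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate zero-area inputs.
    return inter / union if union > 0 else 0.0
```

A proposal with higher IoU against a ground-truth box overlaps it more tightly; identical boxes give 1.0 and disjoint boxes give 0.0.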
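The abstract states that part matching weighs each part's detection score against its deformation cost. The sketch below illustrates that trade-off using the classic DPM-style formulation (part response minus a quadratic displacement penalty around an anchor). It is not the dissertation's part matching module: the function `match_part`, the quadratic cost, the scalar `deform_weight`, and the search `radius` are all assumptions introduced for illustration.

```python
def match_part(response, anchor, deform_weight, radius=2):
    """Place one part by trading detection score against displacement.

    response:      2-D list of part detection scores (rows x cols)
    anchor:        (row, col) default position of the part
    deform_weight: weight of the squared-displacement penalty (assumed quadratic)
    radius:        how far from the anchor to search (hypothetical parameter)

    Returns (best_score, (row, col)).
    """
    height, width = len(response), len(response[0])
    ar, ac = anchor
    best_score, best_pos = float("-inf"), (ar, ac)

    # Exhaustively search a small window around the anchor.
    for r in range(max(0, ar - radius), min(height, ar + radius + 1)):
        for c in range(max(0, ac - radius), min(width, ac + radius + 1)):
            cost = deform_weight * ((r - ar) ** 2 + (c - ac) ** 2)
            score = response[r][c] - cost
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos
```

With a small `deform_weight` the part moves to a nearby strong response; with a large one it stays at its anchor, which is the behaviour a deformation cost is meant to control.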
Keywords/Search Tags: object detection, back position interaction module, deformable subnetwork, deformable template network, interval normalization weighting strategy