
Research On The Theory And Methods Of Multi-object Detection

Posted on: 2022-03-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Chen
Full Text: PDF
GTID: 1488306728465094
Subject: Signal and Information Processing
Abstract/Summary:
As a fundamental task in computer vision, object detection has been widely applied in many areas, including object tracking, video analysis, autonomous driving, and human-computer interaction. In recent years, with the success of deep neural networks in computer vision, object detection has developed rapidly: many detection networks have been proposed, and accuracy has improved continuously. However, because objects vary widely in scale, viewing angle, and appearance, and because data in real application scenarios have complex characteristics, achieving high-accuracy object detection remains challenging. In-depth study of the object detection task and the design of efficient networks are therefore of great significance for the development of computer vision.

This dissertation studies the theory and methods of multi-object detection. It first focuses on network structure design for object detection, investigating network components for the three fundamental steps of the detection pipeline: feature extraction, classification, and localization. It then turns to the impact of complex data characteristics in application scenarios on the detection task, including an expandable number of categories and cross-domain detection with scarce target-domain data. Specifically, the research contents and innovations are as follows:

(1) For feature extraction, the dissertation studies feature enhancement based on the feature pyramid. To overcome the imbalanced information flow among pyramid scales and the low utilization of adjacent-scale information in existing methods, an adaptive multiscale information flow is proposed. An information fusion module first integrates features of adjacent scales efficiently; feature interaction is then extended from adjacent scales to all available scales, further enhancing the feature representation at every level of the pyramid.

(2) For classification, the dissertation concentrates on the influence of object proposal quality on classification results. Inaccurate information from low-quality object proposals limits classification accuracy, and bounding box regression can affect the ground-truth category. The classification step is analyzed probabilistically, and a novel classification pipeline is proposed in which both the proposal box and the refined bounding box are used to extract object information for category prediction. Accordingly, a multipath detection head is designed that improves prediction accuracy by using the boxes before and after bounding box regression simultaneously.

(3) For localization, the dissertation focuses on the problem of localization quality in existing recursive detection methods. Based on a statistical analysis of training examples, a balanced optimization strategy is introduced. Self-iterative box sampling increases the diversity of training examples, and an IoU-sensitive bounding-box regression module separately models bounding box regression from proposals with different localization accuracy. Together, these two components ensure effective bounding box regression from object proposals of various quality levels and consequently improve the results of recurrent object detection.

(4) To achieve category-extensible object detection, the dissertation considers using category-complementary multi-source data as training examples, a scheme that could eliminate the need for supplementary annotation. Cross-category-set object verification and mining is proposed to avoid the impact of missing annotations. A multi-classifier structure is designed so that false background labels are not used as supervision, and the outputs of the multiple classifiers are combined by a voting strategy for category prediction. Based on the multi-classifier structure, the same detection network is used to add pseudo-labels to each subset of the multi-source data, which increases the number of labels and further improves detection performance.

(5) To improve cross-domain object detection when target-domain data are scarce, the dissertation performs feature domain adaptation on image-level and object-level features simultaneously, taking into account the imbalance in quantity between source and target data. For image-level features, a sample-adaptive weighting strategy is proposed to optimize the domain classifier so that source data do not dominate the optimization procedure, and adversarial learning is adopted to achieve feature domain adaptation. For object-level features, annotations from both source and target data are used to obtain category prototype features, and domain adaptation is realized by narrowing the distance between the prototype features of different domains.
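
As an illustration of the IoU-sensitive idea in item (3), the sketch below computes the intersection over union (IoU) between a proposal and a ground-truth box and assigns the proposal to an IoU bin, so that each bin could be handled by its own regressor. This is only a minimal sketch under stated assumptions: the abstract does not describe the module's design, and the function names `iou`, `assign_iou_bin`, and `group_by_bin`, the `(x1, y1, x2, y2)` box format, and the bin edges `(0.5, 0.7)` are all hypothetical choices made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_iou_bin(proposal, gt_box, edges=(0.5, 0.7)):
    """Return the index of the IoU bin the proposal falls into.

    The bin edges are an assumption for this sketch: bin 0 is IoU < 0.5,
    bin 1 is 0.5 <= IoU < 0.7, and bin 2 is IoU >= 0.7.
    """
    overlap = iou(proposal, gt_box)
    return sum(overlap >= edge for edge in edges)


def group_by_bin(proposals, gt_box, edges=(0.5, 0.7)):
    """Group proposals by IoU bin so each group can get its own regressor."""
    groups = {index: [] for index in range(len(edges) + 1)}
    for proposal in proposals:
        groups[assign_iou_bin(proposal, gt_box, edges)].append(proposal)
    return groups
```

For example, `group_by_bin([(0, 0, 10, 10), (0, 0, 10, 6), (5, 5, 15, 15)], (0, 0, 10, 10))` places the three proposals in bins 2, 1, and 0 respectively. Routing by bin is one simple way to model regression separately for proposals of different localization accuracy; the dissertation's actual module may differ.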
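
The object-level adaptation in item (5) can be pictured with a small sketch: compute one prototype feature per category in each domain, then measure how far apart the source and target prototypes are. The abstract does not specify how prototypes or distances are defined, so this sketch assumes the prototype is the per-category mean of object features and the distance is the squared Euclidean distance averaged over categories present in both domains. The names `class_prototypes` and `prototype_alignment_loss` are hypothetical.

```python
def class_prototypes(features, labels):
    """Return {category: mean feature vector} for annotated object features.

    Assumption of this sketch: a prototype is the per-category mean.
    """
    sums = {}
    counts = {}
    for feature, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feature)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feature)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def prototype_alignment_loss(source_protos, target_protos):
    """Mean squared Euclidean distance between prototypes of shared categories.

    Categories seen in only one domain are skipped; with scarce target data
    this can be a common case.
    """
    shared = sorted(set(source_protos) & set(target_protos))
    if not shared:
        return 0.0
    total = 0.0
    for label in shared:
        total += sum(
            (s - t) ** 2
            for s, t in zip(source_protos[label], target_protos[label])
        )
    return total / len(shared)
```

In a training loop, a quantity like this would be minimized alongside the detection loss so that prototypes of the same category move closer across domains. How the dissertation weights or normalizes this term is not stated in the abstract.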
Keywords/Search Tags: Object Detection, Deep Learning, Convolutional Neural Network