
Research On The Theories And Methods Of Object Detection In Complex Scenes

Posted on: 2020-03-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Li
GTID: 1368330596475767
Subject: Signal and Information Processing
Abstract/Summary:
Object detection is a classic topic and a research hotspot in computer vision, widely used in image understanding, video surveillance, human-computer interaction, and other applications. In recent years, the massive growth of multimedia data and the adoption of deep learning have produced a variety of algorithms that have greatly advanced object detection. However, real scenes are complex and variable: objects may be small and dense, may vary greatly in scale, or may heavily occlude one another. These factors pose challenges for core issues of object detection such as local module construction, feature representation, and model architecture. Efficient and accurate object detection theories and methods are urgently needed for visual understanding and multimedia applications. This thesis therefore studies the theories and methods of object detection in complex scenes. To address these challenges, it first explores the generation of object proposals and end-to-end detection models, building on existing object detection models. It then focuses on issues such as object context information and the generalization of existing methods, and studies methods for object detection and identification. The specific research contents and contributions are as follows.

First, because generating object proposals from a single cue lacks effective guidance, this thesis studies the generation of object proposals based on multiple cues. Specifically, a feature similarity function is defined and image-level ranking is performed, and then the spatial position matching degree of each object proposal is calculated. The proposed method produces high-quality object proposals and improves the confidence scores of object boxes, effectively eliminating background interference.

Second, detection performance is insufficient in complex scenes where image resolution is high and objects are small and dense. This thesis constructs an expected mean square loss by characterizing the matching degree between anchor boxes and labeled ground-truth boxes. Furthermore, to exploit richer semantic information, a feature fusion strategy based on an attention mechanism is proposed, and a feature extraction method for small and dense objects is constructed. In addition, supervision of the number of objects is introduced, and a multi-task loss function based on counting regularization is proposed to further improve the detection of small and dense objects in complex scenes.

Third, to address the challenges caused by severe occlusion between objects in crowded scenes, this thesis studies object detection based on context information and proposes an end-to-end adaptive relational network for object detection. Specifically, a local structural module is constructed to model the stability of an individual object, and a global adaptive module is designed to describe the differences among multiple objects. The model can effectively detect heads in crowded scenes and can be extended to face detection.

Fourth, in cross-domain object detection, most existing deep convolutional networks tend to suffer from “catastrophic forgetting”. This thesis therefore studies object detection with transferable memory ability. An end-to-end memory neural network model is proposed by designing a ranking function and mining memory neurons. In both single-class and multi-class cross-domain settings, the proposed method is able to retain the original object information.

Fifth, for cross-camera scenes, this thesis studies object detection together with the identification of label-consistent objects. By modeling semantic region ensembles and a multi-region similarity measurement, it proposes a cross-scene object detection and identification method based on multi-region-set ensembles. The model considers the appearance of individual semantic regions and the relationships among them, which overcomes the shortcomings of horizontal stripe division. The proposed method can still detect label-consistent objects under changes in appearance, viewpoint, pose, and illumination.
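
The abstract does not specify how the matching degree between anchor boxes and ground-truth boxes is measured, or the exact form of the counting regularization. As a minimal sketch only, the Python below assumes intersection over union (IoU), a commonly used overlap measure, as the matching degree, and a squared count-error penalty as the regularizer. The names `iou`, `count_regularized_loss`, and `weight` are hypothetical and are not taken from the thesis.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumption: IoU stands in for the "matching degree" between an
    anchor box and a ground-truth box; the thesis may define it differently.
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_regularized_loss(detection_loss, predicted_count, true_count, weight=0.1):
    """Detection loss plus a squared penalty on the object-count error.

    Assumption: a hypothetical form of counting regularization; `weight`
    balances the two terms and is an illustrative value.
    """
    return detection_loss + weight * (predicted_count - true_count) ** 2


if __name__ == "__main__":
    anchor = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(iou(anchor, ground_truth))
    print(count_regularized_loss(1.0, predicted_count=12, true_count=10, weight=0.5))
```

In a sketch like this, anchors whose IoU with a ground-truth box is high would be treated as good matches, while the count term penalizes a detector whose number of predictions drifts from the annotated number of objects.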
Keywords/Search Tags: Object Detection, Deep Learning, Deep Convolution Network, Object Recognition, Feature Learning