
Object Detection And Pose Estimation Based On Subcategory-aware Convolutional Neural Networks

Posted on: 2019-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: T F Li
GTID: 2428330590492234
Subject: Control Engineering
Abstract/Summary:
Object detection is one of the most significant research subjects in computer vision, with applications in intelligent security, smart homes, intelligent transportation, medical imaging, and other areas. With the rapid development of deep learning in recent years, object detection has achieved a major breakthrough and now outperforms traditional methods on many public datasets. Pose estimation is another important basic research topic, especially head pose estimation, which is widely used in areas such as 3D face modeling, human-computer interaction, and virtual reality. Both tasks also play key roles in autonomous vehicles, for example in car and pedestrian detection and in future person localization.

Traditional object detection methods use sliding windows to select candidate regions. This search is not targeted and generates many redundant windows, which leads to high time complexity. In addition, manually designed features are not robust enough to changes in the appearance of the target. Both problems can be addressed with convolutional neural networks (CNNs). However, CNN-based methods still do not perform well enough on popular benchmarks. One of the main reasons is that detection is inaccurate under difficult conditions such as truncation and occlusion. Meanwhile, accurate region-based methods are generally not fast. In this thesis, an accurate and cost-efficient subcategory-aware deep CNN for object detection is proposed. Our method is less time-consuming while maintaining competitive detection performance, and it provides a better trade-off between accuracy and speed on several datasets.

Head pose estimation infers the orientation of a head in an image, i.e., it recovers three pose angles: pitch, yaw, and roll. In this thesis we present a fast and unified framework for simultaneous face detection and 3D pose estimation of unconstrained faces using deep convolutional neural networks. Face detection is implemented with a region-based framework, as in previous work such as Faster RCNN, combined with subcategory information. Subcategory information also helps to boost the performance of head pose estimation. Specifically, pose estimation is modeled as a combined classification and regression problem: continuous head poses are first divided into several discrete clusters (each cluster is a pose class, i.e., a subcategory), and the pose within each pose class is then adjusted by a pose-class-specific regressor to obtain more accurate results. All classifiers and regressors for the two tasks are trained and tested simultaneously in one unified network. Our approach runs at 10 fps, which, to our knowledge, is the fastest among recently proposed methods. Extensive evaluations on several challenging benchmarks demonstrate the effectiveness of the proposed method, with competitive results.
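The classify-then-regress formulation of head pose described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the pose-class centers, the nearest-center assignment rule, the additive offset, and the function names assign_pose_class and decode_pose are all assumptions made for illustration. The sketch only shows how a discrete pose class and a class-specific correction could combine into one (pitch, yaw, roll) estimate.

```python
import numpy as np

def assign_pose_class(pose, centers):
    """Return the index of the nearest pose-class center (Euclidean distance, degrees)."""
    dists = np.linalg.norm(centers - pose, axis=1)
    return int(np.argmin(dists))

def decode_pose(class_scores, offsets, centers):
    """Pick the highest-scoring pose class, then refine its center
    with the offset regressed for that class."""
    k = int(np.argmax(class_scores))
    return k, centers[k] + offsets[k]

# Hypothetical pose-class centers as (pitch, yaw, roll) in degrees.
# The abstract does not say how the clusters are obtained.
centers = np.array([
    [0.0, -60.0, 0.0],   # head turned one way
    [0.0,   0.0, 0.0],   # frontal
    [0.0,  60.0, 0.0],   # head turned the other way
])

# Training side: a ground-truth pose yields a class label and a regression target.
gt_pose = np.array([5.0, 48.0, -3.0])
k_gt = assign_pose_class(gt_pose, centers)
target_offset = gt_pose - centers[k_gt]

# Inference side: made-up network outputs for one detected face.
class_scores = np.array([0.05, 0.15, 0.80])
offsets = np.array([
    [0.0,   0.0,  0.0],
    [0.0,   0.0,  0.0],
    [4.0, -11.0, -2.5],
])
k_pred, pose_pred = decode_pose(class_scores, offsets, centers)
print(k_gt, target_offset, k_pred, pose_pred)
```

In this sketch only the regressor of the selected class contributes to the final estimate, which is one plausible reading of "pose-class-specific regressor"; other decoding schemes, such as a score-weighted average over classes, would also be consistent with the abstract.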
Keywords/Search Tags:Subcategory, Object Detection, Head Pose Estimation, Convolutional Neural Networks