Object Detection In Indoor And Outdoor Environments

Posted on: 2020-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: Q H Luo
GTID: 2428330572469956
Subject: Control Science and Engineering
Abstract/Summary:
Mobile robots need to perceive and understand their surroundings in order to operate autonomously and accomplish their tasks. Object detection, which aims to provide the locations and categories of objects in the real world, is an active research topic in environment perception. Nevertheless, existing methods still fall short of the efficiency and accuracy that practical applications demand. This thesis studies object detection in both indoor and outdoor environments and comprises three modules: indoor object detection from RGB-D images, outdoor object detection from RGB images, and domain adaptation for object detection. The main contributions are as follows:

1. An end-to-end, fast one-stage object detector is proposed, which hierarchically fuses multi-modal features. It concatenates complementary appearance and geometric features extracted from RGB and depth images, and uses multi-scale feature layers to make the final prediction. In the prediction stage, a set of 3D anchor boxes of varying sizes is attached to every location of the prediction layers, and their initial poses are determined by the physical information from the depth image. The detector is then trained to regress the poses of the anchor boxes and classify them, and the final detections are selected by non-maximum suppression. During training, positive samples are identified with the aid of the 2D ground truth, which helps the model converge better. Experimental results on the public SUN RGB-D dataset suggest that the proposed approach outperforms the state-of-the-art method by 10.2% in mAP while running 109× faster.

2. A real-time two-stage object detector is proposed, which adopts Region of Interest Align to improve accuracy and employs network compression and acceleration to improve efficiency. A Region Proposal Network generates region proposals, and Region of Interest Pooling is replaced with Region of Interest Align to extract the features of the proposals, which are then fed to fully-connected layers for classification and location regression. The use of Region of Interest Align improves the accuracy of car and pedestrian detection by 12.2% and 6.8%, respectively. To improve real-time performance, a channel pruning method is employed to compress the feature extractor, 1×1 convolutional kernels are used to halve the channels of the feature layer output by the feature extractor, and the channels of the fully-connected layers are quartered. The frame rate of car and pedestrian detection increases from 14 fps to 27.7 fps, at the cost of accuracy declines of 2.1% and 1.5%, respectively.

3. A joint feature-level and pixel-level domain adaptation algorithm for the two-stage object detector is proposed. On the feature level, a domain classifier based on an adversarial loss is appended to the feature extractor. On the pixel level, multiple feature layers from the feature extractor are connected to a generator, which feeds the fused features in the form of images to a subsequent domain classifier. Used alone, either component improves object detection performance through adversarial training: the former achieves a high recall rate, while the latter achieves a high precision rate. Applying both components together combines their complementary advantages. After domain adaptation, the accuracy of the detector on the public Virtual KITTI dataset increases from 51.2% to 69.5%, and the accuracy on the public Foggy Cityscapes dataset increases from 20.2% to 30.3%, an accuracy gain that exceeds that of the state-of-the-art method by 1.3%.
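
The first contribution selects its final detections by non-maximum suppression. As an illustration of that step only, the following is a minimal sketch of greedy non-maximum suppression. It assumes axis-aligned 2D boxes given as (x1, y1, x2, y2) and an IoU threshold of 0.5; the thesis works with 3D boxes, and the function names, box format, and threshold here are hypothetical choices for the example, not taken from the thesis.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2), axis-aligned, 2D.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 2D boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept box does not
    exceed iou_threshold (a hypothetical default for this example).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The second box overlaps the first heavily and is suppressed.
    print(nms(boxes, scores))
```

In a 3D setting the same greedy loop applies; only the overlap function would need to be replaced by a 3D IoU.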
Keywords/Search Tags: Multi-modal Feature, Object Detection, Network Compression, Network Acceleration, Domain Adaptation, Adversarial Training