Research on the Application of the RGB-D Camera in Indoor and Outdoor Scene Perception

Posted on: 2021-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: D M Sun
GTID: 2428330632450638
Subject: Optical Engineering
Abstract/Summary:
The RGB-D camera is an important class of sensor. Compared with a traditional color camera, it provides a visible color image and a dense depth image at the same time, supplying rich data for advanced vision algorithms and attracting much attention in the field of intelligent perception. In the past few years, the emergence of consumer RGB-D cameras such as the Microsoft Kinect and ASUS Xtion has greatly promoted the development of robotics, augmented reality, virtual reality and 3D reconstruction. Meanwhile, deep learning, represented by convolutional neural networks, is developing rapidly and performs outstandingly in tasks such as object detection and image segmentation. The combination of the RGB-D camera and deep learning has now become a research hotspot. Scene perception refers to the use of sensors to identify and locate important objects in the current scene. This thesis focuses on the application of the RGB-D camera in indoor and outdoor scene perception, and explores how to combine the rich information provided by the RGB-D camera with image segmentation technology.

A typical outdoor application of the RGB-D camera is autonomous driving, where identifying and locating important objects such as vehicles, pedestrians and curbs from color and depth images can effectively improve safety. In this thesis, a multimodal sensor composed of an RGB-D camera and a polarization camera is designed for autonomous driving. We train a semantic segmentation network that improves the segmentation accuracy for important object categories. Based on the fusion of semantic segmentation results and depth data, a curb warning algorithm is proposed that detects the distance between the road edge and the vehicle in time through point cloud projection and conversion to a bird's-eye view. Based on polarization measurement and cross-modal data fusion, an efficient water surface detection method is proposed.

Indoors, the RGB-D camera is mostly used for 3D reconstruction. However, traditional 3D reconstruction methods pay more attention to the geometry of the reconstructed map and lack semantic information, which limits the application of indoor 3D reconstruction to human-computer interaction. Based on frame-to-model correspondence, this thesis proposes iFusion, which incrementally integrates the results of single-frame image segmentation into the map and enriches its semantic information, making more intelligent human-computer interaction possible. The experimental results show that, through the correspondence among frames provided by sequential RGB-D images, the map gradually integrates the results of multi-frame instance segmentation and achieves higher accuracy than single-frame instance segmentation.
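
The idea of incrementally integrating per-frame segmentation results into a map can be illustrated with a minimal sketch. This is not the iFusion method itself, whose details the abstract does not give; it is a simplified majority vote, assuming the frame-to-model correspondence has already been resolved into integer map element ids. The class `LabelMap` and its methods are hypothetical names introduced only for this illustration.

```python
from collections import Counter, defaultdict


class LabelMap:
    """Hypothetical map that accumulates label votes per map element."""

    def __init__(self):
        # map element id -> Counter of accumulated label votes
        self.votes = defaultdict(Counter)

    def integrate(self, frame_labels, weight=1.0):
        """Add one frame's predictions.

        frame_labels maps a map element id to the label predicted for it
        in this frame (assumed to come from frame-to-model correspondence).
        """
        for element_id, label in frame_labels.items():
            self.votes[element_id][label] += weight

    def label_of(self, element_id):
        """Return the label with the most accumulated votes, or None."""
        counts = self.votes.get(element_id)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


# Three frames observe the same two map elements; the second frame
# mislabels element 1, which the accumulated votes then correct.
frames = [
    {0: "chair", 1: "table"},
    {0: "chair", 1: "chair"},
    {0: "chair", 1: "table"},
]

label_map = LabelMap()
for frame_labels in frames:
    label_map.integrate(frame_labels)

fused = {element_id: label_map.label_of(element_id) for element_id in (0, 1)}
```

In this toy sequence the single-frame error on element 1 is outvoted by the other two frames, which is the intuition behind the reported gain of multi-frame fusion over single-frame segmentation. A real system would likely weight votes by prediction confidence or viewing distance; the `weight` parameter is a placeholder for that.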
Keywords/Search Tags: RGB-D Camera, 3D Reconstruction, Scene Perception, Semantic Segmentation, Instance Segmentation