
3D Scene Representation and Object Detection Based on Visual Information and Their Application to Visual Aids for the Visually Impaired

Posted on: 2021-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
GTID: 2404330632950611
Subject: Engineering

Abstract/Summary:
The development of information technology has enabled people to access ever richer information. For some people with disabilities, however, obtaining even the information needed for daily life is extremely difficult. For the visually impaired, perceiving the surrounding environment is a basic requirement for independent living. The development of computer vision offers new ideas for assisting visually impaired people, yet compared with the rapid progress in autonomous driving, research on assistance for the visually impaired remains relatively scarce. In view of this, this thesis studies environmental perception and obstacle avoidance schemes for visual aids for the visually impaired. To facilitate independent travel, we focus on 3D scene representation and obstacle detection under unrestricted outdoor conditions.

In obstacle detection, 2D object detection is already mature, but it provides only the position and category of an object's bounding box on the 2D image, which is insufficient for actual obstacle avoidance. Moreover, 3D object detection based on RGB images is not yet reliable enough for real applications, and both the real-time performance and the accuracy of existing algorithms need improvement. Therefore, this thesis proposes YOLOv3D, a lightweight 3D obstacle detection framework based on YOLOv3, which simultaneously predicts the object category, the 2D bounding box, and the object's orientation and size, and then uses the projection relationship to compute the object's position in the real world, thus realizing 3D object detection from a monocular image.

The stixel algorithm used in autonomous driving obtains disparity information from a binocular camera and produces an intermediate representation of the 3D scene through foreground and background segmentation; it is computationally inexpensive and robust, but it contains only depth information and no semantic information. Semantic segmentation provides a unified way to recognize multiple object classes in a scene, and network structures such as ERFNet can run on small processors while meeting real-time requirements. This thesis combines semantic segmentation with the stixel algorithm to obtain a semantic stixel representation of 3D scenes under unrestricted conditions.

In summary, the main contributions of this work are as follows. First, a lightweight 3D object detection framework is proposed, which predicts an object's category, orientation, displacement, size, and other information from monocular RGB images. Second, a semantic stixel algorithm is designed in combination with semantic segmentation to achieve 3D scene representation under unrestricted conditions. In addition, a wearable visual aid system is designed, which conveys the 3D and class information of obstacles in the scene to the user in an effective way.
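As an illustration of the projection step mentioned above (recovering an object's real-world position from a monocular detection), the following is a minimal sketch of standard pinhole-camera reasoning under the assumption that the network predicts the object's metric height. It is not the thesis's exact formulation; the function name, camera intrinsics, and the assumed pedestrian height are illustrative only.

    import numpy as np

    def estimate_object_position(box_2d, obj_height_m, fx, fy, cx, cy):
        """Rough monocular localization (illustrative sketch, not the thesis method):
        recover an object's 3D position from its 2D bounding box and a predicted
        metric height, using the pinhole projection relationship.

        box_2d       -- (x_min, y_min, x_max, y_max) in pixels
        obj_height_m -- predicted 3D height of the object in metres
        fx, fy       -- focal lengths in pixels
        cx, cy       -- principal point in pixels
        """
        x_min, y_min, x_max, y_max = box_2d
        box_height_px = max(y_max - y_min, 1e-6)

        # Similar triangles: pixel height h = f * H / Z  =>  Z = f * H / h
        depth = fy * obj_height_m / box_height_px

        # Back-project the box centre to camera coordinates at that depth
        u = 0.5 * (x_min + x_max)
        v = 0.5 * (y_min + y_max)
        x_cam = (u - cx) * depth / fx
        y_cam = (v - cy) * depth / fy

        return np.array([x_cam, y_cam, depth])

    # Example: a pedestrian assumed 1.7 m tall, seen as a 340-px-high box
    position = estimate_object_position((600, 200, 700, 540), 1.7,
                                        fx=720.0, fy=720.0, cx=640.0, cy=360.0)
    print(position)  # approximate (X, Y, Z) in metres, camera frame

With these example numbers the function returns roughly (0.05, 0.05, 3.6), i.e. an obstacle about 3.6 m ahead of the camera and slightly to the right, which is the kind of information a wearable aid can convey to the user.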
Keywords/Search Tags: Visual Aids, Unrestricted Conditions, 3D Object Detection, Semantic Stixel