
Software Design Of Vision System For Intelligent Service Robot Based On Embedded GPU

Posted on: 2020-05-13    Degree: Master    Type: Thesis
Country: China    Candidate: R Ye    Full Text: PDF
GTID: 2428330596463711    Subject: Control engineering
Abstract/Summary:
With the rapid development of science and technology and the continuous improvement of living standards, the demand for service robots is growing day by day. Service robots are already used in many fields such as medical care, home cleaning, and entertainment and education, freeing people from tedious daily work. In this thesis, the vision system software of an intelligent service robot is designed and implemented on Nvidia's embedded GPU processor Jetson TX2 for home and office applications. The software provides five functions: object detection, face recognition, speech recognition, speech synthesis and binocular ranging, can be applied to various robot vision system designs, and has good application value. The main research contents of this thesis are as follows:

(1) System requirements analysis. Based on an analysis of the functional and performance requirements of the intelligent service robot vision system software, the Jetson TX2 processor with 256 GPU cores was selected as the software development platform. According to the platform characteristics of the Jetson TX2, the overall framework of the software was designed and divided into three modules: visual detection, human-computer interaction and visual ranging. The development plan was drawn up and the development environment was set up.

(2) Design of the visual detection module. The visual detection module detects nine types of objects common in home and office scenes, such as chairs, tables and laptops, as well as faces and human figures. After comparing the performance of three mainstream deep neural network object detection methods, Faster R-CNN, SSD and YOLOv2, on the embedded GPU platform, YOLOv2 was chosen as the detection method for its better overall performance. Nearly 22,000 images of the 11 target classes were collected as a training set, and nearly 1,000 images containing target objects were taken as a test set. Then, according to the detection requirements of this thesis, the input resolution, the input and output of each layer, the anchor parameters and the activation function were modified based on the YOLOv2-Tiny network, and the modified network was trained and tested; the average detection accuracy for each target class is about 80%. Finally, the network was optimized and accelerated with the TensorRT inference framework, and the optimization tests show that the network's running time is reduced by about 50% compared with the network before optimization.
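As an illustration of what the detection step looks like in code, the sketch below runs a trained YOLOv2-Tiny-style Darknet model on a single image with OpenCV's DNN module. This is a substitute for the thesis's TensorRT pipeline, not a reproduction of it; the file names, class list and confidence threshold are placeholders.

```python
# Minimal sketch: running a YOLOv2-Tiny-style Darknet detector with OpenCV's DNN module.
# The cfg/weights files, class names and threshold are placeholders, not the thesis's artifacts.
import cv2
import numpy as np

CLASSES = ["chair", "table", "laptop"]  # placeholder subset of the 11 classes

net = cv2.dnn.readNetFromDarknet("yolov2-tiny-custom.cfg", "yolov2-tiny-custom.weights")

img = cv2.imread("office.jpg")
h, w = img.shape[:2]

# YOLOv2-Tiny expects a square, normalized RGB input (416x416 is the usual default).
blob = cv2.dnn.blobFromImage(img, scalefactor=1 / 255.0, size=(416, 416),
                             swapRB=True, crop=False)
net.setInput(blob)
out = net.forward()  # one row per candidate box: [cx, cy, w, h, objectness, class scores...]

for detection in out:
    scores = detection[5:]
    class_id = int(np.argmax(scores))
    confidence = float(scores[class_id])
    if confidence > 0.5:
        # Coordinates are relative; scale back to pixels and convert center to top-left.
        cx, cy, bw, bh = detection[0] * w, detection[1] * h, detection[2] * w, detection[3] * h
        x, y = int(cx - bw / 2), int(cy - bh / 2)
        label = CLASSES[class_id] if class_id < len(CLASSES) else f"class {class_id}"
        print(f"{label}: {confidence:.2f} at x={x}, y={y}, w={int(bw)}, h={int(bh)}")
```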
(3) Design of the human-computer interaction module. The human-computer interaction module consists of three parts: face recognition, speech recognition and speech synthesis. The face recognition part uses a FaceNet neural network to extract an embedding of the detected face and compares the Euclidean distances between the face to be recognized and each face in a prebuilt face database to identify the person (a minimal sketch of this comparison is given after this summary). If a face from the database is recognized, the voice interaction function is started. The speech recognition part identifies the voice of the interacting user to obtain the instruction; it is implemented with the Keda Xunfei (iFlytek) online speech recognition interface, through which the captured audio is sent to the Keda Xunfei cloud server, and after recognition is completed on the server, the result is downloaded and parsed to obtain the user's instruction, which is then executed. The speech synthesis part is implemented with the Mimic speech synthesis library, whose main function is to synthesize feedback information into speech and play it back to the user.

(4) Design of the visual ranging module. In the visual ranging module, a fast ranging method for detected objects is proposed that differs from the traditional binocular ranging method. First, once the target object has been detected in the left image by the visual detection module, its imaging region in the right image is predicted. Then ORB features are extracted in the left and right regions and matched with the FLANN algorithm. After the matched feature pairs are obtained, the usable pairs are further screened to calculate the disparity and the actual distance from the target object to the camera.
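A minimal sketch of the ranging idea in (4), using OpenCV's ORB detector and FLANN matcher on a rectified stereo pair: the focal length, baseline and image files are assumed values, and the thesis's restriction of matching to the predicted object region is omitted here for brevity.

```python
# Minimal sketch: match ORB features between rectified left/right images with FLANN,
# screen matches with a ratio test, and convert the median disparity to depth.
# Camera parameters and file names are placeholder values.
import cv2
import numpy as np

FOCAL_PX = 700.0      # focal length in pixels (assumed, from calibration)
BASELINE_M = 0.06     # stereo baseline in meters (assumed)

left = cv2.imread("left_rectified.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rectified.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=500)
kp_l, des_l = orb.detectAndCompute(left, None)
kp_r, des_r = orb.detectAndCompute(right, None)

# FLANN with LSH index parameters, suitable for ORB's binary descriptors.
flann = cv2.FlannBasedMatcher(
    dict(algorithm=6, table_number=6, key_size=12, multi_probe_level=1), dict(checks=50))
matches = flann.knnMatch(des_l, des_r, k=2)

# Lowe's ratio test to screen out ambiguous feature pairs.
good = [m for m, n in (p for p in matches if len(p) == 2) if m.distance < 0.7 * n.distance]

# Disparity per matched pair (x_left - x_right); depth Z = f * B / disparity.
disparities = [kp_l[m.queryIdx].pt[0] - kp_r[m.trainIdx].pt[0] for m in good]
disparities = [d for d in disparities if d > 0]
if disparities:
    depth = FOCAL_PX * BASELINE_M / float(np.median(disparities))
    print(f"Estimated distance to object: {depth:.2f} m")
```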
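For the face-recognition step in module (3), the sketch below shows only the Euclidean-distance comparison over precomputed embeddings; the embeddings, names and threshold are made-up placeholders, and producing the embeddings with FaceNet is outside this sketch.

```python
# Minimal sketch: identify a face by comparing its embedding against a small database
# of precomputed FaceNet-style embeddings. All values here are placeholders.
import numpy as np

face_db = {                       # name -> 128-D embedding (normally produced by FaceNet)
    "alice": np.random.rand(128),
    "bob": np.random.rand(128),
}
query = np.random.rand(128)       # embedding of the face just detected

THRESHOLD = 1.1                   # assumed distance cutoff; needs tuning in practice

# Pick the database entry with the smallest Euclidean distance to the query.
name, dist = min(((n, float(np.linalg.norm(query - e))) for n, e in face_db.items()),
                 key=lambda t: t[1])
if dist < THRESHOLD:
    print(f"Recognized {name} (distance {dist:.2f}); start voice interaction")
else:
    print("Unknown face; skip voice interaction")
```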
Keywords/Search Tags: embedded GPU, service robot, object detection, face recognition, binocular ranging