
3D Semantic Mapping Method Of Search And Rescue Robot Based On Deep Learning

Posted on: 2021-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Li
GTID: 2428330614470441
Subject: Biomedical engineering
Abstract/Summary:
Search and rescue robots are now widely used for reconnaissance and rescue tasks in war zones, natural disasters, and NBCR incidents. Unlike ordinary structured indoor environments, these complex indoor environments contain assorted obstacles and unstructured terrain that pose a major threat to the autonomous movement of search and rescue robots. If the robot cannot accurately recognize the elements of its surroundings and obtain their position and contour information, it may make incorrect motion control decisions and put the platform in danger. Vision, as the sensing modality closest to human perception of the environment, provides rich information such as texture, arrangement, and distance. This thesis uses visual methods to address semantic element recognition and position and contour perception in complex environments, and thereby achieves 3D semantic mapping in unfamiliar, complex environments, which has both theoretical significance and practical value. It uses unsupervised feature encoding, deep convolutional neural network feature optimization, and gridded image processing to solve the problem of unstructured terrain recognition. The search and rescue robot's 3D semantic mapping capability is improved by combining the optimized image segmentation algorithm with an indoor 3D reconstruction algorithm. The following innovative results have been achieved.

(1) A novel terrain image feature extraction and optimization process is established, and an indoor scene segmentation algorithm framework covering more elements is designed. A terrain image processing pipeline of optimal-layer feature extraction and unsupervised feature coding is built by merging the optimal-layer features of a deep convolutional neural network (CNN) with Fisher Vector (FV) feature coding; the result is called the Aggregated Deep Fisher Feature (ADFF). This scheme effectively bridges the semantic gap between localized, low-semantic-level visual features and globalized, high-semantic-level visual features, and establishes a link between the statistical distribution of underlying features and higher-level semantics. An extensive comparison of convolutional neural networks, unsupervised feature coding methods, and scale integration schemes shows that the algorithm performs best with ResNet50 as the reference network, FV as the feature encoding scheme, and fusion of three scales: 128×128, 224×224, and 256×256. It achieves a classification accuracy of 98.81% on the public dataset UCM and 93.25% on Terrain8, a terrain dataset built in-house at the Academy of Military Sciences. Finally, the ADFF-based image segmentation algorithm framework is produced by merging image segmentation algorithms for indoor surroundings with a grid-based terrain image segmentation algorithm. This framework improves both the accuracy of traditional indoor image segmentation and the ability to segment complex indoor terrain.

(2) The end-to-end terrain image processing algorithm is improved, and an end-to-end indoor scene segmentation algorithm framework with higher time efficiency is designed. By adopting a fully end-to-end retrained CNN framework together with layer-by-layer parameter forgetting and retraining, the method effectively trains CNNs with hundreds of layers on only about 2,000 images, striking a balance between scene specificity and limited data. An extensive comparison of benchmark CNN frameworks such as ResNet and GoogLeNet demonstrates the applicability of the optimal activation model to terrain recognition and summarizes the optimal activation layer positions for these common frameworks. On the public datasets UCM, WHU-RS19, and SIRI-WHU, classification accuracies of 99.0%, 98.8%, and 99.4%, respectively, are obtained at a processing rate of 20 Hz. A classification accuracy of 97.08% is subsequently obtained on Terrain8. This performance is better than that of the known terrain recognition algorithms compared. Finally, a BAM-based image segmentation algorithm framework is produced by combining the image segmentation algorithm for indoor surroundings with the grid-based terrain image segmentation algorithm, which further improves the recognition of indoor scenes.

(3) An RGBD-based indoor 3D semantic mapping algorithm framework is constructed and its applicability in real scenarios is verified. The implementation of existing 3D reconstruction algorithms is described in detail, and the existing monocular, binocular, and RGBD solutions are compared and analyzed in depth. Given the requirements of the indoor environment, ElasticFusion is selected as the 3D reconstruction solution because it imposes the lowest computational load and yields the most accurate dense maps. The ElasticFusion-based indoor 3D reconstruction algorithm and the BAM-based indoor image segmentation algorithm are then fused to complete the 3D semantic mapping task. Comparison tests are performed on two public datasets, NYUv2 and ICL-NUIM. The results show that the proposed algorithm further improves scene recognition compared with the previous scheme. Finally, the software, hardware, and algorithms are integrated, and a field application performance test is carried out in a laboratory environment. The experimental results show that the proposed 3D semantic reconstruction algorithm can recognize semantic elements, wounded people, and surrounding objects in indoor environments, and that overall 3D reconstruction performance is greatly improved compared with the traditional algorithm.

In summary, this thesis studies visual 3D semantic mapping methods, focusing on the insufficient scene recognition ability of search and rescue robots in complex indoor terrain, and produces a usable 3D semantic mapping system for complex indoor environments.
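
The Fisher Vector coding step mentioned in result (1) can be illustrated with a minimal sketch. This is not the thesis's ADFF implementation. It assumes the local descriptors (for example, the spatial positions of one CNN layer's feature map) have already been extracted, and that a diagonal-covariance Gaussian mixture model has already been fitted to them. The function name `fisher_vector` and all parameter names are hypothetical, and the power and L2 normalization follow the commonly used "improved" Fisher Vector recipe rather than anything stated in the abstract.

```python
import numpy as np


def fisher_vector(x, weights, means, sigmas):
    """Encode local descriptors as a Fisher Vector (illustrative sketch).

    x       : (T, D) local descriptors, e.g. CNN feature-map positions (assumed input)
    weights : (K,)   GMM mixture weights, summing to 1
    means   : (K, D) GMM component means
    sigmas  : (K, D) GMM per-dimension standard deviations (diagonal covariance)
    Returns a vector of length 2 * K * D.
    """
    T = x.shape[0]
    # Deviation of every descriptor from every component, scaled by sigma: (T, K, D)
    diff = (x[:, None, :] - means[None, :, :]) / sigmas[None, :, :]

    # Log of the weighted Gaussian densities, up to a constant that
    # cancels when the posteriors are normalised: (T, K)
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(diff ** 2, axis=2)
             - np.sum(np.log(sigmas), axis=1)[None, :])

    # Posterior probability of each component for each descriptor (stable softmax)
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # Gradients of the log-likelihood with respect to means and standard deviations
    g_mu = np.einsum("tk,tkd->kd", gamma, diff) / (T * np.sqrt(weights)[:, None])
    g_sigma = (np.einsum("tk,tkd->kd", gamma, diff ** 2 - 1.0)
               / (T * np.sqrt(2.0 * weights)[:, None]))

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])

    # Power normalisation followed by L2 normalisation
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv
```

Under these assumptions, one way to approximate the multi-scale fusion described in the abstract would be to encode the descriptors from each input scale separately and concatenate the resulting vectors. The abstract does not specify how the fusion is actually done.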
Keywords/Search Tags:3D semantic mapping, Aggregated Deep Fisher Feature, Best encoding model, Search and rescue robot