Visual Place Recognition With Deep Convolutional Neural Networks For Mobile Robots

Posted on: 2018-04-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Hou
Full Text: PDF
GTID: 1368330623950364
Subject: Electronic Science and Technology
Abstract/Summary:
Visual place recognition is a core technology that allows mobile robots (such as driverless vehicles and intelligent service robots) to perform vision-based autonomous navigation and localization. With the growing demand for long-term autonomy of mobile robots and the spread of cost-effective visual sensors, vision-based navigation has received increasing attention in mobile robot research. As a result, visual place recognition has become a cutting-edge technology and a hot research topic in related areas, including robotics and computer vision. However, the robustness and real-time performance of existing visual place recognition technologies still fall far short of what practical applications require. Motivated by these challenges, this dissertation comprehensively investigates ConvNet (short for "Convolutional Neural Network")-based visual place recognition for three typical application scenarios that arise in vision-based autonomous navigation and localization of mobile robots: moving along a fixed trajectory in a known environment, moving along a free trajectory in a known environment, and moving along a free trajectory in an unknown environment. Following the typical framework of a visual place recognition system, three key techniques, i.e., visual place description, visual place remembering and visual place matching, are systematically studied. Furthermore, by combining the advances made on these key techniques, three novel visual place recognition systems are designed and built for the typical application scenarios. The main contributions of the dissertation are summarized as follows.

In the research on the key techniques for visual place description, a novel algorithm using a ConvNet global feature is first proposed for the application scenario in which a robot moves along a fixed trajectory. Its key idea is to introduce ConvNet features into the traditional algorithm framework that uses global features for visual place description, in order to leverage the outstanding invariance of ConvNet features and so improve environmental robustness. In addition, the algorithm runs considerably faster, with an average running time of 19 ms per frame. Comparison experiments have verified the superiority of ConvNet features over traditional hand-crafted features for visual place description. Second, existing visual place description algorithms using ConvNet landmark features are evaluated for the application scenario in which a robot moves along a free trajectory. The results reveal weaknesses in their two key steps, i.e., landmark detection and ConvNet landmark feature extraction. Third, to overcome these weaknesses, ConvNet landmark features are improved and a novel visual place description algorithm using the improved features is proposed. It consists of the following four major steps:
(a) binarized normed gradients features are used to quickly detect initial landmarks;
(b) scene semantic information is used to select the final landmarks with high discrimination capacity;
(c) a multiple region of interest (MRoI) pooling technique is designed to exploit multi-level and multi-resolution information from multiple convolutional layers;
(d) the features obtained from the MRoI pooling layers are further fused in order to improve the discrimination capacity of the final ConvNet features.
Comparison experiments have demonstrated the superiority of this algorithm over the state-of-the-art: its computational efficiency is high enough for real-time applications, with an average running time of 53 ms per frame, and its ConvNet features are discriminative enough to achieve state-of-the-art recognition accuracy.

In the research on the key techniques for visual place remembering and matching, a novel algorithm combining tree indexing and Hash codes of ConvNet landmark features is first proposed for the application scenario in which a robot moves in a known environment. To speed up the matching, a tree indexing-based fast nearest neighbour search is applied. At the same time, a coarse-to-fine matching scheme is designed to relieve the perceptual aliasing issue of tree indexing and thereby improve the matching accuracy. Second, a novel visual place remembering and matching algorithm combining inverted indexing of Bag of ConvNet Features (BoCNF) and Hash codes is proposed for the application scenario in which a robot moves in an unknown environment. Inspired by BoW-based large-scale fast retrieval, a BoCNF model is built and an inverted index is then constructed on the BoCNF model. In this way, the environment map constructed to remember new places can be extended online. In addition, the computation overhead grows very slowly as the number of places stored in the environment map increases. Furthermore, the post-filtering technique of Hamming Embedding is introduced to enhance the discrimination capacity. Experimental results on benchmark datasets have demonstrated the superiority of the two algorithms over the state-of-the-art.

In the research on system design and experimental verification for visual place recognition applications, three systems are designed and built for the typical scenarios by combining the above advances in the key techniques. First, a robust visual place recognition system (named "GCNF") is designed and built for the application scenario in which a robot moves along a fixed trajectory in a known environment. It uses the ConvNet global feature-based place descriptor to improve place recognition accuracy. Furthermore, real-time performance is easy to achieve in the place matching step even with linear nearest neighbour search, because this global descriptor is just one vector. Experimental results on benchmark datasets have demonstrated the superiority of this system. In addition, its practicality was verified in a 2015 robot field trial in Canada.

Second, a fast visual place recognition system (named "TreeHashMCNF") is designed and built for the application scenario in which a robot moves along a free trajectory in a known environment. It uses the improved ConvNet landmark features to boost robustness and real-time performance in the description process. Meanwhile, it leverages the coarse-to-fine matching scheme, which combines tree indexing and Hash codes, to speed up the matching process. Experimental results have demonstrated that this system is comparable with the state-of-the-art in terms of recognition accuracy and achieves a 116 times speed-up on a dataset of 20688 images, with an average running time as low as 88 ms per query.

Third, a scalable visual place recognition system (named "BoMCNF") is designed and built for the application scenario in which a robot moves along a free trajectory in an unknown environment. It improves on the TreeHashMCNF system in the visual place remembering and matching process. Specifically, it uses the improved ConvNet landmark features to train a novel model, which is named "BoMCNF", and then constructs a BoMCNF-based inverted index to replace tree indexing. Experimental results have demonstrated that this system is comparable with the state-of-the-art in terms of recognition accuracy and achieves a 33 times speed-up on a dataset of 4276 images, with an average running time as low as 84 ms per query. In addition, the computation overhead grows very slowly as the number of places stored in the environment map increases. For example, when the number of places stored in the environment map rises from 853 to 4276, the computation overhead increases by only 2 ms per frame. Therefore, the system is able to achieve real-time performance in large-scale applications. Most importantly, this system inherits the scalability of BoMCNF-based inverted indexing. As a result, it is applicable to simultaneous localization and mapping (SLAM).
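
The combination of inverted indexing with Hamming-based post-filtering described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes that each landmark feature has already been quantized upstream into a visual word id and a short binary signature (stored here as a Python int), and the class name, method names and threshold are hypothetical. No term weighting is applied; places are ranked by raw vote count.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: visual word id -> list of (place id, binary signature)."""

    def __init__(self, hamming_threshold=2):
        # hamming_threshold is an assumed parameter, not a value from the source.
        self.postings = defaultdict(list)
        self.num_places = 0
        self.hamming_threshold = hamming_threshold

    def add_place(self, features):
        """Remember a new place online; features is a list of (word, signature)."""
        place_id = self.num_places
        for word, signature in features:
            self.postings[word].append((place_id, signature))
        self.num_places += 1
        return place_id

    def query(self, features):
        """Return (place id, votes) pairs, best match first."""
        votes = defaultdict(int)
        for word, signature in features:
            # Only places sharing this visual word are visited.
            for place_id, stored in self.postings.get(word, ()):
                # Post-filter: keep the match only if the signatures are close.
                distance = bin(signature ^ stored).count("1")
                if distance <= self.hamming_threshold:
                    votes[place_id] += 1
        return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


# Usage with made-up word ids and 4-bit signatures.
index = InvertedIndex(hamming_threshold=1)
first = index.add_place([(3, 0b1010), (7, 0b0110)])
second = index.add_place([(3, 0b0101), (9, 0b1111)])
ranked = index.query([(3, 0b1011), (7, 0b0110)])
print(ranked)
```

In this sketch a query only touches the posting lists of its own visual words, which is one reason the cost of an inverted index can grow slowly as more places are added, and new places are appended without rebuilding the index.
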
Keywords/Search Tags:Visual Place Recognition, Deep Convolutional Neural Network, ConvNet, Mobile Robot, Visual Navigation, Image Matching