| With the rapid development of robot technology, robotic applications such as planetary exploration, search and rescue, and military operations have become feasible. In outdoor unstructured environments, a mobile robot typically has to face diverse terrain types. Certain flat, non-slippery terrain types allow the robot to traverse them at relatively high speed, but other terrain surfaces are loose, bumpy or muddy, and the robot must traverse them slowly and carefully. The terrain surface itself can be a hazard to an outdoor mobile robot and is referred to as a non-geometric hazard. The robot should recognize these non-geometric terrain characteristics within a reasonable time window and adjust its path, gait and motion planning strategies to cope with the terrain. Compared with other sensing modalities, vision most closely resembles the way in which humans perceive the environment and provides richer terrain information. Visual terrain classification is therefore of great theoretical and practical significance. In this thesis, we use the bag of visual words (BOVW) framework, hierarchical coding vectors (HCV) and deep filter banks (DFB) to develop highly discriminative features for visual terrain classification, and achieve the following innovative results: (1) We optimize the model and design an optimum pipeline within the existing framework. The BOVW framework has emerged as a promising and effective paradigm for visual terrain classification. It is used to generate a compact semantic representation from low-level descriptors. We optimize the BOVW framework and propose an optimum pipeline to produce an effective and efficient visual terrain classification system. We provide a comprehensive study of all steps in the BOVW framework and of different fusion methods for visual terrain classification. Then, multiple approaches in each step and their effects are explored on the Terrain8 dataset. Finally, the feature pre-processing technique, improved BOVW 
framework and fusion method are used to construct an optimum pipeline for visual terrain classification. The hybrid representation (HR) developed by the optimum pipeline performs effectively and rapidly on the terrain dataset, outperforming current methods. On the Terrain8 dataset, HR achieves average classification accuracies of 88.7% and 87.7% with SIFT and DSIFT, respectively. Furthermore, it is robust to diverse noise and illumination changes. (2) Inspired by the success of deep neural networks (DNNs) in computer vision applications and of encoding methods for terrain classification, we propose HCV, a novel representation based on hierarchical coding structures, for visual terrain classification. We stack multiple BOVW coding layers and one Fisher coding layer to develop the hierarchical feature learning structure. In the BOVW coding layers, we extract local descriptors from a terrain image at densely sampled interest points and encode them using soft assignment (SA). The Fisher coding layer encodes the resulting semi-local features with Fisher vectors (FV) and aggregates them into a final global representation. The graphical semantic information is refined by feeding the output of one layer into the next computation layer. By using a hierarchical coding structure, HCV describes terrain images through a high-level representation with richer semantic information. Experimental results on the 21-Class Land Use (LU) and RSSCN7 image databases indicate the effectiveness of the proposed HCV. Combined with the standard FV, our method (FV+HCV) achieves superior performance compared to state-of-the-art methods on the two databases, obtaining average classification accuracies of 91.5% on the LU database and 86.4% on the RSSCN7 database. HCV also achieves good results on the Terrain8 dataset, demonstrating its terrain classification performance. (3) We propose a novel hybrid architecture, DFB, combining a multi-column stacked denoising sparse autoencoder (SDSAE) and 
FV to automatically learn representative and discriminative features in a hierarchical manner for visual terrain classification. SDSAE kernels describe local patches, and a robust global feature of the terrain image is built through the FV pooling layer. Model parameters are critical to the performance of visual terrain classification. As the model grows, the exponentially increasing parameter optimization space hinders better performance. Unlike previous hand-crafted features, we use machine learning mechanisms to optimize our proposed feature extractor so that it can learn more suitable internal features from the terrain image, boosting the final performance. Our approach achieves superior performance compared to state-of-the-art methods, obtaining average classification accuracies of 92.7%, 90.4% and 89.8% on the LU, RSSCN7 and Terrain8 datasets, respectively. (4) Building on the DFB algorithm, we design a framework for processing terrain videos and carry out field experiments on three mobile robot platforms. Based on our previous research, we make a preliminary exploration of the practicality of our algorithm, applying the presented visual features to the three platforms. The visual terrain classification algorithm does not affect the normal task of the robot. The experimental terrain includes four types: floor, snow, asphalt and grass. DFB achieves superior performance on the three platforms, obtaining average classification accuracies of 97.12% on the quadruped mobile robot, 99.38% on the bionic arc-leg mobile robot and 95.92% on the HUSKY unmanned ground robot. Furthermore, it maintains visual discrimination at high speed (94.19%) and in low-light environments (93.81%). The experimental results show that DFB delivers excellent and stable visual terrain classification performance, with strong robustness and good practical 
potential. Overall, our study focuses on visual terrain classification methods for mobile robots. We have completed the model optimization, designed two new methods, and carried out field application experiments. In this thesis, we establish the terrain dataset Terrain8 and design the optimum pipeline for visual terrain classification based on the results of the evaluation experiments. Then, following an idea similar to that of deep learning networks, we design HCV to develop a high-level representation for visual terrain classification. Next, we propose the hybrid architecture DFB, which employs an unsupervised SDSAE and an FV pooling layer to automatically learn an abstract semantic representation for tackling the visual terrain classification task. Unlike methods based on handcrafted feature representations, DFB uses machine-learning mechanisms to optimize itself and learn more suitable internal features from terrain data. Finally, we carry out field application experiments on three mobile robot platforms and four terrain types, demonstrating the excellent performance of our algorithm. Our research results can provide important information for motion control and autonomous navigation of mobile robots in complex terrain environments. |
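
The soft assignment (SA) coding used in the BOVW coding layers can be illustrated with a minimal sketch. It is not the thesis implementation: the Gaussian-style kernel, the smoothing parameter `beta`, the average pooling, and the function names `soft_assign` and `encode_image` are all assumptions chosen for illustration, and random vectors stand in for real local descriptors and a learned codebook.

```python
import numpy as np

def soft_assign(descriptors, codebook, beta=1.0):
    """Soft-assignment coding: weight every visual word for each descriptor.

    descriptors: (n, d) array of local descriptors (e.g. SIFT-like vectors).
    codebook:    (k, d) array of visual words (e.g. k-means centers).
    beta:        hypothetical smoothing parameter; large values approach
                 hard (nearest-word) assignment.
    """
    # Squared Euclidean distance between each descriptor and each visual word.
    diff = descriptors[:, None, :] - codebook[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    # Shift by the row minimum before exponentiating for numerical stability.
    logits = -beta * (sq_dist - sq_dist.min(axis=1, keepdims=True))
    weights = np.exp(logits)
    # Normalize so each descriptor's code sums to one.
    return weights / weights.sum(axis=1, keepdims=True)

def encode_image(descriptors, codebook, beta=1.0):
    """Average-pool the soft codes into one L2-normalized image-level vector."""
    codes = soft_assign(descriptors, codebook, beta)
    pooled = codes.mean(axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled

# Synthetic stand-ins: 8 visual words and 50 descriptors of dimension 16.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 16))
descriptors = rng.normal(size=(50, 16))

codes = soft_assign(descriptors, codebook, beta=0.5)
feature = encode_image(descriptors, codebook, beta=0.5)
```

Here `codes` holds one soft code per descriptor and `feature` is a single vector per image. In a hierarchical scheme such as the one described for HCV, codes pooled over local regions would be passed on to the next coding layer rather than pooled over the whole image at once.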