| Unlike conventional perception algorithms built on the closed-world assumption, unknown obstacle perception requires a model to detect or segment never-before-seen obstacles after learning from a limited set of object categories. In real-world scenes, the types and sizes of obstacles, as well as the application contexts of the algorithms, vary greatly. Overcoming the limitations imposed by obstacle types and application contexts, and improving the generalization ability of the algorithms, are important challenges for unknown obstacle perception in autonomous mobile robots. In this paper, we focus on two visual elements in images, free space and obstacles, and investigate three aspects: free space segmentation, obstacle perception, and the confusion between free space and obstacles. The main contributions are summarized as follows: (1) To enhance the applicability of monocular cameras to free space perception, we propose a free space perception solution based on pseudo multi-modal data fusion. It generates pseudo depth information through monocular depth estimation and then performs multi-modal fusion for free space segmentation. Specifically, for depth estimation, we address two weaknesses of existing algorithms: insufficient accuracy at depth edges and errors in determining the farthest region. We propose a boundary-induced and scene-aggregated network that captures the overall depth variation using multi-scale global context and refines scene edge information through bottom-up boundary fusion, resulting in high-quality depth maps. For free space segmentation, we propose a free space perception method based on trusted multi-modal fusion. It incorporates the characteristics of the different modalities into the network architecture design and model training to avoid introducing erroneous representations during multi-modal fusion. We not only demonstrate the effectiveness of the free space perception scheme based on pseudo multi-modal data fusion through comprehensive
experiments, but also verify the effectiveness of the proposed monocular depth estimation algorithm and free space segmentation algorithm through independent experiments. (2) To improve the generalization ability of the algorithm with respect to the size and semantics of obstacles, we propose a scale-generalized unknown obstacle segmentation method. The algorithm first applies a multi-layer scene object proposal algorithm, which improves its ability to capture small obstacles. Then, we propose a random forest model based on dissimilarity learning that independently learns the visual differences between obstacles and the ground and between obstacles and the background, thereby detecting unknown obstacles that differ significantly in appearance from the ground. Finally, the model employs a bounding box voting mechanism to reduce reliance on the classification ability of the model. In addition, to further achieve semantic generalization to unknown objects, we propose an unknown object perception method. The algorithm introduces generalized objectness scores and learning methods that learn unknown categories from known categories. Furthermore, we propose non-interest region energy suppression to further distinguish non-objects from "objects". Finally, based on graph partitioning, we select the optimal bounding box and adaptively detect unknown objects when the number of objects is unknown, improving the accuracy of unknown object detection. We demonstrate through experiments in street and water surface scenes that the proposed unknown obstacle segmentation method can perceive unknown obstacle categories of different sizes. Additionally, using the only available dataset for testing unknown object detection accuracy, we further verify that the proposed unknown object detection algorithm achieves much higher accuracy than existing methods. (3) To enhance the robustness of obstacle perception against complex texture interference, we propose an interference-resistant obstacle perception algorithm to
achieve obstacle perception on reflective and complex-textured ground. It calibrates the ground-camera relationship offline and extracts parallax-based geometric features from multiple frames to distinguish three elements: obstacles with height in three-dimensional space, the ground plane, and reflected objects that appear below the ground. Additionally, we design a weight-decay voting mechanism to prevent the model from focusing excessively on parts of an obstacle rather than the whole object. To thoroughly validate the effectiveness of this algorithm, we propose a novel dataset of obstacles on reflective surfaces for evaluation and analysis. Experimental results demonstrate that the proposed algorithm is on par with most deep learning-based algorithms and even performs better in most challenging scenes. |