
Indoor Robot Scene Perception Based On Cross-modal Matching

Posted on: 2021-04-28    Degree: Master    Type: Thesis
Country: China    Candidate: M H Xu    Full Text: PDF
GTID: 2428330611483406    Subject: Power system and its automation
Abstract/Summary:
In recent years, with the rapid development of the Internet, people's living standards have gradually improved, mobile robots have gradually entered everyday life, and multimedia data has grown explosively. As an important part of daily life, an indoor robot can realize navigation, positioning and other functions more effectively when it has good scene perception. Indoor robot scene perception is acquired by collecting and processing sensor data, including text, images, point clouds, sound and more. Because the amount of data is huge, single-modal data can no longer meet people's daily needs. However, traditional subspace learning methods are inefficient for cross-modal matching on such data sets and consume a large amount of storage space. As research has progressed, hashing algorithms proposed by scholars have improved matching efficiency in other cross-modal fields and greatly reduced the required storage space.

Indoor robots mainly rely on lidar and cameras to obtain the positions of surrounding objects. Under insufficient light or at night, the images collected by the camera are partially degraded, while the point cloud data collected by lidar is not affected. In view of the different environments to which lidar and cameras are suited, this paper proposes a point cloud-image cross-modal matching algorithm based on deep hashing (PDCMH), which retrieves from a database the images most similar to a query point cloud. The algorithm is trained end to end: point cloud and image feature extraction and hash code learning are integrated into one network framework to obtain a common Hamming space, and the Hamming distance is then computed with an exclusive-or (XOR) operation to find the images most similar to the query point cloud.

On the public KITTI dataset, the deep hashing algorithm is compared with the subspace learning method. Finally, experiments compare the effects of the CNN-F, VGG-19 and ResNet-101 image networks on the deep hashing algorithm. Experiments show that the deep hashing algorithm proposed in this paper can accomplish the point cloud-image cross-modal matching task and outperforms the subspace learning method in accuracy. In the study of how the deep network affects the performance of the deep hashing algorithm, the matching results of the CNN-F network are better than those of VGG-19 and ResNet-101. On this basis, future improvements can be made by reducing the quantization error, expanding the range of data sets and adopting better deep networks, so that indoor mobile robots can achieve better scene perception.
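To make the retrieval step concrete, the sketch below shows how binary hash codes in a shared Hamming space can be compared with an XOR-based Hamming distance to rank database images against a query point cloud. This is a minimal illustration under assumed details, not the PDCMH implementation described in the thesis: the 64-bit code length, the random codes and the helper names (hamming_distance, retrieve_most_similar) are hypothetical choices for demonstration.

```python
import numpy as np

def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
    """Hamming distance between two binary hash codes via an XOR operation."""
    return int(np.count_nonzero(np.bitwise_xor(code_a, code_b)))

def retrieve_most_similar(query_code: np.ndarray, db_codes: np.ndarray) -> int:
    """Index of the database image whose hash code is closest to the
    query point cloud's hash code in the common Hamming space."""
    distances = [hamming_distance(query_code, c) for c in db_codes]
    return int(np.argmin(distances))

# Toy example: 64-bit codes, standing in for the outputs of the point cloud
# and image hashing branches of a cross-modal network (random here).
rng = np.random.default_rng(0)
query_code = rng.integers(0, 2, size=64, dtype=np.uint8)        # from a point cloud
db_codes = rng.integers(0, 2, size=(1000, 64), dtype=np.uint8)  # from database images

print("Most similar image index:", retrieve_most_similar(query_code, db_codes))
```

Because the codes are binary, the XOR-and-count comparison is far cheaper than distance computations in a real-valued subspace, which is the efficiency and storage advantage the abstract attributes to hashing-based matching.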
Keywords/Search Tags: cross-modal matching, hashing algorithm, indoor robot, point cloud, image