
Indoor Robot Scene Perception Based On Cross-modal Matching

Posted on: 2021-04-28    Degree: Master    Type: Thesis
Country: China    Candidate: M H Xu    Full Text: PDF
GTID: 2428330611483406    Subject: Power system and its automation
Abstract/Summary:
In recent years, with the rapid development of the Internet, people's living standards have gradually improved, mobile robots have gradually entered everyday life, and multimedia data has grown explosively. As an important part of daily life, an indoor robot can realize navigation, positioning and other functions more effectively when it has good scene perception. Indoor robot scene perception is acquired by collecting and processing sensor data, including text, images, point clouds, sound and more. Because the amount of data is huge, single-modal data can no longer meet people's daily needs. However, traditional subspace learning methods are inefficient for cross-modal matching on such data sets and consume a large amount of storage space. As research has progressed, hashing algorithms proposed by scholars have improved matching efficiency in other cross-modal fields and greatly reduced the required storage space.

Indoor robots mainly rely on lidar and cameras to obtain the positions of surrounding objects. Under insufficient light or at night, the images collected by the camera are partially degraded, while the point cloud data collected by lidar is not affected. In view of the different environments to which lidar and cameras are suited, this paper proposes a point cloud-image cross-modal matching algorithm based on deep hashing (PDCMH), which retrieves from a database the images most similar to a query point cloud. The algorithm is trained end to end: point cloud and image feature extraction and hash code learning are integrated into one network framework to obtain a common Hamming space, and the Hamming distance is then computed with an exclusive-or (XOR) operation to find the images most similar to the query point cloud.

On the public KITTI dataset, the deep hashing algorithm is compared with the subspace learning method. Finally, experiments compare the effects of the CNN-F, VGG-19 and ResNet-101 image networks on the deep hashing algorithm. Experiments show that the deep hashing algorithm proposed in this paper can accomplish the point cloud-image cross-modal matching task and outperforms the subspace learning method in accuracy. In the study of how the deep network affects the performance of the deep hashing algorithm, the matching results of the CNN-F network are better than those of VGG-19 and ResNet-101. On this basis, future improvements can be made by reducing the quantization error, expanding the range of data sets and adopting better deep networks, so that indoor mobile robots can achieve better scene perception.
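To make the retrieval step concrete, the sketch below shows how binary hash codes in a shared Hamming space can be compared with an XOR-based Hamming distance to rank database images against a query point cloud. This is a minimal illustration under assumed details, not the PDCMH implementation described in the thesis: the 64-bit code length, the random codes and the helper names (hamming_distance, retrieve_most_similar) are hypothetical choices for demonstration.

```python
import numpy as np

def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
    """Hamming distance between two binary hash codes via an XOR operation."""
    return int(np.count_nonzero(np.bitwise_xor(code_a, code_b)))

def retrieve_most_similar(query_code: np.ndarray, db_codes: np.ndarray) -> int:
    """Index of the database image whose hash code is closest to the
    query point cloud's hash code in the common Hamming space."""
    distances = [hamming_distance(query_code, c) for c in db_codes]
    return int(np.argmin(distances))

# Toy example: 64-bit codes, standing in for the outputs of the point cloud
# and image hashing branches of a cross-modal network (random here).
rng = np.random.default_rng(0)
query_code = rng.integers(0, 2, size=64, dtype=np.uint8)        # from a point cloud
db_codes = rng.integers(0, 2, size=(1000, 64), dtype=np.uint8)  # from database images

print("Most similar image index:", retrieve_most_similar(query_code, db_codes))
```

Because the codes are binary, the XOR-and-count comparison is far cheaper than distance computations in a real-valued subspace, which is the efficiency and storage advantage the abstract attributes to hashing-based matching.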
Keywords/Search Tags: cross-modal matching, hashing algorithm, indoor robot, point cloud, image