
Research On The Spatial-temporal Coding Of LiDAR Point Clouds Based On Neural Networks

Posted on: 2022-12-13    Degree: Doctor    Type: Dissertation
Country: China    Candidate: L L Zhao
GTID: 1488306764958929    Subject: Automation Technology
Abstract/Summary:
In recent years, light detection and ranging (LiDAR) sensors have become indispensable in a plethora of applications such as autonomous driving, mobile robots, and drones, to name a few. LiDAR provides an accurate three-dimensional (3D) digital representation of the surrounding scene. The acquired LiDAR point clouds consist of a huge number of points, each represented by geometry information (i.e., 3D coordinates) and other attributes (e.g., reflectance), and are characterized by sparsity, lack of ordering, and a large data volume. For example, the well-known Velodyne HDL-64E is a 64-channel laser array that constantly delivers a high data rate of up to 2.2 million points per second. Transmitting and storing such a large amount of data places a huge burden on limited network bandwidth and storage equipment, which greatly hinders the development of LiDAR-related applications and the practical deployment of point cloud technologies. Therefore, LiDAR point cloud compression (PCC) has become an urgent problem in both academia and industry, with important research significance and application value.

Research on LiDAR point cloud spatial-temporal coding mainly covers two aspects: intra-frame prediction and inter-frame prediction. The former performs prediction based on the correlation of geometric and attribute information within the current frame, thereby removing the spatial redundancy of point clouds. The latter performs prediction based on the correlation of geometric and attribute information between consecutive frames, thereby removing the temporal redundancy of point clouds. This dissertation studies three key technologies, namely intra-frame prediction, inter-frame prediction, and floating-point coding, and obtains some preliminary results. These results have been applied to SLAM localization in autonomous driving, effectively reducing transmission and storage costs while improving localization performance. Specifically, the main research contents can be summarized as follows:

1. A LiDAR point cloud spatial coding structure based on semantic prior representation. Building a LiDAR point cloud coding structure suitable for practical applications faces three challenges: existing tree-based representation methods are relatively inefficient for sparse LiDAR point clouds; existing algorithms ignore performance on machine perception tasks; and it is unclear how to integrate a neural network model into the PCC structure to improve coding efficiency without introducing too much complexity. To address these challenges, a new framework for spatial coding of LiDAR point clouds based on semantic prior representation is proposed. First, a dimensionality-reduction operation (the 2D range-image representation of point clouds) is used to reduce computational complexity while removing some representation redundancy. A semantic prior representation (SPR) is then proposed: through instance segmentation of 2D range images, semantic labels are obtained as prior information to guide intra-frame prediction, which further improves its efficiency. Overall encoding efficiency is improved by encoding semantic prior representations rather than all 3D coordinates of the point cloud. On the other hand, a machine perception mode is designed: based on the semantic information, points that are unfavorable for SLAM localization at the decoding end are removed, thereby improving localization performance. Meanwhile, an SPR encoder is proposed that retains the high data precision required by practical applications. The proposed structure significantly improves the encoding performance of LiDAR point clouds and the performance of SLAM localization in practical applications.
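As an illustration of the 2D representation that this framework builds on, the following is a minimal sketch of a spherical projection from a LiDAR point cloud to a range image. It is not taken from the dissertation; the image resolution (64 x 2048) and the vertical field of view are assumptions modeled on a Velodyne HDL-64E-like sensor.

```python
import numpy as np

def point_cloud_to_range_image(points, H=64, W=2048,
                               fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) LiDAR point cloud onto an H x W range image.

    Each pixel stores the range (depth) of the point that falls into it;
    empty pixels stay at 0. This is the standard spherical projection
    used to obtain a dense 2D representation of a spinning LiDAR scan.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1) + 1e-8        # range per point

    yaw = np.arctan2(y, x)                                   # azimuth in [-pi, pi]
    pitch = np.arcsin(z / r)                                 # elevation angle

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    # Normalize the angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * W                        # column index
    v = (1.0 - (pitch - fov_down) / fov) * H                 # row index

    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    range_image = np.zeros((H, W), dtype=np.float32)
    # Write farther points first so that closer points win pixel conflicts.
    order = np.argsort(r)[::-1]
    range_image[v[order], u[order]] = r[order]
    return range_image
```

The inverse projection (recovering 3D coordinates from pixel position and stored range) follows from the same angular mapping, which is what makes the 2D representation lossless up to the chosen image resolution and value precision.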
2. A LiDAR point cloud frame prediction algorithm based on 2D flow estimation. Due to the sparse and disordered nature of LiDAR point clouds, motion estimation and motion compensation in 3D space remain a difficult problem, and there is still no mature solution in industry. To solve this problem, a frame prediction model based on 2D flow estimation is proposed. For motion estimation, a flow estimator is designed for sparse range-image sequences. The intermediate frame is then synthesized from the bidirectional range images and the flow information, after which the predicted point cloud frame is obtained by inverse projection.

3. A LiDAR point cloud frame prediction algorithm based on 3D spatial-temporal convolutions. The 2D flow estimation-based method still has several problems: (1) it ignores the feature distribution of objects in the scene, which makes feature extraction with square convolutions inefficient; (2) it depends heavily on the quality of flow estimation, which leads to poor robustness; (3) the temporal relationship in LiDAR range-image sequences is weaker than that in videos, which leads to inefficient flow estimation; (4) the computational complexity of flow estimation is high, which makes it difficult to integrate the frame prediction model into the point cloud coding framework. To address these issues, a lightweight LiDAR point cloud frame prediction model based on 3D spatial-temporal convolutions is proposed. First, for challenge (1), an asymmetric residual block is introduced to better extract the spatial features of objects in real scenes. Second, for challenges (2), (3), and (4), 3D spatial-temporal convolutions are introduced to learn the temporal features of LiDAR range-image sequences and further fuse the spatial-temporal features, which improves the performance and robustness of the algorithm.
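To make the third contribution concrete, the sketch below (assuming PyTorch) shows one way an asymmetric residual block and 3D spatial-temporal convolutions could be combined to predict the next range image from a short sequence of past frames. The kernel sizes, channel counts, and layer arrangement are illustrative assumptions, not the dissertation's actual architecture.

```python
import torch
import torch.nn as nn

class AsymmetricResidualBlock(nn.Module):
    """Residual block with 1x3 / 3x1 kernels.

    Asymmetric kernels match the anisotropic footprint of objects in LiDAR
    range images, which tend to be wide in azimuth and short in elevation.
    """
    def __init__(self, channels):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1)),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0)),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.branch(x))


class SpatioTemporalPredictor(nn.Module):
    """Predicts the next range image from T past frames.

    Input:  (B, 1, T, H, W) range-image sequence (T must equal the init T).
    Output: (B, 1, H, W)    predicted next frame.
    """
    def __init__(self, hidden=32, T=4):
        super().__init__()
        self.spatial = AsymmetricResidualBlock(1)
        # 3D convolutions fuse features along the temporal axis; the second
        # layer uses a kernel spanning all T frames to collapse that axis.
        self.temporal = nn.Sequential(
            nn.Conv3d(1, hidden, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            nn.ReLU(inplace=True),
            nn.Conv3d(hidden, hidden, kernel_size=(T, 3, 3), padding=(0, 1, 1)),
            nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(hidden, 1, kernel_size=3, padding=1)

    def forward(self, seq):                        # seq: (B, 1, T, H, W)
        B, C, T, H, W = seq.shape
        frames = seq.transpose(1, 2).reshape(B * T, C, H, W)
        frames = self.spatial(frames)              # per-frame spatial features
        seq = frames.reshape(B, T, C, H, W).transpose(1, 2)
        feat = self.temporal(seq).squeeze(2)       # temporal axis collapsed to 1
        return self.head(feat)
```

The design choice illustrated here is that no explicit flow field is ever estimated: temporal structure is absorbed directly by the 3D convolutions, which is what removes the dependence on flow quality.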
4. A LiDAR point cloud spatial-temporal coding method based on bi-directional frame prediction. Existing methods usually consider only the spatial redundancy and information-entropy redundancy of LiDAR point clouds, while ignoring the temporal redundancy. Meanwhile, the real-time requirements of practical LiDAR application scenarios make coding considerably more difficult: the coding algorithm must run relatively fast while ensuring high data precision. Furthermore, most 2D-domain PCC methods use existing image/video encoders to code range images, ignoring the differences in feature distribution and pixel precision between the 2D representation of LiDAR point clouds and color images. To solve these problems, the proposed lightweight LiDAR point cloud frame prediction model is first improved further to realize inter-frame prediction. Second, a range-adaptive floating-point encoder is developed, which performs adaptive encoding based on the distribution of the input floating-point numbers. It realizes fast and accurate encoding of the data derived from range images and jointly removes the spatial redundancy and information-entropy redundancy of the data. The complete PCC system combines the previous research and achieves real-time removal of the temporal, spatial, and information-entropy redundancy of LiDAR point clouds, while further improving the localization performance of SLAM at the decoding end.
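As a rough illustration of range-adaptive floating-point coding, the sketch below quantizes prediction residuals to an assumed precision, packs the quantized indices with the smallest integer width that their observed value range requires, and hands the result to a generic entropy coder (zlib). The precision value, the byte-width selection rule, and the use of zlib are all assumptions made for illustration; this is not the encoder proposed in the dissertation.

```python
import struct
import zlib
import numpy as np

def encode_residuals(residuals, precision=0.001):
    """Illustrative range-adaptive encoding of floating-point residuals.

    The residuals (e.g., range-image values minus their inter-frame
    prediction) are quantized with a step that guarantees `precision`,
    the integer indices are packed with the minimum byte width their
    observed range requires, and the payload is entropy coded.
    """
    q = np.round(residuals / precision).astype(np.int64)
    lo, hi = int(q.min()), int(q.max())
    span = hi - lo
    # Choose the smallest unsigned integer width that covers the span.
    if span < 2 ** 8:
        dtype, code = np.uint8, 0
    elif span < 2 ** 16:
        dtype, code = np.uint16, 1
    else:
        dtype, code = np.uint32, 2
    payload = (q - lo).astype(dtype).tobytes()
    header = struct.pack("<Bqd", code, lo, precision)   # width code, offset, step
    return header + zlib.compress(payload)

def decode_residuals(blob, shape):
    """Inverse of encode_residuals; `shape` is the residual array shape."""
    code, lo, precision = struct.unpack("<Bqd", blob[:17])
    dtype = [np.uint8, np.uint16, np.uint32][code]
    q = np.frombuffer(zlib.decompress(blob[17:]), dtype=dtype).astype(np.int64) + lo
    return (q.reshape(shape) * precision).astype(np.float32)
```

For example, residuals of a well-predicted frame concentrate near zero, so the span is small, a one-byte width is selected, and the subsequent entropy coder sees a highly compressible stream; poorly predicted regions automatically fall back to wider integer types without losing the guaranteed precision.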
Keywords/Search Tags: Light Detection and Ranging, LiDAR, Point Cloud Compression, Spatial-Temporal Coding, Point Cloud Frame Prediction