
Deep Learning Based Perception Technology Of Orderless Sorting And Picking

Posted on: 2019-01-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Y Shen
GTID: 1368330611493004
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid development of the express delivery industry and the sharp increase in parcel volume, a large number of logistics workers are required to sort express parcels. Manual sorting can no longer keep pace with this growth, and disordered sorting systems have been widely adopted. Such systems have high sorting efficiency and are a key factor in replacing manual sorting and improving the efficiency of logistics operations. The core technology of disordered sorting is vision-based object detection, which includes estimating object type, shape and size, position, and pose. With the popularization of 3D sensors and the in-depth study of deep learning for 3D visual perception, applying deep learning on 3D data to target perception has become possible. This thesis focuses on the main visual perception tasks in disordered sorting scenarios, from the construction of diverse datasets for sorting scenes to 3D object detection based on point cloud data. The main research contents and innovations are as follows:

1. A method for constructing large-scale synthetic datasets of sorting scenes from sparse real data is proposed. Deep learning-based object detection relies on large-scale annotated data, and the annotated datasets must be not only large but also varied. Collecting and labeling data manually is costly, and labeling 3D data is especially expensive and complicated. As a result, annotated datasets rarely cover complex environments, samples of extreme scenes are scarce, and the trained models adapt poorly. This thesis therefore proposes a method based on parallel vision for constructing large-scale synthetic datasets of sorting scenes. On the one hand, a "small" real-labeled image dataset of the sorting scene is built from sparse real-labeled data. On the other hand, computer graphics, a graphics rendering engine, and related technologies are used to build artificial sorting scenes in a virtual system, simulate complex real sorting scenes, and generate a "big" image dataset of artificial sorting scenes. The resulting parallel image datasets are then used to train the visual model. This method can generate a large amount of high-quality annotated data with comprehensive information in a short time, which addresses the difficulty of data acquisition and the high cost of manual labeling. To reduce the feature difference between the real and synthetic datasets, we trained with two supervised domain adaptation strategies, which improved detection accuracy by more than 2%, indicating that the parallel datasets are effective.

2. A two-stage 3D object detection method based on frustum point cloud data is proposed. RGB images carry rich color and texture information, while sparse 3D point clouds carry the spatial structure of objects. To fuse the two modalities effectively and make full use of the color and texture information in the RGB image, we use the result of 2D object detection as prior knowledge and extract the object's frustum point cloud on that basis, which effectively reduces the search space of 3D object detection. We then use a point cloud deep learning network to perform object instance segmentation and regress the object's 3D bounding box. We propose a Dense regression method for the 3D bounding box based on offset residuals and compare it with a Global regression method. Experiments show that the Dense regression method is more accurate than the Global regression method. The proposed 3D object detection method can accurately estimate an object's spatial position, size, and rotation without a precise CAD model of the object. To the best of our knowledge, this is the first 3D object detection method with 9 degrees of freedom. The experimental results show that, with a confidence of 0.7, the mAP of the Dense regression method on the easy dataset is 76.66%, and the detection time is 167 ms.

3. An end-to-end 3D object detection method based on voxelized point clouds is proposed. Because the accuracy of two-stage object detection based on the frustum point cloud depends largely on the results of 2D object detection, and the model has to be trained in two stages, we further study end-to-end 3D object detection based on voxelized point clouds. The detection model consists mainly of a point cloud feature learning network and a 3D YOLO detection network. The point cloud feature learning network voxelizes the point cloud data and uses a voxel feature encoding model to extract and fuse local and global features of the points in each voxel, yielding voxel features for the entire point cloud. The 3D YOLO detection network is based on the one-stage 2D object detection model YOLO. It directly classifies the object category and regresses the 9-degree-of-freedom position, size, and pose of the object, and it is trained end to end together with the point cloud feature learning network. The experimental results show that the mAP of the voxel-based 3D object detection is 76.74% with a confidence of 0.7, which is comparable to the Dense regression model, while the detection speed is 1.5 times that of the Dense regression model.

In summary, this thesis focuses on 3D visual perception in the field of express parcel sorting. Using the basic theories of 2D and 3D deep learning, it addresses the collection and construction of real and virtual datasets and proposes two 3D object detection methods based on point clouds. These key technologies and research results have important theoretical significance and application value for the visual perception of disordered picking systems and the visual recognition of other flexible intelligent robots.
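
The frustum extraction step in the second contribution can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a pinhole camera model, points already expressed in the camera frame, and a 2D detection box in pixel coordinates. The function name frustum_points and the intrinsics fx, fy, cx, cy are hypothetical names chosen for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]         # (x, y, z) in the camera frame
Box2D = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels


def frustum_points(points: Sequence[Point], box: Box2D,
                   fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Keep the points whose pinhole projection lies inside the 2D box.

    Assumption: a simple pinhole model without lens distortion.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0.0:
            # Points behind (or on) the image plane cannot be projected.
            continue
        # Project the 3D point onto the image plane.
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


# Hypothetical data: four points and one 2D detection box.
cloud = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.0), (0.0, 0.0, -1.0), (0.05, -0.05, 2.0)]
detection = (300.0, 220.0, 340.0, 260.0)
subset = frustum_points(cloud, detection, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(len(subset), "of", len(cloud), "points fall inside the frustum")
```

Only the retained subset would be passed on to segmentation and box regression, which is how a 2D detection narrows the 3D search space.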
Keywords/Search Tags: Orderless Sorting and Picking, Deep Learning, 3D Object Detection, Pose Estimation, Parallel Vision