With the continued development of 3D perception technology and the expansion of its application scenarios, point cloud object detection in indoor scenes has attracted considerable attention in smart homes, robot navigation, and assistive technology for people with disabilities. General voting-based indoor detection methods suffer from inaccurate voting points and loss of the object's geometric structure information. They also do not fully exploit the associations between indoor scenes and objects, and their fixed-size clustering adapts poorly to indoor objects of widely varying sizes. To address these problems, this work focuses on improving the accuracy of voting points, exploiting point cloud geometry and scene context information, and making point cloud clustering size-adaptive. The main work and contributions are as follows:

1. Hough voting is disturbed by cluttered background points, which makes the voting points inaccurate, and reliance on single-layer features leads to false and missed detections. To address this, a multi-layer feature fusion 3D object detection algorithm based on re-voting is proposed. The method uses an attention mechanism to re-vote the coarse voting points. It then extracts features from different levels of the point cloud based on the valid voting points and fuses them, so that deep multi-level features jointly reason about the detected objects. The re-voting strategy effectively improves voting accuracy, and combined with the multi-level fused features, detection accuracy improves by 4.2% over the mainstream Hough voting detection method.

2. To exploit the geometric structure features of point cloud objects and the context between scenes and objects, a 3D object detection method based on geometric backtracking and scene supervision is proposed. The method traces back from each voting point to the object region and learns the geometric features of the point cloud there to optimize the position and deep features of the original voting point. In addition, a new cross-attention module is designed to model the association between scenes and objects, capture global scene context information, and update the candidate cluster features. Compared with the Hough voting method, the accuracy of the proposed method improves by 9.4%, and detection of objects with strong semantic associations improves in particular.

3. Current fixed-size clustering methods cannot adapt to the variable sizes of indoor objects: they cover only part of the region when clustering large objects and are disturbed by redundant neighboring points when clustering small objects. To address this, a size-adaptive clustering 3D object detection algorithm is proposed. The method applies different voting strategies to foreground and background seed points, constraining the voting points of a foreground object to gather closely at the object's center. A clustering radius adapted to the object's size is then regressed along with the vote offset, so that the points of the target object are clustered completely. In addition, information from associated objects is learned to supplement and optimize the features of occluded objects. Compared with the baseline model, the method improves by 11.3 percentage points and significantly improves the quality of the detection boxes.

By integrating the proposed re-voting method, geometric feature fusion, scene-object modeling, and size-adaptive clustering strategy into the detection framework, the proposed method performs well on the large indoor point cloud datasets ScanNet and SUN RGB-D. It effectively improves object localization accuracy and the detection accuracy of objects with strong semantic associations, such as tables and chairs. In particular, it improves the detection accuracy of occluded objects and of small objects in dense arrangements.
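
The contrast between fixed-size and size-adaptive clustering in the third contribution can be illustrated with a small, self-contained sketch. This is not the proposed network. The helper `group_votes`, the toy vote coordinates, and the radii are all hypothetical, and the per-object radii that the method would regress are simply hard-coded here. The sketch only shows why a single radius tends to truncate a large object while absorbing a neighbor's votes around a small one.

```python
import math

def group_votes(votes, centers, radii):
    """For each cluster center, return indices of votes within that center's radius."""
    groups = []
    for center, radius in zip(centers, radii):
        members = [i for i, v in enumerate(votes) if math.dist(v, center) <= radius]
        groups.append(members)
    return groups

# Hypothetical toy scene: vote positions (x, y, z) in meters.
votes = [
    (0.0, 0.0, 0.0), (0.8, 0.0, 0.0), (-0.8, 0.0, 0.0),  # large object, spread-out votes
    (3.0, 0.0, 0.0), (3.1, 0.0, 0.0),                    # small object, tight votes
    (3.4, 0.0, 0.0),                                     # vote from a neighboring object
]
centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

# Fixed-size clustering: one radius for every cluster.
fixed = group_votes(votes, centers, radii=[0.5, 0.5])

# Size-adaptive clustering: one radius per cluster. These values are assumed
# stand-ins for radii that a network would predict per object.
adaptive = group_votes(votes, centers, radii=[1.0, 0.2])

print("fixed:   ", fixed)     # large object truncated, small object polluted
print("adaptive:", adaptive)  # each cluster covers only its own object's votes
```

With the fixed radius, the first cluster keeps only one of the large object's three votes and the second cluster absorbs the neighbor's vote. With per-cluster radii, both clusters contain exactly their own object's votes.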