| With the rapid development of its economy, China has placed higher demands on industrialisation, automation and intelligent manufacturing technology. Intelligent manufacturing aims to integrate industrialisation and automation technologies, and this integration is an important direction for the future development of manufacturing. Machine vision, combined with robotic arm gripping technology, is widely used in automated industrial production and brings considerable convenience to manufacturing. Robot vision recognition is a critical technology for mobile robot platforms and a benchmark for judging a robot's intelligence, automation and sophistication. Detecting objects and their 6D pose (3D position and orientation) is an important task in many robot applications, including object picking and factory part assembly. In complex environments, machine vision generally consists of two stages: the first is target detection, in which a target detection algorithm or segmentation network is applied to RGB images to obtain the category of the target; the second is 6D pose estimation, which estimates the 6D pose of the detected objects. This paper reviews the current state of research on target detection and 6D pose estimation, examines the YOLOv5 target detection algorithm in detail, and designs a dynamic detection system based on YOLOv5 for target detection and 6D pose estimation. The main contributions of this paper are as follows: A YOLOv5-CBE target detection network based on YOLOv5 is proposed to address YOLOv5's inaccurate localisation of target objects and local features. This paper proposes a Spatial and Coordinate Attention Module that combines spatial feature information with coordinate information. The attention module is added to the backbone network of YOLOv5, a weighted Bidirectional Feature Pyramid Network is
introduced in the neck detection layer, and the anchor parameters are optimised to improve the network's detection accuracy and localisation capability for sample models. The improved algorithm is trained on a self-built dataset and on the PASCAL VOC 2012 dataset, and experimental analysis shows that it outperforms the original YOLOv5 algorithm, localising target objects more accurately and robustly without reducing recognition accuracy. An end-to-end 6D pose estimation algorithm based on local feature representation is proposed to address the low accuracy of current 6D pose estimation algorithms on occluded objects. The 3D Harris key point extraction algorithm is used to extract key points with significant features from the point cloud model, and the features at the corresponding positions of the sample model are then annotated and encoded according to these key points. The local feature dataset is used to train both the YOLOv5 algorithm and the improved YOLOv5-CBE algorithm. Experimental results show that the centre point coordinate error of the local features is improved by 25%. Singular Value Decomposition (SVD) is then used to calculate the 6D pose of the sample model relative to the point cloud model. The proposed algorithm achieves accurate 6D pose estimation for single and multiple target objects in complex backgrounds, and it keeps both the 2D reprojection accuracy and the ADD accuracy above 95%, with strong robustness, even at the highest occlusion level of 70%. In addition, the algorithm achieves a frame rate of 35 FPS on an RTX 3050 graphics card, providing excellent real-time performance. |
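
The SVD step and the ADD metric named in the abstract can be illustrated with the standard Kabsch procedure for rigid alignment. This is a minimal sketch, not the authors' implementation: it assumes that matched 3D key points on the point cloud model and on the observed sample are already available (the abstract does not say how the detected local features are lifted to 3D), and the function names `estimate_rigid_pose` and `add_metric` are hypothetical.

```python
import numpy as np


def estimate_rigid_pose(model_pts, scene_pts):
    """Least-squares R, t with scene_pts ~= model_pts @ R.T + t (Kabsch, via SVD).

    Both inputs are (N, 3) arrays of corresponding points, N >= 3 and not collinear.
    """
    model_pts = np.asarray(model_pts, dtype=float)
    scene_pts = np.asarray(scene_pts, dtype=float)

    # Centre both point sets so that only the rotation remains to be solved.
    model_centroid = model_pts.mean(axis=0)
    scene_centroid = scene_pts.mean(axis=0)

    # 3x3 cross-covariance matrix of the centred correspondences.
    H = (model_pts - model_centroid).T @ (scene_pts - scene_centroid)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis when needed so that det(R) = +1 (a rotation, not a reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) > 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # The translation maps the rotated model centroid onto the scene centroid.
    t = scene_centroid - R @ model_centroid
    return R, t


def add_metric(model_pts, R_est, t_est, R_gt, t_gt):
    """ADD: mean distance between model points under the estimated and true poses."""
    model_pts = np.asarray(model_pts, dtype=float)
    est = model_pts @ R_est.T + t_est
    gt = model_pts @ R_gt.T + t_gt
    return float(np.linalg.norm(est - gt, axis=1).mean())
```

A pose is commonly counted as correct when its ADD value falls below a fraction of the model diameter (10% is a widespread convention), and ADD accuracy is then the share of correct poses; the abstract does not state which threshold was used.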