With the rapid acceleration of industrialization, robotics is playing an increasingly important role in many areas of social production and daily life. To better meet society's need for intelligent information processing, combining artificial intelligence with robotics has become a prevailing research trend. Grasping with a robotic arm is the primary means by which robots accomplish many practical tasks, so endowing robots with more autonomous detection and grasping capabilities can significantly enhance their intelligence and flexibility and enable them to handle more complex tasks. To achieve accurate robotic-arm grasping, this paper proposes a method and model that integrate deep learning, computer vision, and robotic-arm grasping techniques. The main work is as follows:

(1) To detect and recognize target objects in 2D planar images, a YOLOv5-based object detection model is proposed to reduce the influence of background information and irrelevant objects on grasp analysis and to narrow the search range for grasp position and posture. The Swin Transformer module and the Convolutional Block Attention Module (CBAM) are incorporated into the YOLOv5 base structure, making the model focus on key areas of the image and enhancing its global feature-modeling capability. After training and testing on the Cornell Grasping Dataset, supplemented with multi-object samples and data augmentation, the proposed method is compared with an unoptimized YOLOv5 model. The results show that the proposed method effectively improves detection accuracy and lays the foundation for a planar grasp-posture detection model.

(2) To accurately estimate the grasp position and pose of a target object, a grasp-pose detection method based on a Gaussian-Wasserstein distance loss is proposed, which resolves the boundary discontinuity caused by the periodicity of the angle parameter of the oriented grasp box and more accurately describes the overlap between oriented grasp boxes. First, the object detection network is extended with a rotation-angle pose-processing module. Then, the shortcomings of traditional regression- and classification-based angle loss designs are analyzed, and the Gaussian-Wasserstein distance loss is introduced to improve the loss function. Finally, the network is trained and tested on the Cornell and Jacquard grasping datasets. Experimental results show that the new loss design improves the pose-estimation accuracy of the grasp detection model.

(3) A self-contained robotic grasping system based on the "Eye-to-Hand" configuration was built from hardware including a UR5 robotic arm and a Kinect V2 depth camera, and object detection and grasping experiments were conducted in a real environment to verify the proposed detection method. Zhang's calibration method was used to calibrate the hand-eye system and establish the transformations between the coordinate systems of the visual pipeline. A set of common objects was selected to simulate real-life scenarios in physical grasping experiments, which demonstrated excellent detection performance and good grasping ability, confirming the effectiveness of the proposed grasp detection method for practical tasks.
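The Gaussian-Wasserstein loss idea in (2) can be sketched as follows: each oriented grasp box (x, y, w, h, θ) is modeled as a 2-D Gaussian (mean at the box centre, covariance encoding size and rotation), and the squared 2-Wasserstein distance between two such Gaussians serves as the distance term of the loss. This is a minimal illustrative sketch, not the thesis's exact formulation; the function names and the covariance scaling (half-extents used as standard deviations) are assumptions.

```python
import numpy as np

def box_to_gaussian(x, y, w, h, theta):
    """Map an oriented box (centre, size, angle) to a 2-D Gaussian N(mu, Sigma).
    Half-extents are used as standard deviations (an illustrative convention)."""
    mu = np.array([x, y], dtype=float)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([w * w / 4.0, h * h / 4.0])
    return mu, R @ S @ R.T

def sqrtm_psd(A):
    """Matrix square root of a symmetric positive semi-definite matrix
    via eigendecomposition."""
    vals, vecs = np.linalg.eigh(A)
    return vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def gw_distance2(box1, box2):
    """Squared 2-Wasserstein distance between the Gaussians of two boxes:
    ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1^{1/2} S2 S1^{1/2})^{1/2})."""
    mu1, S1 = box_to_gaussian(*box1)
    mu2, S2 = box_to_gaussian(*box2)
    r1 = sqrtm_psd(S1)
    cross = sqrtm_psd(r1 @ S2 @ r1)
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(S1 + S2 - 2.0 * cross))
```

Because a box with angle θ and the same box with angle θ + π describe the same physical rectangle and map to the same Gaussian, their distance is zero, which is precisely how this formulation removes the angular boundary discontinuity of direct angle regression.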
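The coordinate-system chain established by the hand-eye calibration in (3) can be sketched as a pinhole back-projection followed by one rigid transform: a detected pixel plus its depth reading is lifted into the camera frame using the intrinsic matrix, then mapped into the robot-base frame using the Eye-to-Hand extrinsic obtained from calibration. The names `K` and `T_base_cam` and the specific matrix values below are illustrative placeholders, not calibration results from the thesis.

```python
import numpy as np

def pixel_to_base(u, v, depth, K, T_base_cam):
    """Back-project pixel (u, v) with metric depth into the camera frame
    using intrinsics K, then transform the point into the robot-base frame
    with the 4x4 Eye-to-Hand calibration matrix T_base_cam."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy
    p_cam = np.array([(u - cx) * depth / fx,
                      (v - cy) * depth / fy,
                      depth,
                      1.0])
    return (T_base_cam @ p_cam)[:3]

# Example with placeholder intrinsics and a pure-translation extrinsic.
K = np.array([[1000.0,    0.0, 320.0],
              [   0.0, 1000.0, 240.0],
              [   0.0,    0.0,   1.0]])
T_base_cam = np.eye(4)
T_base_cam[:3, 3] = [0.5, 0.0, 0.0]  # camera 0.5 m from base along x
```

A grasp point detected at the image centre with 1 m depth would then land at (0.5, 0, 1) in the base frame, which is the pose actually sent to the UR5 motion planner in such a pipeline.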