As the world’s largest producer of tomatoes, China accounts for one-third of global tomato production. Currently, tomatoes in China are still largely cultivated by traditional methods, and picking is a time-consuming and labor-intensive process. With the rapid development of technologies such as machine vision, tomato picking machines have become an active research topic. However, the variable lighting conditions inside tomato greenhouses and the occlusion between tomatoes and leaves make it difficult for existing target recognition algorithms to identify tomato ripeness quickly and accurately. Furthermore, the small size and dense arrangement of tomatoes make them difficult to locate in space. The narrow spacing between tomato plants and the complex greenhouse environment also make autonomous movement challenging for tomato picking machines. Therefore, this paper designs a visual system that enables a tomato picking machine to identify tomato ripeness, locate the fruit in space, and move autonomously, and validates the system experimentally in a real tomato greenhouse environment. The main research contents and conclusions of this paper are as follows:

(1) Taking cherry tomatoes as the research object, a small-tomato dataset suited to ripe red fruit was created for tomato ripeness recognition. An improved YOLOv4-tiny neural network model, denoted YOLOv4-tiny-X, was proposed. A third detection head was added to the head network to improve the accuracy of small object recognition. The CBAM module was fused into the backbone to improve the recognition accuracy of occluded tomatoes. A dense connection structure was used to enhance the fusion of global feature information. The Mish activation function was used in the backbone network to preserve the accuracy of feature extraction in deep convolutions. Experimental results showed that, compared with the YOLOv3, YOLOv4, YOLOv4-tiny, YOLOv5m, and YOLOv5l models, the YOLOv4-tiny-X 
model had the highest average precision at a very high detection speed, with average precision increasing by 30.9%, 0.2%, 0.7%, 5.4%, and 4.9%, respectively. The average precision of the improved model was 97.9%, and the recognition speed was 111 frames/s. The real-time visualization results of target recognition showed that the improved neural network model was stable in complex environments.

(2) A tomato model was constructed in an indoor laboratory for tomato spatial positioning experiments. First, the intrinsic and extrinsic parameters of the Intel RealSense D435i camera were obtained by calibrating the stereo camera with a calibration board. Then, stereo rectification and RGB-D image registration were performed using the camera’s parameters. Next, the YOLOv4-tiny-X model was used to recognize the tomato’s ROI in real time and obtain the tomato’s center point. Finally, the tomato’s spatial position was determined by converting the ROI center point coordinates to the camera coordinate system. The spatial positioning experiments showed that the proposed method can obtain the 3D coordinates of all tomatoes. Comparative experiments on measurement error demonstrated that when the distance between the stereo camera and the tomato was within 0.2-0.5 m, the distance error of the tomato’s 3D positioning using the stereo camera combined with the YOLOv4-tiny-X model was less than 3 mm.

(3) Mapping and navigation experiments for the tomato harvesting vehicle were conducted in the Gazebo simulation environment. First, the tomato harvesting vehicle model and the tomato greenhouse model were constructed. Then, an extended Kalman filter fusion method was used to perceive the surrounding greenhouse environment for the harvesting vehicle’s localization and mapping. The experimental results demonstrated that the greenhouse map constructed by multi-sensor fusion was more accurate. Finally, by designating a target point on the tomato greenhouse grid map, the tomato 
harvesting vehicle could plan a reasonable path on the map using the A* algorithm and continually optimize the path during autonomous movement, thereby achieving autonomous navigation.

(4) A system verification experiment was conducted in which the proposed algorithm models were deployed on the tomato harvesting vehicle for tomato maturity recognition, positioning, high-precision mapping, and autonomous navigation in the greenhouse. The experimental results showed that, in a real tomato greenhouse, the proposed method performed similarly to how it performed in the simulation environment and accurately recognized tomato maturity. The Intel RealSense D435i infrared stereo camera can spatially locate tomato fruits in complex environments. The tomato harvesting vehicle can construct a high-precision indoor map and achieve autonomous navigation in a real tomato greenhouse.
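
As an illustration of the final positioning step in (2), the conversion from the ROI center pixel to the camera coordinate system can be sketched with the standard pinhole camera model. This is a minimal sketch under stated assumptions, not the implementation used in this work: it assumes an undistorted color image with the depth map already registered to it, and the function names, the intrinsics (fx, fy, cx, cy), the bounding box, and the depth value are all hypothetical placeholders.

```python
from typing import Tuple


def roi_center(x_min: float, y_min: float,
               x_max: float, y_max: float) -> Tuple[float, float]:
    """Return the pixel coordinates (u, v) of the center of a bounding box."""
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def deproject_pixel(u: float, v: float, depth_m: float,
                    fx: float, fy: float,
                    cx: float, cy: float) -> Tuple[float, float, float]:
    """Map a pixel and its depth to a 3D point in the camera frame.

    Pinhole model without lens distortion (assumption):
        X = (u - cx) * Z / fx
        Y = (v - cy) * Z / fy
        Z = depth
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 image; real values would come
    # from the calibration step described in (2).
    fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0

    # Hypothetical detector output: ROI corners in pixels.
    u, v = roi_center(300, 200, 380, 280)

    # Hypothetical depth at the ROI center, in meters.
    point = deproject_pixel(u, v, 0.3, fx, fy, cx, cy)
    print("ROI center (px):", (u, v))
    print("Camera-frame point (m):", point)
```

In a real pipeline the intrinsics would be taken from calibration and the depth would be read from the registered depth image at (u, v). Whether this work averages depth over a window around the center or uses a single pixel is not stated in the abstract.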