| Image-based wheat spike identification and counting are essential for crop management, yield assessment, and phenotypic analysis. In recent years, wheat head detection has been widely studied, but these studies are often limited to wheat from a single growing environment. Aggregating wheat varieties from all over the world yields a complex and comprehensive wheat dataset. The background and appearance (size, color, awn, etc.) of wheat heads vary significantly with variety, growing environment, growth stage, and shooting method. Detecting wheat heads in such a complex dataset is considerably difficult. On the one hand, the background of wheat heads is highly complicated, and it is challenging to make use of the relevant background information during detection. On the other hand, the size of wheat ears varies greatly, from a few pixels to several hundred pixels. In this paper, we propose a weighted coordinate attention mechanism and a spatial pyramid attention mechanism for wheat ear detection against complex environmental backgrounds. Mobile phone software for wheat ear detection is also designed and developed. The main research contents are as follows. (1) A weighted coordinate attention mechanism is proposed to address the complex background context of wheat spikes in images. The attention mechanism captures long-range dependencies in image space during feature extraction, so that wheat spike features can incorporate contextual information effectively, which improves contextual feature extraction and enhances the detection performance of the model. We design a series of comparison and generalization experiments on publicly available wheat datasets to verify the usefulness of this attention mechanism. The experimental results show that inserting the proposed attention mechanism into the backbone network improves both the accuracy and the generalization of the model. (2) A spatial
pyramid attention mechanism is proposed to address the varying sizes of wheat spikes in images. The attention mechanism generates additional weight maps, which effectively augment the shallow features in the feature pyramid network, strengthen the weights in object regions, filter out irrelevant background information, and alleviate the feature redundancy in the backbone network, so that the shallow feature maps are enhanced before they are fused with the deep feature maps. The effectiveness of this attention mechanism is verified by inserting it into an object detection algorithm with a feature pyramid network. The results on the wheat dataset show that adding the spatial pyramid attention mechanism improves the model's multi-scale wheat spike detection capability. (3) Wheat head detection software is developed to address the lack of practical software. The software is built on the wheat head detection model and combines front-end and back-end technologies. The user selects an image in the user interface; the server loads the detection model to run detection on the image, uses the resulting predictions to annotate the image, and returns the annotated result to the front-end interface for display. The software has a modular architecture and, depending on the trained model it loads, can detect wheat heads as well as other targets such as vehicles and pedestrians. In summary, this paper proposes solutions to the problems of object detection in complex environments, designs the corresponding algorithms, validates them on wheat datasets, and builds practical detection software on top of them, which will support subsequent research on wheat spikes in agronomy and biology. |
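
To illustrate how a coordinate-style attention can capture long-range dependencies along each spatial axis, the following is a minimal NumPy sketch of the general idea only. It is not the weighted variant proposed in this paper, whose weighting scheme is not specified in the abstract. The function name `coordinate_attention` and the optional channel-mixing matrices `w_h` and `w_w` (stand-ins for learned 1x1 convolutions) are assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def coordinate_attention(features, w_h=None, w_w=None):
    """Simplified coordinate-style attention over a (C, H, W) feature map.

    w_h and w_w are hypothetical (C, C) channel-mixing matrices standing in
    for learned 1x1 convolutions; identity matrices are used when omitted.
    """
    c, h, w = features.shape
    if w_h is None:
        w_h = np.eye(c)
    if w_w is None:
        w_w = np.eye(c)

    # Directional pooling: each row is summarised over the full width and
    # each column over the full height, so every position receives context
    # from distant positions along both axes.
    pooled_h = features.mean(axis=2)  # shape (C, H)
    pooled_w = features.mean(axis=1)  # shape (C, W)

    # Per-axis attention weights in the open interval (0, 1).
    attn_h = sigmoid(w_h @ pooled_h)  # shape (C, H)
    attn_w = sigmoid(w_w @ pooled_w)  # shape (C, W)

    # Re-weight every position by its row weight and its column weight.
    return features * attn_h[:, :, None] * attn_w[:, None, :]
```

In this sketch, changing a pixel at the far end of a row changes the attention weight applied to every other pixel in that row, which is the long-range behaviour described in item (1). In a real detector the pooled descriptors would pass through learned layers inside the backbone rather than the fixed matrices assumed here.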