Abstract
With the increasing demand for fruit and ornamental trees, the area devoted to nursery planting has expanded. Larger areas increase labor costs and pesticide demand, and simple tools are insufficient for efficient work. The emergence of agricultural spray robots has reduced the workload of nursery staff. To address the low recognition accuracy and target misidentification of spray robots in nurseries, this study proposes a nursery target detection method based on an improved YOLOv5s (You Only Look Once version 5 small). First, the backbone network is improved: partial convolution (PConv) is introduced into the comprehensive convolution block (C3) to reduce computational complexity, and a coordinate attention mechanism is added at the highest-dimensional feature layer to enhance location awareness. Second, the neck structure of the network is optimized to enhance the feature extraction ability of the model, and the up-sampling operation is optimized by using bilinear interpolation. Finally, the original coupled detection head is replaced with an improved lightweight decoupled detection head to further improve detection accuracy. This study uses a self-constructed nursery dataset of 2000 images containing three representative object classes: trees, pedestrians, and cultivation pots. The images were collected at varying distances (far, medium, and near), with pedestrians in various postures. The dataset contains 5769 trees, 1688 pedestrians, and 6178 pots. During labeling, only the trees and pots in the first row of the operation area are annotated, in accordance with the operation requirements. For pedestrian safety, every pedestrian in view must be detected, so all pedestrians are labeled regardless of distance.
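As an illustration of the bilinear up-sampling step mentioned above, the following is a minimal pure-Python sketch, not the implementation used in this study. The function name `bilinear_upsample`, the integer `scale` parameter, and the half-pixel coordinate mapping with edge clamping are assumptions made for the example; the actual model would rely on the up-sampling layer of its deep learning framework.

```python
def bilinear_upsample(grid, scale=2):
    """Upsample a 2D list of floats by an integer factor (hypothetical helper).

    Assumes a half-pixel coordinate mapping with clamping at the borders.
    """
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(h * scale):
        # Map the output pixel centre back into source coordinates.
        y = min(max((i + 0.5) / scale - 0.5, 0.0), h - 1)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        wy = y - y0
        row = []
        for j in range(w * scale):
            x = min(max((j + 0.5) / scale - 0.5, 0.0), w - 1)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


feature_map = [[0.0, 4.0],
               [8.0, 12.0]]
upsampled = bilinear_upsample(feature_map, scale=2)
for r in upsampled:
    print(r)
```

Unlike nearest-neighbour up-sampling, which copies each source value into a block of identical pixels, this scheme blends the four nearest source values, so the enlarged feature map changes smoothly between neighbouring cells.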
The experimental results show that each modification (improving the C3 module and adding the coordinate attention mechanism, optimizing the neck structure and up-sampling mode, and using the lightweight decoupled head) improves the detection ability of the model to a different degree. The final model reaches 88.2% mAP0.5, 54.7% mAP0.5:0.95, 86.0% accuracy, and 82.4% recall. The size of the improved network model is 14.1 MB, the average detection time for a single image is 19.5 ms, and the average frame rate is 51.3 frames per second. Compared with the original YOLOv5s model, mAP0.5, mAP0.5:0.95, accuracy, and recall increase by 4.6, 5.9, 1.8, and 3.4 percentage points, respectively. Compared with the current mainstream single-stage target detection algorithms YOLOv3-tiny and YOLOv7-tiny and with the latest YOLOv8s detection model, the improved network still achieves the highest accuracy, mAP0.5, and mAP0.5:0.95. These results show that the improved model can accurately and quickly identify pedestrians, pots, and the trees requiring work in a complex environment. After the improved model is deployed on mobile devices, the accuracy of recognizing pedestrians, obstacles such as pots, and saplings requiring work increases by 7.0, 4.8, and 15.4 percentage points, respectively, compared with the original model. The research findings can provide technical support for the operation of spraying robots in nurseries.
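The reported speed and improvement figures can be cross-checked with simple arithmetic. The sketch below derives the frame rate from the per-image detection time and the original YOLOv5s metrics implied by the stated gains. The variable names are illustrative, and the baseline values are computed from the numbers in this abstract rather than being separately reported results.

```python
# Figures stated in the abstract for the improved model (percent).
improved = {"mAP0.5": 88.2, "mAP0.5:0.95": 54.7, "accuracy": 86.0, "recall": 82.4}
# Stated gains over the original YOLOv5s (percentage points).
gains = {"mAP0.5": 4.6, "mAP0.5:0.95": 5.9, "accuracy": 1.8, "recall": 3.4}

# Average detection time per image, in milliseconds.
latency_ms = 19.5
fps = 1000.0 / latency_ms

# Baseline implied by subtracting each gain from the improved value.
implied_baseline = {k: round(improved[k] - gains[k], 1) for k in improved}

print(f"frame rate: {fps:.1f} frames per second")
for name, value in implied_baseline.items():
    print(f"implied original YOLOv5s {name}: {value}%")
```

The computed frame rate agrees with the stated 51.3 frames per second, which confirms that the frame rate and the 19.5 ms detection time describe the same measurement.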