Method for the target detection of seedlings and obstacles in nurseries using improved YOLOv5s

    • Abstract: To address the low recognition accuracy of electric spray robots for seedlings, pedestrians, and cultivation pots when operating in nurseries, this study proposes a multi-target detection method based on an improved YOLOv5s (you only look once version 5 small). First, the backbone network was improved by introducing partial convolution (PConv) into the comprehensive convolution block (C3) to reduce the computational cost of the network model; a coordinate attention (CA) mechanism was added after the highest-dimensional feature of the backbone to improve location awareness; the neck structure of the network was optimized and bilinear interpolation was used for the up-sampling operations to strengthen the feature extraction capability of the model; finally, the original coupled detection head was replaced and the detection structure was optimized using an improved lightweight decoupled head (light decouple detection head, LD) to further improve detection precision. Experimental results show that the improved model achieved a mean average precision mAP0.5 of 88.2%, mAP0.5:0.95 of 54.7%, precision of 86.0%, and recall of 82.4%, improvements of 4.6, 5.9, 1.8, and 3.4 percentage points, respectively, over the original YOLOv5s. After deployment on a mobile platform, the recognition accuracy of the improved model for pedestrians and pots to be avoided and for seedlings to be sprayed increased by 7.0, 4.8, and 15.4 percentage points, respectively, over the original model. The results can provide technical support for the operation of electric spray robots in nurseries.

       

      Abstract: Nursery planting is expanding, particularly with the increasing demand for fruit and ornamental trees. Simple tools cannot fully meet the efficiency requirements of large-scale production, owing to high labor intensity and pesticide use. Spray robots can be expected to take over a large share of these tasks in modern agriculture. However, the recognition accuracy of spray robots for targets in nurseries is still insufficient. In this study, a nursery target detection method was proposed using an improved YOLOv5s. Firstly, partial convolution (PConv) was introduced into the comprehensive convolution block (C3) of the backbone network to reduce the computational complexity. A coordinate attention mechanism was added after the highest-dimensional feature to enhance location awareness. Secondly, the neck structure of the network model was optimized to enhance feature extraction, and bilinear interpolation was used for the up-sampling operations. Finally, the original coupled detection head was replaced with an improved lightweight decoupled one to further improve detection accuracy. A nursery dataset was constructed with a total of 2 000 pictures covering three representative objects: trees, pedestrians, and cultivation pots. Images were collected of pedestrians in various postures at far, medium, and near distances. The dataset contained a total of 5 769 trees, 1 688 pedestrians, and 6 178 pots. According to the operation requirements, only the trees and pots in the first row were marked during labeling. For safety, however, pedestrians must be identified whenever they appear; therefore, all pedestrians were labeled regardless of distance.
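The computational saving behind the PConv-based C3 modification can be illustrated with a minimal arithmetic sketch (not the authors' implementation): a partial convolution applies the k×k kernel to only a fraction r of the channels and passes the rest through unchanged. The ratio r = 1/4 and the feature-map size below are hypothetical values chosen for illustration.

```python
# Sketch of the FLOPs saving of partial convolution (PConv) versus a
# standard convolution. Only a fraction r of the channels are convolved;
# the remaining channels are passed through untouched.

def conv_flops(c_in, c_out, k, h, w):
    """Multiply-accumulate count of a standard k x k convolution."""
    return c_in * c_out * k * k * h * w

def pconv_flops(c, k, h, w, r=0.25):
    """PConv convolves only r*c of the c channels (hypothetical ratio r)."""
    cp = int(c * r)
    return conv_flops(cp, cp, k, h, w)

if __name__ == "__main__":
    c, k, h, w = 256, 3, 40, 40  # illustrative channel count and map size
    full = conv_flops(c, c, k, h, w)
    part = pconv_flops(c, k, h, w, r=0.25)
    print(part / full)  # with r = 1/4 the convolved FLOPs drop to 1/16
```

With r = 1/4 the convolution cost scales by r², i.e. roughly a 16-fold reduction for that layer, which is the kind of saving that motivates using PConv inside the C3 block.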
The experimental results show that the improved C3 module, the coordinate attention mechanism, the optimized neck structure and up-sampling mode, and the lightweight decoupled head each improved the detection performance to a different degree. Overall, the mAP0.5, mAP0.5:0.95, precision, and recall of the improved model reached 88.2%, 54.7%, 86.0%, and 82.4%, respectively, increases of 4.6, 5.9, 1.8, and 3.4 percentage points over the original YOLOv5s. The size of the improved network model was 14.1 MB, the average detection time for a single image was 19.5 ms, and the average frame rate was 51.3 frames per second. The improved network also achieved the highest precision, mAP0.5, and mAP0.5:0.95 among the mainstream single-stage detectors YOLOv3-tiny and YOLOv7-tiny and the more recent YOLOv8s. These results indicate that the improved model can accurately and rapidly identify pedestrians, pots, and trees in a complex environment. The findings can provide technical support for the operational activities of electric spray robots in nurseries.
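As context for the up-sampling change described above, the following is a minimal NumPy sketch of bilinear-interpolation up-sampling on a single-channel feature map. It assumes half-pixel-centre sampling with edge clamping (the `align_corners=False` convention common in deep-learning frameworks); it is an illustration, not the authors' code.

```python
import numpy as np

def bilinear_upsample(x, scale=2):
    """Bilinear up-sampling of a (H, W) feature map by an integer factor,
    using half-pixel-centre sampling with edge clamping."""
    h, w = x.shape
    out = np.empty((h * scale, w * scale), dtype=float)
    for i in range(h * scale):
        for j in range(w * scale):
            # Map the output pixel centre back to input coordinates.
            si = (i + 0.5) / scale - 0.5
            sj = (j + 0.5) / scale - 0.5
            i0, j0 = int(np.floor(si)), int(np.floor(sj))
            di, dj = si - i0, sj - j0
            # Clamp neighbour indices to the image border.
            i0c, i1c = np.clip(i0, 0, h - 1), np.clip(i0 + 1, 0, h - 1)
            j0c, j1c = np.clip(j0, 0, w - 1), np.clip(j0 + 1, 0, w - 1)
            # Weighted average of the four nearest input pixels.
            out[i, j] = ((1 - di) * (1 - dj) * x[i0c, j0c]
                         + (1 - di) * dj * x[i0c, j1c]
                         + di * (1 - dj) * x[i1c, j0c]
                         + di * dj * x[i1c, j1c])
    return out
```

Compared with nearest-neighbour up-sampling, each output pixel blends the four nearest input pixels, which produces smoother upscaled feature maps in the neck of the network.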

       
