Detecting Xinmei fruit in complex environments using improved YOLOv5s

    • Abstract: To address the difficulty of accurately detecting Xinmei fruit under trunk and leaf occlusion and fruit overlap, this study developed a Xinmei detection model, SFF-YOLOv5s. A Xinmei dataset was constructed in a real orchard environment, and YOLOv5s was adopted as the base network. First, a coordinate attention (CA) mechanism was introduced into the C3 module of the Backbone network to strengthen the extraction of key Xinmei feature information and to reduce the number of model parameters. Second, a weighted bidirectional feature pyramid network was introduced into the Neck layer to enhance fusion across the model's feature layers and thereby raise the mean average precision. Finally, the SIoU loss function replaced the original CIoU loss function to improve detection accuracy. Experimental results show that the SFF-YOLOv5s model achieved a precision of 93.4%, a recall of 92.9%, and a mean average precision of 97.7% on Xinmei detection, with a model weight of only 13.6 MB and an average detection time of 12.1 ms per image. Compared with the Faster R-CNN, YOLOv3, YOLOv4, YOLOv5s, YOLOv7, and YOLOv8s detection models, its mean average precision was higher by 3.6, 6.8, 13.1, 0.6, 0.4, and 0.5 percentage points, respectively. The model meets the requirements for real-time Xinmei detection in complex orchard environments and provides technical support for the visual perception stage of future Xinmei picking robots.

       

      Abstract: Xinmei is a European plum variety known as the "paradise medicine fruit" for its rich medicinal, health-care, and cosmetic value, which also makes it an expensive crop of high economic importance. Today, Xinmei is widely planted in Kashgar and Yili, Xinjiang Uygur Autonomous Region, China, and picking robots have been used to harvest it in recent years. However, rapid and accurate visual detection remains challenging in complex environments with overlap and occlusion. Because the fruit is a small target and the branches and leaves are dense, a large number of Xinmei fruits are blocked by leaves and trunks, making their key feature information difficult to capture. In addition, fruits frequently overlap, and the more fruits overlap, the larger the overlapping area among them becomes. In this study, a detection model, SFF-YOLOv5s, was proposed for Xinmei in complex environments. The Xinmei dataset was constructed in a real orchard environment, and the YOLOv5s model was adopted as the base network. First, a coordinate attention (CA) mechanism was introduced into the C3 module of the Backbone network to extract key feature information when Xinmei fruit is occluded by leaves; it also reduced the number of parameters, facilitating deployment on mobile devices. Second, a weighted bidirectional feature pyramid network was introduced into the Neck layer to enhance fusion among the model's feature layers and improve recognition of mutually occluding fruits. The SIoU loss function was also used to replace the CIoU loss function of the original model, accelerating convergence while maintaining high accuracy. The test results showed better overall performance: the precision of the SFF-YOLOv5s model was 93.4%, the recall was 92.9%, the mean average precision (mAP) was 97.7%, the model weight was only 13.6 MB, and the average detection time per image was 12.1 ms.
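As a rough illustration of the coordinate attention idea described above (not the authors' exact implementation), CA pools the feature map separately along the height and width directions and uses the two pooled signals to reweight every spatial position. The NumPy sketch below is deliberately simplified: it stands in for the 1×1 convolutions with plain weight matrices and omits the shared transform with channel reduction used in the real CA module; all names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x, w_h, w_w):
    """Simplified coordinate attention on a (C, H, W) feature map.

    x   : feature map, shape (C, H, W)
    w_h : (C, C) matrix standing in for the 1x1 conv on the height branch
    w_w : (C, C) matrix standing in for the 1x1 conv on the width branch
    """
    pooled_h = x.mean(axis=2)         # pool along width  -> (C, H)
    pooled_w = x.mean(axis=1)         # pool along height -> (C, W)
    attn_h = sigmoid(w_h @ pooled_h)  # direction-aware attention, (C, H)
    attn_w = sigmoid(w_w @ pooled_w)  # direction-aware attention, (C, W)
    # Reweight each position by its row attention and column attention
    return x * attn_h[:, :, None] * attn_w[:, None, :]
```

In the full CA module the two pooled tensors are concatenated, passed through a shared 1×1 convolution with channel reduction, and split back before the sigmoids; the sketch keeps only the directional-pooling-and-reweighting core that lets the attention encode position along each axis.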
After the CA attention mechanism was added to the C3 module, the mAP of the improved model increased by 0.2 percentage points, while the number of parameters was reduced from 7.02M to 6.41M. With the weighted bidirectional feature pyramid network, the mAP of the model reached 97.6%, which was 0.5 percentage points higher than that of the original YOLOv5s. When SIoU was used as the loss function, the precision of the model improved by 2 percentage points over the original. Compared with the Faster R-CNN, YOLOv3, YOLOv4, YOLOv5s, YOLOv7, and YOLOv8s models, the mAP was higher by 3.6, 6.8, 13.1, 0.6, 0.4, and 0.5 percentage points, respectively, while the lowest computation cost and model weight were achieved together with a high detection speed. Therefore, the SFF-YOLOv5s model can fully meet the requirements of real-time Xinmei detection in complex orchard environments. The findings can provide technical support for the visual perception module of Xinmei picking robots.
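The weighted bidirectional feature pyramid credited above with the 0.5-point mAP gain fuses feature maps using learnable, non-negative, normalized weights rather than simple addition. A minimal sketch of that fast normalized fusion rule follows (names are illustrative; the actual network also adds top-down and bottom-up pathways and convolutions after each fusion node):

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with non-negative normalized weights.

    features : list of arrays with identical shape
    weights  : one learnable scalar per input feature
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU keeps weights >= 0
    w = w / (w.sum() + eps)                                # normalize to roughly sum to 1
    fused = np.zeros_like(features[0], dtype=float)
    for wi, f in zip(w, features):
        fused += wi * f                                    # weighted sum of inputs
    return fused
```

Because the weights are clipped to be non-negative and normalized by their sum plus a small epsilon, the fusion stays stable during training while still letting the network learn how much each feature level should contribute.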

       
