Object detection of fragrant pears based on an improved YOLOv8n

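    • Abstract: To address the low recognition accuracy and slow detection speed for fragrant pears in unstructured environments, this study proposes a fragrant pear object detection method based on an improved YOLOv8n. Min-Max normalization was used to evaluate YOLOv3-tiny, YOLOv5n, YOLOv6n, YOLOv7-tiny, and YOLOv8n and to select the best of them. With YOLOv8n as the baseline, the following improvements were made: 1) simplified residual and convolution modules replace some of the C2f (CSP bottleneck with 2 convolutions) modules for feature fusion; 2) simSPPF (simple spatial pyramid pooling fast) is used to optimize SPPF (spatial pyramid pooling fast); 3) PConv (partial convolution) is introduced, and weight-parameter sharing is proposed to make the detection head lightweight; 4) Inner-CIoU (inner complete intersection over union) is used to optimize the loss calculation for predicted bounding boxes. On a self-built fragrant pear dataset, the F0.5 score (F0.5-score) and the mean average precision (mAP) improved by 0.4 and 0.5 percentage points over the original model, reaching 94.7% and 88.3%, respectively. On GPU and CPU devices, detection speed increased by 34.0% and 24.4%, reaching 99.4 and 15.3 frames/s, respectively. The model offers high recognition accuracy and detection speed, and provides an accurate real-time detection method for the automated picking of fragrant pears.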
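The F0.5 score reported above is the F-beta measure with beta = 0.5, which weights precision more heavily than recall. The following is a minimal sketch of the standard formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The function name `f_beta` and the precision and recall values in the example are hypothetical illustrations and are not taken from the study.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Return the F-beta score; beta < 1 favors precision over recall."""
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0 and beta must be > 0")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; define the score as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical values, chosen only to show how beta shifts the score.
example_precision = 0.9
example_recall = 0.6
f_half = f_beta(example_precision, example_recall, beta=0.5)
f_one = f_beta(example_precision, example_recall, beta=1.0)
print(f"F0.5 = {f_half:.4f}, F1 = {f_one:.4f}")
```

With precision higher than recall, as in this example, F0.5 comes out higher than F1, which is why the measure suits applications where false detections are costlier than missed ones.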

       

      Abstract: Fragrant pears are among the most popular pear varieties. Manual picking is labor-intensive and cannot fully meet the demands of large-scale production, whereas mechanical picking has the potential to reduce labor requirements significantly and raise productivity. Visual recognition is one of the key components of a picking robot. However, the complex orchard environment, with variable lighting, changing weather conditions, foliage occlusion, and overlapping fruit, poses significant challenges to both conventional image processing and machine learning-based object detection, and impedes the accurate identification of fragrant pears. This study aims to improve detection accuracy and speed in such unstructured settings by introducing an optimized YOLOv8n-based object detection method for fragrant pears. A dataset of 4 500 fragrant pear images was built through image collection and data augmentation, covering a variety of orchard conditions such as different lighting and foliage occlusion. Five lightweight YOLO versions (YOLOv3-tiny, YOLOv5n, YOLOv6n, YOLOv7-tiny, and YOLOv8n) were evaluated comprehensively with Min-Max normalization, taking the requirements of fragrant pear harvesting into account: different weights were assigned to mAP, precision, recall, inference time, number of parameters, and model size to guide model selection. With YOLOv8n as the baseline, the network structure was streamlined for fragrant pear detection, and the optimization focused on four key areas. First, the network backbone was redesigned to increase inference speed; the impact of the C2f structures on speed was assessed, and certain redundant C2f components were replaced. Next, an efficient PConv module was integrated into the detection head to better recognize occluded fruit, and a weight-sharing strategy was applied to the detection head to reduce the number of parameters. In addition, simSPPF and Inner-IoU were combined to further improve inference speed and bounding box regression performance. Comparative experiments were then conducted on fragrant pear detection. YOLOv8n achieved the best overall performance, with a weighted score of 83.4% after Min-Max normalization, higher than those of YOLOv3-tiny, YOLOv5n, YOLOv6n, and YOLOv7-tiny, and was therefore selected as the baseline network for the subsequent work. Optimization experiments on the C2f structures showed that replacing the C2f modules in the 15th and 2nd layers of the network with Conv and Conv_Res modules improved computational efficiency, and this replacement was adopted as the optimization scheme for the backbone network. Ablation experiments were conducted to verify the efficacy of each improved module. The improved YOLOv8n achieved higher accuracy and detection speed on the fragrant pear dataset: compared with the original YOLOv8n, the F0.5 score and mean average precision increased by 0.4 and 0.5 percentage points, reaching 94.7% and 88.3%, respectively. Detection speed on GPU and CPU devices increased by 34.0% and 24.4%, reaching 99.4 and 15.3 frames per second, respectively. This accurate and fast detection method can provide technical support for the real-time detection of fragrant pears in natural orchard environments, and the improved model is expected to be deployable on fragrant pear picking robots.
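The weighted Min-Max selection described above can be illustrated with a short sketch. It rescales each metric to [0, 1] across the candidate models, inverts metrics where a lower value is better (such as inference time), and sums the weighted results. The abstract does not give the weights or the per-model measurements, so the candidate names, metric values, and weights below are hypothetical placeholders, and the function `weighted_min_max_scores` is an assumption for illustration rather than the authors' implementation.

```python
from typing import Dict, Set


def weighted_min_max_scores(
    table: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    lower_is_better: Set[str],
) -> Dict[str, float]:
    """Return a weighted sum of Min-Max normalized metrics for each model."""
    scores = {name: 0.0 for name in table}
    for metric, weight in weights.items():
        values = [row[metric] for row in table.values()]
        lowest, highest = min(values), max(values)
        span = highest - lowest
        for name, row in table.items():
            if span == 0:
                # All models tie on this metric, so it cannot discriminate.
                normalized = 0.0
            elif metric in lower_is_better:
                normalized = (highest - row[metric]) / span
            else:
                normalized = (row[metric] - lowest) / span
            scores[name] += weight * normalized
    return scores


# Hypothetical candidates and weights, for illustration only.
candidates = {
    "model_a": {"mAP": 0.80, "time_ms": 10.0},
    "model_b": {"mAP": 0.90, "time_ms": 20.0},
    "model_c": {"mAP": 0.85, "time_ms": 12.0},
}
metric_weights = {"mAP": 0.6, "time_ms": 0.4}

scores = weighted_min_max_scores(candidates, metric_weights, {"time_ms"})
best_model = max(scores, key=scores.get)
print(best_model, scores)
```

In this toy example the candidate that is neither the most accurate nor the fastest obtains the highest weighted score, which shows how such a scheme trades accuracy against speed instead of ranking on a single metric.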

       
