    Citation: SUN Jun, JIA Yilin, WU Zhaoqi, et al. Detecting pests in cotton fields using improved YOLOv7[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2024, 40(10): 176-184. DOI: 10.11975/j.issn.1002-6819.202401211

    Detecting pests in cotton fields using improved YOLOv7

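    • Abstract: Rapid detection and accurate identification of cotton field pests are important prerequisites for preventing pest damage and improving cotton quality. To address the high similarity among insects and the severe background interference found in real cotton field environments, this study proposes ECSF-YOLOv7, a cotton field pest detection model. First, EfficientFormerV2 is adopted as the feature extraction network to strengthen feature extraction while reducing the number of model parameters; at the same time, the convolutional block attention module (CBAM) is embedded at the backbone output to enhance feature extraction for small targets and weaken background interference. Second, GSConv is used to build a Slim-Neck neck structure that reduces the number of model parameters while maintaining recognition accuracy. Finally, Focal-EIOU (focal and efficient IOU loss) is adopted as the bounding box regression loss function to accelerate network convergence and improve detection accuracy. The results show that the improved ECSF-YOLOv7 model achieves a mean average precision (mAP) of 95.71% on the cotton field pest test set at a detection speed of 69.47 frames/s. Compared with the mainstream object detection models YOLOv7, SSD, YOLOv5l, and YOLOX-m, the mAP of ECSF-YOLOv7 is higher by 1.43, 9.08, 1.94, and 1.52 percentage points, respectively. The improved model also has fewer parameters and a faster detection speed, and can provide technical support for the rapid and accurate detection of cotton field pests.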
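
The Focal-EIOU bounding box loss named in the abstract can be sketched for a single pair of boxes as follows. This is a minimal illustration of the commonly published formulation (IoU raised to a power gamma, multiplied by the EIOU loss), not the authors' training code. The function name `focal_eiou_loss`, the corner box format `(x1, y1, x2, y2)`, and the default `gamma=0.5` are assumptions; the abstract does not state the value used.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal-EIOU loss for one pair of boxes given as (x1, y1, x2, y2).

    gamma=0.5 is an assumed default, not a value taken from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared centre distance, normalised by the enclosing box diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalised by the enclosing box sides.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal weighting: the IoU-based factor scales each box's contribution.
    return (iou ** gamma) * eiou
```

In this formulation the weight `iou ** gamma` is zero for boxes that do not overlap at all, so such pairs contribute no loss; a training implementation would normally operate on batched tensors rather than Python tuples.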

       

      Abstract: Cotton is one of the most widely produced and consumed crops in China, and accurate pest detection is an important prerequisite for improving cotton quality. In this study, an ECSF-YOLOv7 pest detection model was proposed to address the high similarity among insects and the severe background interference found in natural cotton field environments. First, EfficientFormerV2 was used as the feature extraction network to strengthen feature extraction while reducing the number of model parameters. At the same time, the convolutional block attention module (CBAM) was embedded at the backbone output to enhance feature extraction for small targets and weaken background interference. Second, GSConv was used to build a Slim-Neck structure, which reduced the number of model parameters while maintaining recognition accuracy. Finally, Focal-EIOU loss was used as the bounding box regression loss function to accelerate network convergence and improve detection accuracy. The dataset consisted of images of 17 types of insects found in cotton fields. Python scripts were used to augment the annotated images with random brightness, random flipping, mirror transformation, and Gaussian noise, which simulated a wider range of natural recognition scenes and improved the robustness of the model. The resulting cotton field insect dataset contained 6 273 images, with a sufficient number of samples and a relatively balanced class distribution. Four experiments were conducted to verify the performance of the improved model: ablation experiments, gradient-weighted class activation mapping (Grad-CAM) visualization of the attention mechanism, a comparison of loss functions, and a comparison with mainstream models. The ablation experiments showed that each improved module had a positive effect. The feature extraction behavior of the model also varied with the position at which CBAM was embedded.
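
The four augmentations listed above (random brightness, flipping, mirror transformation, Gaussian noise) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' scripts: images are represented as nested lists of 8-bit grayscale values, annotations are assumed to be YOLO-style tuples `(cls, xc, yc, w, h)` normalised to [0, 1], and all function names and parameter ranges are hypothetical. The point it illustrates is that geometric transforms must update the box annotations as well as the pixels.

```python
import random


def clip(value):
    """Round and clamp a pixel value to the 8-bit range [0, 255]."""
    return max(0, min(255, int(round(value))))


def random_brightness(img, rng, low=0.7, high=1.3):
    """Scale all pixels by one random factor (range is an assumed choice)."""
    factor = rng.uniform(low, high)
    return [[clip(p * factor) for p in row] for row in img]


def mirror(img, boxes):
    """Left-right mirror: reverse each row and reflect box x-centres."""
    flipped = [row[::-1] for row in img]
    return flipped, [(c, 1.0 - xc, yc, w, h) for c, xc, yc, w, h in boxes]


def flip_vertical(img, boxes):
    """Top-bottom flip: reverse the row order and reflect box y-centres."""
    flipped = img[::-1]
    return flipped, [(c, xc, 1.0 - yc, w, h) for c, xc, yc, w, h in boxes]


def gaussian_noise(img, rng, sigma=8.0):
    """Add zero-mean Gaussian noise to every pixel (sigma is assumed)."""
    return [[clip(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


# Usage on a tiny 2x2 image with one hypothetical annotation (class 3).
rng = random.Random(0)
image = [[0, 100], [200, 255]]
labels = [(3, 0.25, 0.5, 0.2, 0.4)]
mirrored, mirrored_labels = mirror(image, labels)
noisy = gaussian_noise(random_brightness(image, rng), rng)
```

Brightness and noise leave the boxes untouched, so only the two geometric transforms return updated labels.
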
Grad-CAM was used to generate heat maps of the detections. When CBAM was embedded at the backbone output, the region of interest in the heat map was closer to the real pest area and less affected by background interference. Five bounding box loss functions were compared: DIOU, EIOU, MPDIOU, CIOU, and Focal-EIOU. Because it incorporates the Focal loss idea of automatically adjusting the loss weights of different types of samples, the Focal-EIOU loss achieved the best overall performance and the highest detection accuracy. The results showed that the mean average precision (mAP) of the ECSF-YOLOv7 model was 95.71%, which was 1.43, 9.08, 1.94, and 1.52 percentage points higher than that of the mainstream object detection models YOLOv7, SSD, YOLOv5l, and YOLOX-m, respectively. The improved model had only 20.82 M parameters, 44.15%, 12.26%, 55.25%, and 17.9% fewer than those models, respectively. The ECSF-YOLOv7 model had an average detection speed of 69.47 frames per second, 5.26 frames per second faster than the YOLOv7 model. High detection accuracy was also obtained in cases of insect overlap, high similarity between species, small targets, and background interference. In summary, the ECSF-YOLOv7 model offers high detection accuracy, fast detection speed, and a small number of parameters, and it can provide technical support for the rapid and accurate detection of cotton field pests.
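
The baseline figures implied by the reported differences can be recovered with simple subtraction, as in the short sketch below. It uses only the numbers stated in the abstract; the variable names are illustrative, and the derived values are back-calculated rather than quoted from the paper's tables.

```python
# Reported results for ECSF-YOLOv7.
ecsf_map = 95.71   # mAP in percent
ecsf_fps = 69.47   # frames per second

# Reported mAP advantage over each baseline, in percentage points.
map_gain = {"YOLOv7": 1.43, "SSD": 9.08, "YOLOv5l": 1.94, "YOLOX-m": 1.52}

# Implied baseline mAP = ECSF-YOLOv7 mAP minus the reported gain.
baseline_map = {name: round(ecsf_map - gain, 2) for name, gain in map_gain.items()}

# Implied YOLOv7 speed = ECSF-YOLOv7 speed minus the reported 5.26 frames/s gain.
yolov7_fps = round(ecsf_fps - 5.26, 2)

for name, value in baseline_map.items():
    print(f"{name}: {value:.2f}% mAP")
print(f"YOLOv7: {yolov7_fps:.2f} frames/s")
```

This gives implied baseline mAP values of 94.28% (YOLOv7), 86.63% (SSD), 93.77% (YOLOv5l), and 94.19% (YOLOX-m), and an implied YOLOv7 speed of 64.21 frames per second.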

       
