Multi-teacher cotton field weed detection model based on knowledge distillation

朱养鑫; 郝珊珊; 郑伟健; 金诚谦; 印祥; 周鹏

doi:10.11975/j.issn.1002-6819.202411218

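Abstract: Weeds increasingly harm agricultural production, and efficient weed detection is essential to sustainable agriculture. Existing weed detection models, however, still leave room for improvement in both accuracy and efficiency. Building on YOLOv5, this study introduces knowledge distillation and integrates knowledge from multiple teacher models to improve accuracy and real-time detection of weeds in cotton fields. First, a temperature coefficient-based soft voting mechanism for logits is proposed; it dynamically adjusts the distillation loss weight of each teacher model so that the strengths of the individual teachers are fused effectively. Second, an attention-based multi-teacher feature fusion method is proposed; it dynamically weights features to highlight key ones and suppress redundant information. The results show that the proposed YOLOv5s-CWD performs well in both accuracy and real-time detection: an F1-score (F₁) of 94.5%, a mean average precision (mAP) of 96.8% on the validation set and 93.6% on the test set, a frame rate of 46.71 frames per second (FPS), a computational cost of 4.1 giga floating point operations (GFLOPs), and a model size of 2.9 MB. Compared with YOLOv5s, the validation mAP of YOLOv5s-CWD is only 0.9 percentage points lower, while FPS is about 57.22% higher, computational complexity is about 74.38% lower, and model size is about 79.86% smaller. Compared with YOLOv7, the validation mAP of YOLOv5s-CWD is 1.3 percentage points lower, FPS is about 1076.57% higher, and computational complexity and model size are greatly reduced. Compared with YOLOv10s, the validation mAP of YOLOv5s-MGD is 0.9 percentage points lower, FPS is about 143.41% higher, computational complexity is about 83.27% lower, and model size is about 82.42% smaller. In summary, YOLOv5s-CWD maintains high accuracy while markedly improving detection speed, computational efficiency, and storage footprint, which makes it suitable for real-time detection on performance-constrained devices and provides solid technical support for weed detection in cotton fields.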

Abstract: As the negative impact of weeds on agricultural production grows, efficient weed detection methods are crucial for sustainable agricultural development. However, existing weed detection models still have room for improvement in both accuracy and efficiency. To address this, this study built on YOLOv5, introduced knowledge distillation, and integrated the knowledge of multiple teacher models, aiming to improve both the accuracy and the real-time detection ability of the weed detection model. First, a temperature coefficient-based soft voting mechanism for logits (TS_Logits) was proposed; it dynamically adjusted the distillation loss weights of the teacher models so that the strengths of each teacher were combined effectively. Second, an attention-based multi-teacher feature fusion method (AT_Feature) was proposed; it dynamically weighted key features and suppressed redundant information. The open-source CottonWeedDet12 dataset was used and randomly expanded with data augmentation, yielding 9210 weed images. These images were split into training and validation sets at an 8:2 ratio, and a separate test set of 554 real weed images was added. Channel pruning was applied to YOLOv5s to obtain a student model with a size of 2.9 MB, an F1-score (F₁) of 93.5%, and a mean average precision on the validation set (mAP_val) of 95.9%. YOLOv5s, YOLOv7, and YOLOv10s were used as teacher models, and three logits-based knowledge distillation methods (KD, Luminet, and DKD) and ten feature-based knowledge distillation methods (FitNet, AT, NST, PKT, RKD, VID, SemCKD, CWD, MGD, and FGD) were applied.
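The abstract does not spell out how TS_Logits computes its teacher weights, so the following is only a minimal sketch of one plausible form of temperature-based soft voting. It assumes a single classification vector per sample, and it assumes (hypothetically) that each teacher is weighted by its temperature-softened probability on the ground-truth class; the function names `teacher_weights` and `multi_teacher_kd_loss` are illustrative and do not come from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def teacher_weights(teacher_logits, label, temperature):
    # Assumption: a teacher that puts more softened probability on the
    # true class receives a larger vote. Softmax over log-probabilities
    # normalises those probabilities so the weights sum to 1.
    scores = [math.log(softmax(t, temperature)[label] + 1e-12)
              for t in teacher_logits]
    return softmax(scores)

def multi_teacher_kd_loss(student_logits, teacher_logits, label, temperature=4.0):
    """Weighted sum of per-teacher KL distillation losses."""
    p_student = softmax(student_logits, temperature)
    weights = teacher_weights(teacher_logits, label, temperature)
    losses = [kl_divergence(softmax(t, temperature), p_student)
              for t in teacher_logits]
    # The T^2 factor is the usual rescaling used with softened targets.
    loss = (temperature ** 2) * sum(w * l for w, l in zip(weights, losses))
    return loss, weights

# Usage: two hypothetical teachers, the first more confident on class 0.
loss, weights = multi_teacher_kd_loss(
    student_logits=[1.0, 0.5, 0.2],
    teacher_logits=[[4.0, 0.5, 0.1], [1.0, 2.0, 0.3]],
    label=0,
)
print(loss, weights)
```

In this sketch the weights change per sample, which is one way to read "dynamically adjusted"; a real detector would apply the same idea to the class logits of each predicted box rather than to one vector.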
The experimental results showed that the student model performed best when YOLOv5s was the teacher, indicating that a shared model architecture facilitated knowledge transfer between teacher and student and thereby improved distillation. This study also evaluated the effect of TS_Logits and AT_Feature on student performance. TS_Logits, which combined the strengths of YOLOv5s and YOLOv7, raised the student model to an F₁ of 94.2% and a mAP_val of 96.4%. AT_Feature improved the student model by integrating features from the teacher models to generate richer feature maps. Specifically, MGD achieved an F₁ of 93.9% and a mAP of 96.4%; CWD achieved an F₁ of 94.2% and a mAP_val of 96.4%; and PKT achieved an F₁ of 94.3% and a mAP_val of 96.4%. With TS_Logits and AT_Feature combined, the proposed YOLOv5s-MGD performed well in both accuracy and real-time detection. The model achieved an F₁ of 94.5%, a mAP_val of 96.8%, a mean average precision on the test set (mAP_test) of 93.6%, a frame rate of 46.71 frames per second (FPS), a computational complexity of 4.1 giga floating point operations (GFLOPs), and a model size of 2.9 MB. Compared with YOLOv5s, YOLOv5s-MGD lost only 0.9 percentage points in mAP_val, while FPS increased by approximately 57.22%, computational complexity fell by about 74.38%, and model size fell by approximately 79.86%. Compared with YOLOv7, YOLOv5s-MGD lost 1.3 percentage points in mAP_val, but FPS increased by approximately 1076.57%, and computational complexity and model size were substantially reduced.
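The attention-based fusion in AT_Feature can be illustrated with a small sketch. The abstract gives no formula, so this assumes (hypothetically) that all teacher feature maps already share one shape, that attention is the mean squared activation across channels at each spatial position, and that teachers are combined with a per-position softmax over those energies. `spatial_energy` and `fuse_teacher_features` are illustrative names, not the paper's.

```python
import math

def spatial_energy(feature):
    """Mean squared activation across channels at each spatial position.

    `feature` is a list of channels; each channel is a list of activations
    over flattened spatial positions.
    """
    n_pos = len(feature[0])
    return [sum(ch[i] ** 2 for ch in feature) / len(feature)
            for i in range(n_pos)]

def fuse_teacher_features(features):
    """Fuse same-shaped teacher feature maps with per-position attention."""
    energies = [spatial_energy(f) for f in features]
    n_ch, n_pos = len(features[0]), len(features[0][0])
    fused = [[0.0] * n_pos for _ in range(n_ch)]
    weights = []
    for i in range(n_pos):
        # Softmax across teachers: the teacher with the strongest response
        # at this position dominates, weaker responses are suppressed.
        col = [e[i] for e in energies]
        m = max(col)
        exps = [math.exp(v - m) for v in col]
        total = sum(exps)
        w = [x / total for x in exps]
        weights.append(w)
        for c in range(n_ch):
            fused[c][i] = sum(w[t] * features[t][c][i]
                              for t in range(len(features)))
    return fused, weights

# Usage: two hypothetical teachers, 2 channels, 3 spatial positions.
teacher_a = [[1.0, 0.0, 2.0], [1.0, 0.0, 2.0]]
teacher_b = [[0.0, 3.0, 2.0], [0.0, 3.0, 2.0]]
fused, weights = fuse_teacher_features([teacher_a, teacher_b])
print(fused, weights)
```

The fused map would then serve as the target for whichever feature-based distillation loss is chosen (for example MGD, CWD, or PKT); in practice, teachers with different channel counts would first need a projection layer, which this sketch omits.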
Compared with YOLOv10s, YOLOv5s-MGD lost 0.9 percentage points in mAP_val, while FPS increased by approximately 143.41%, computational complexity fell by about 83.27%, and model size fell by approximately 82.42%. In summary, YOLOv5s-MGD maintained high accuracy while markedly improving detection speed, computational efficiency, and storage footprint, making it well suited to real-time deployment on resource-constrained devices. The model addressed the accuracy and efficiency challenges of weed detection and provided technical support for the development of modern agriculture.
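The relative changes quoted above can be sanity-checked by inverting them to recover the baseline values they imply. The sketch below does only that arithmetic; the baseline figures it produces are derived from the rounded percentages in the abstract and are not values reported by the authors.

```python
def baseline_from_increase(value, pct_increase):
    """Baseline b such that value = b * (1 + pct_increase / 100)."""
    return value / (1.0 + pct_increase / 100.0)

def baseline_from_reduction(value, pct_reduction):
    """Baseline b such that value = b * (1 - pct_reduction / 100)."""
    return value / (1.0 - pct_reduction / 100.0)

# Figures stated in the abstract for the distilled student model.
fps, gflops, size_mb = 46.71, 4.1, 2.9

# Implied (not reported) baselines, from the stated relative changes.
implied_yolov5s = {
    "fps": baseline_from_increase(fps, 57.22),
    "gflops": baseline_from_reduction(gflops, 74.38),
    "size_mb": baseline_from_reduction(size_mb, 79.86),
}
implied_yolov10s = {
    "fps": baseline_from_increase(fps, 143.41),
    "gflops": baseline_from_reduction(gflops, 83.27),
    "size_mb": baseline_from_reduction(size_mb, 82.42),
}

for name, implied in (("YOLOv5s", implied_yolov5s), ("YOLOv10s", implied_yolov10s)):
    print(name, {k: round(v, 2) for k, v in implied.items()})
```

For YOLOv5s this works out to roughly 29.7 FPS, 16.0 GFLOPs, and 14.4 MB, which shows the three percentages are mutually consistent with a single baseline model.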
