Tea impurity detection algorithm based on improved YOLOv5

Huang Shaohua; Liang Xifeng

doi:10.11975/j.issn.1002-6819.2022.17.036

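Summary: To address the low accuracy and slow speed of existing object detection algorithms when detecting tea impurities, this study proposes a tea impurity detection algorithm based on an improved YOLOv5. The K-Means clustering algorithm is used to cluster the ground-truth boxes of the impurities and obtain anchor box sizes suited to the characteristics of tea impurities. A feed-forward convolutional attention mechanism, the Convolutional Block Attention Module (CBAM), is introduced into the backbone feature extraction network CSPDarkNet, so that the input feature map of tea impurities passes through a channel attention module and then a spatial attention module, yielding the key features along the channel and spatial dimensions of the feature map. A Spatial Pyramid Pooling (SPP) module is added to the neck network to fuse and extract key feature information from different receptive fields. Ordinary convolutions are replaced with depthwise separable convolutions, and the confidence loss weight of the small-target prediction feature map is increased, giving a lightweight improved YOLOv5 network model. Data sets of Tieguanyin tea mixed with four kinds of impurities (rice grains, melon seed shells, bamboo twigs, and tea stems) were built, and tea impurity detection experiments were carried out. The results show that, compared with the conventional YOLOv5, the improved YOLOv5 gives higher confidence scores and more accurate localization in tea impurity detection, with no missed detections. The mean Average Precision (mAP) and frames per second (FPS) of the improved YOLOv5 reach 96.05% and 62 frames/s, both better than mainstream object detection algorithms, which verifies the efficiency and robustness of the improved algorithm. The results can lay a foundation for improving the accuracy and speed of small-target impurity detection in tea processing.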
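The anchor clustering step can be illustrated with a minimal sketch. It assumes the 1 - IoU distance that is common for YOLO anchors; the abstract only states that K-Means was used, so the distance metric, the function names `iou_wh` and `kmeans_anchors`, and the sample box sizes are all assumptions made for illustration, not the authors' implementation.

```python
import random

def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors, using 1 - IoU as the distance."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        # Assign each box to the anchor with which it has the highest IoU.
        groups = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            groups[best].append(box)
        # Move each anchor to the mean width and height of its group.
        new_anchors = []
        for anchor, group in zip(anchors, groups):
            if group:
                w = sum(b[0] for b in group) / len(group)
                h = sum(b[1] for b in group) / len(group)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchor)  # keep the anchor of an empty cluster
        if new_anchors == anchors:
            break  # assignments are stable, so the anchors have converged
        anchors = new_anchors
    # Return anchors ordered from smallest to largest area.
    return sorted(anchors, key=lambda a: a[0] * a[1])

# Hypothetical box sizes in pixels (not taken from the paper's data set).
boxes = [(10, 12), (11, 13), (12, 11), (30, 60), (32, 58), (28, 62)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

With real annotations, `boxes` would hold the width and height of every labeled impurity, and `k` would match the number of anchors the detector expects (nine in the stock YOLOv5 configuration).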

Abstract: Tea sorting is one of the most important steps in tea production. In traditional processing, impurities such as branches and grains are usually removed from freshly collected tea by hand. However, manual sorting is labor-intensive and costly, and it can no longer fully meet the growing requirements for the taste and quality of finished tea products. Machine vision has therefore been gradually applied to tea impurity sorting, particularly for fully automatic sorting during tea collection. Among the available approaches, single-stage lightweight networks (represented by YOLOv5) perform well on small targets, with high detection speed and accuracy. However, the conventional YOLOv5 network struggles to extract the features of tea impurities, because the impurities are clustered in disorder, are generally small, vary widely in type, and are similar in color to the tea. In particular, overlapping small targets can produce inaccurate prediction boxes, leading to low accuracy or missed detections. The conventional YOLOv5 network therefore needs to be improved to meet the requirements of tea impurity detection. In this study, an improved YOLOv5 model was proposed to detect tea impurities with higher accuracy and speed than the conventional network. YOLOv5 was taken as the baseline network. K-Means clustering was applied to the ground-truth boxes of the impurities to obtain anchor box sizes suited to the characteristics of tea impurities. A Convolutional Block Attention Module (CBAM) was introduced into the backbone feature extraction network (CSPDarkNet), so that key features were obtained along the channel and spatial dimensions of the feature maps. A Spatial Pyramid Pooling (SPP) module was added to the neck network to fuse and extract multi-scale features from different receptive fields. Standard convolutions were replaced with depthwise separable convolutions to reduce the number of network parameters and raise the detection speed. The confidence loss weight of the small-target prediction feature map was increased to improve the detection accuracy for small targets. The data set consisted of Tieguanyin tea mixed with rice grains, melon seed shells, bamboo branches, and tea stems. The results show that the improved YOLOv5 gave higher confidence scores than the conventional one, with more accurate localization and no missed detections. The mAP and FPS of the improved YOLOv5 reached 96.05% and 62 frames/s, respectively. The improved model was more efficient and robust than mainstream object detection algorithms. The findings can provide a reference for improving the detection accuracy and speed of small-target impurity detection in tea production.
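To make the parameter saving from depthwise separable convolution concrete, the following sketch compares weight counts for a single layer. The layer size (128 input channels, 256 output channels, 3×3 kernel) is a hypothetical example rather than a layer from the paper's network, and bias terms are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def dw_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer size, chosen only for illustration.
std = conv_params(128, 256, 3)          # 294912
dws = dw_separable_params(128, 256, 3)  # 1152 + 32768 = 33920
print(std, dws, round(dws / std, 4))
```

The ratio equals 1/c_out + 1/k², so for 3×3 kernels the separable form needs roughly one ninth of the weights once the output channel count is large.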

Detecting the impurities in tea using an improved YOLOv5 model