Method for daylily detection and segmentation based on improved YOLOv7-seg

姚涛; 谈志鹏; 程娥; 吴利刚

doi:10.11975/j.issn.1002-6819.202312097

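Summary: Object detection and segmentation are key technologies for the intelligent harvesting of daylily, but the original object detection algorithm is prone to missed and false detections and cannot meet the requirements of harvesting daylily grown in natural environments. This study proposes YOLO-Daylily, a daylily object detection and instance segmentation model based on an improved YOLOv7-seg. A CBAM (convolutional block attention module) attention module is introduced into the YOLOv7-seg backbone network to reduce the influence of background and other interference factors. In the ELAN (efficient layer aggregation networks) module, PConv (partial convolution) replaces the original 3×3 convolutional layers, reducing redundant computation and memory access and improving feature extraction for the target daylily. In the neck network, coordinate convolution (CoordConv) replaces the 1×1 convolutional layers in the PA-FPN (path aggregation-feature pyramid networks), strengthening the model's perception of position and improving mask robustness. In the improved PA-FPN structure, residual connections combine the geometric information of shallow feature maps with the semantic information of deep feature maps, improving detection and segmentation performance on the target daylily. Ablation tests show that the improved model reaches detection accuracy, recall, and average precision of 92%, 86.5%, and 93%, which are 2.5, 2.3, and 2.7 percentage points higher than the YOLOv7-seg baseline; segmentation accuracy, recall, and average precision reach 92%, 86.7%, and 93.5%, which are 0.2, 3.5, and 3 percentage points higher than the baseline. Compared with Mask R-CNN, SOLOv2, YOLOv5l-seg, and YOLOv5x-seg, average precision improves by 8.4, 12.7, 4.8, and 5.4 percentage points, respectively. The improved model reduces missed and false detections and localizes targets more precisely, providing theoretical support for the practical application of intelligent daylily harvesting.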
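The computational saving attributed to PConv can be illustrated with a rough operation count. The sketch below assumes the common estimate h·w·k²·c² for a regular convolution with equal input and output channels, and applies the convolution to only a fraction of the channels while the rest pass through untouched. The 1/4 channel ratio, the 40×40×256 feature-map size, and the function names are illustrative assumptions, not values reported in this paper.

```python
def conv_flops(height, width, kernel, channels):
    """Approximate multiply-accumulate count of a regular k x k convolution
    with the same number of input and output channels."""
    return height * width * kernel * kernel * channels * channels


def pconv_flops(height, width, kernel, channels, partial_ratio=0.25):
    """Approximate cost of a partial convolution: only a fraction of the
    channels is convolved, the remaining channels are left unchanged.
    partial_ratio=0.25 is an assumed setting, not taken from the paper."""
    conv_channels = int(channels * partial_ratio)
    return conv_flops(height, width, kernel, conv_channels)


# Hypothetical feature map: 40 x 40 spatial size, 256 channels, 3 x 3 kernel.
full = conv_flops(40, 40, 3, 256)
partial = pconv_flops(40, 40, 3, 256)
ratio = partial / full
print(f"PConv / regular conv cost ratio: {ratio}")
```

Under these assumptions the convolved part costs (1/4)² = 1/16 of the regular 3×3 convolution, which is the kind of reduction the replacement in the ELAN module aims for.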

Abstract: Daylily is a popular perennial herbaceous plant with high nutritional and medicinal value. With the large-scale planting of recent years, manual picking can no longer keep up because of its high labor intensity, high cost, and low efficiency. Automated picking is therefore expected to take over as agriculture is mechanized. Object detection and segmentation are the key machine-vision technologies for the intelligent harvesting of daylily. However, the original object detection algorithm is prone to missed and false detections in unstructured environments, with unstable lighting, complex and variable backgrounds, and mutual occlusion of targets, while picking places high demands on positioning accuracy. In this study, an improved model, YOLO-Daylily, was proposed for the object detection and instance segmentation of daylily on the basis of YOLOv7-seg. The CBAM (convolutional block attention module) attention module was introduced into the YOLOv7-seg backbone network to reduce the influence of background and other interference factors. In the ELAN (efficient layer aggregation networks) module, PConv (partial convolution) replaced the original 3×3 convolutional layers, reducing redundant computation and memory access. In the neck network, CoordConv replaced the 1×1 convolutional layers in the PA-FPN (path aggregation feature pyramid networks) to enhance the perception of position and the robustness of the mask. Residual connections were used to combine the geometric information of shallow feature maps with the semantic information of deep feature maps, improving the detection and segmentation performance of the model.
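The idea behind CoordConv can be sketched in a few lines: before the convolution is applied, two extra channels holding the normalized x and y position of every pixel are appended to the feature map, so that even a 1×1 convolution can see where it is. The sketch below is a minimal illustration in plain Python; the nested-list representation, the [-1, 1] scaling, and the name `add_coord_channels` are assumptions made for this example and do not reproduce the implementation used in the study.

```python
def add_coord_channels(feature_map):
    """Append an x-coordinate and a y-coordinate channel to a feature map.

    feature_map is a list of channels, each a list of H rows of W values.
    Coordinates are scaled to [-1, 1] (an assumed convention).
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    def scale(index, size):
        # Map index 0..size-1 onto [-1, 1]; a single row or column maps to 0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    x_channel = [[scale(col, width) for col in range(width)] for _ in range(height)]
    y_channel = [[scale(row, height) for _ in range(width)] for row in range(height)]
    return list(feature_map) + [x_channel, y_channel]


# Hypothetical input: one channel of size 2 x 3 with constant activations.
fmap = [[[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]]
out = add_coord_channels(fmap)
print(len(out))   # number of channels after adding coordinates
print(out[1])     # x-coordinate channel
print(out[2])     # y-coordinate channel
```

A subsequent 1×1 convolution over the enlarged channel stack can then weight the position channels alongside the original features, which is one plausible reading of how the replacement strengthens position awareness in the neck.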
The ablation test showed that the detection accuracy, recall, and average precision were 92%, 86.5%, and 93%, respectively, which were 2.5, 2.3, and 2.7 percentage points higher than those of the baseline model. The segmentation accuracy, recall, and average precision were 92%, 86.7%, and 93.5%, respectively, increases of 0.2, 3.5, and 3 percentage points. To further verify the reliability of YOLO-Daylily, its segmentation performance was compared with a traditional two-stage instance segmentation model (Mask R-CNN) and with single-stage models (SOLOv2, YOLOv5l-seg, and YOLOv5x-seg). Compared with Mask R-CNN, the average precision of segmentation rose by 8.4 percentage points, the number of floating-point operations fell by about 50%, from 258.2 to 128.3 GFLOPs, and the segmentation speed in frames per second (FPS) increased by about 1.8 times. Compared with SOLOv2, YOLOv5l-seg, and YOLOv5x-seg, the segmentation average precision (AP₅₀) rose by 12.7, 4.8, and 5.4 percentage points, and the GFLOPs decreased by 40.9%, 12.4%, and 51.4%, respectively. The size of the improved model decreased as well. The improved model reduced missed and false detections, achieved better detection, recognition, and segmentation performance, and can meet the requirements of real-time detection. The findings can provide theoretical support for the practical application of intelligent daylily harvesting.
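The reported figures mix two kinds of change: relative reductions (percent) for floating-point operations and absolute gains (percentage points) for accuracy, recall, and average precision. The short check below, using only numbers quoted in the abstract, recomputes the reduction in floating-point operations and the baseline scores implied by the stated gains. The helper names are hypothetical, and the baseline values are derived by subtraction rather than quoted from the paper.

```python
def relative_reduction(before, after):
    """Percentage decrease from `before` to `after`."""
    return (before - after) / before * 100.0


def baseline_from_gain(improved, gain_pp):
    """Baseline score implied by an improved score and a gain in percentage points."""
    return round(improved - gain_pp, 1)


# Floating-point operations quoted for the Mask R-CNN comparison.
flops_cut = relative_reduction(258.2, 128.3)

# (accuracy, recall, average precision) of the improved model and the stated gains.
detection_baseline = [
    baseline_from_gain(score, gain)
    for score, gain in zip((92.0, 86.5, 93.0), (2.5, 2.3, 2.7))
]
segmentation_baseline = [
    baseline_from_gain(score, gain)
    for score, gain in zip((92.0, 86.7, 93.5), (0.2, 3.5, 3.0))
]

print(round(flops_cut, 1))
print(detection_baseline)
print(segmentation_baseline)
```

The recomputed reduction of roughly 50.3% agrees with the "about 50%" stated for the drop from 258.2 to 128.3.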
