Lightweight corn silk detection network incorporating a coordinate attention mechanism

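    • Summary: Corn silk traits are an important indicator of the growth state of corn and a key factor in ear development, and therefore in yield. To improve the accuracy and speed with which the vision system of an unmanned inspection robot detects corn silks, this study proposes YOLOX-CA, a lightweight object detection network that incorporates a coordinate attention mechanism. Coordinate attention (CA) modules are embedded in the backbone feature extraction network of YOLOX-s to strengthen the extraction of key features and raise detection accuracy. In the neck (feature-strengthening network), the ordinary convolutions in the feature pyramid structure are replaced with depthwise separable convolutions, which reduces the parameter count without loss of accuracy. In the prediction head, GIoU (generalized intersection over union) is introduced to improve the localization loss and yield more precise predictions. The network was trained and tested on a self-built corn silk dataset. The results show that YOLOX-CA reaches an average precision of 97.69% with only 8.35 M parameters. On the same test platform, its average precision is 2.21, 3.22, and 0.64 percentage points higher than that of the mainstream detectors YOLOX-s, YOLOv3, and YOLOv4, respectively, and its inference time per frame is 4 and 8 ms shorter than that of YOLOv3 and YOLOv4, respectively. The network detects corn silks well, and its lightweight structure is suitable for deployment in the vision system of unmanned inspection robots, providing a reference for monitoring the growth state of corn.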

       

      Abstract: Corn is one of the main grain crops in China, and its yield directly affects national food security and agricultural development. Corn silk is the pollination organ of the maize ear. Silk traits are an important indicator of the growth state of corn and a key factor in ear development, and therefore in yield. The combination of artificial intelligence and ecological agriculture is currently a research hotspot in smart agriculture. In corn growth management, field inspection robots can monitor crop growth to improve yield and quality and can give early warning of diseases, pests, and weeds. The main technical difficulty is detecting crop targets from a moving viewpoint. This study aims to improve the accuracy and speed of corn silk detection in the vision system of unmanned inspection robots. A lightweight object detection network, YOLOX-CA, is proposed that incorporates a coordinate attention mechanism. Three improvements to YOLOX-s were designed according to the characteristics of corn silk. First, three coordinate attention modules were embedded in the backbone feature extraction network to dynamically enhance the extraction of target features (corn silks) and reduce interference from background noise (corn leaves, nodal roots, and soil), thereby improving detection accuracy. Second, so that inference can be deployed on the vision system of an unmanned inspection robot in the corn field, the network was made lightweight: the ordinary convolutions in the feature pyramid of the feature-strengthening (neck) part of the model were replaced with depthwise separable convolutions, reducing the number of parameters while maintaining accuracy.
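The parameter saving from replacing an ordinary convolution with a depthwise separable one can be illustrated with a short calculation. The sketch below is not taken from the paper: the layer sizes (256 input channels, 256 output channels, 3 x 3 kernel) and the function names are hypothetical, and bias terms are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dw_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (bias ignored)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 256, 256, 3

standard = conv_params(c_in, c_out, k)
separable = dw_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1/c_out + 1/k**2

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.4f}")
```

For a single layer of this assumed size the separable form needs roughly one ninth of the weights. The saving reported for the whole network is much smaller because only the feature pyramid convolutions are replaced.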
Finally, the localization loss function in the prediction part of the model was improved. GIoU was introduced as the optimization objective to speed up the fitting of the positional relationship between the predicted box and the ground-truth box, and thus to improve the accuracy of the regressed bounding boxes for corn silk. An image dataset of corn silks was constructed to cover a variety of complex scenes, including no occlusion, slight occlusion, severe occlusion, fully emerged silks, and adhering silks. The annotation tool LabelImg was used to label the ground-truth boxes. The images were divided into a training set and a test set at a ratio of 9:1. All models were trained for 260 epochs, comprising 50 epochs of frozen training and 210 epochs of unfrozen training. To verify the effect of the coordinate attention module on YOLOX, the mainstream SE and CBAM attention modules were also added to YOLOX for an ablation test. The results show that YOLOX-s with an attention mechanism improved significantly on all metrics. Compared with YOLOX+CBAM, the precision of YOLOX+CA was 0.77 percentage points lower, but its recall, F1-score, and average precision (AP) were 1.53, 1.00, and 0.64 percentage points higher, respectively. Compared with YOLOX+SE, its precision was 0.04 percentage points lower, whereas its recall, F1-score, and AP were 0.74, 1.00, and 0.49 percentage points higher, respectively. These results indicate that attention mechanisms are effective for corn silk detection with YOLOX, and that YOLOX with coordinate attention performed best under comprehensive evaluation.
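The GIoU objective mentioned at the start of this passage can be sketched for a single pair of boxes. The abstract does not give the exact form of the loss, so this sketch assumes the common formulation, loss = 1 - GIoU, with axis-aligned boxes of positive area written as (x1, y1, x2, y2). The function names are illustrative only.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2).

    Assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the part of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, true_box):
    """Assumed loss form: 1 - GIoU, ranging from 0 to 2."""
    return 1.0 - giou(pred_box, true_box)


print(giou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(giou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))  # no overlap
```

Unlike plain IoU, the GIoU value still changes when the boxes do not overlap, because the enclosing-box term shrinks as the predicted box moves toward the ground-truth box. This gives the regression a useful signal even for poor initial predictions.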
The improved YOLOX-CA model was compared with current mainstream object detection networks: YOLOX-s, YOLOv3, and YOLOv4. The average precision of YOLOX-CA reached 97.69%, which is 2.21, 3.22, and 0.64 percentage points higher than that of these networks, respectively. Its parameter count was 8.35 M, which is 0.62, 53.6, and 54.65 M fewer, respectively. The inference time of YOLOX-CA was 17 ms per frame, 4 and 8 ms shorter than that of YOLOv3 and YOLOv4, respectively. In summary, YOLOX-CA achieved high detection accuracy and speed compared with the other detection models. The improved model can therefore be expected to maintain high detection accuracy in complex field environments, and its lightweight structure can be readily deployed in the vision system of an automatic inspection robot. The findings can support real-time, automatic, and effective monitoring of the growth state of corn.

       
