Abstract: Dragon fruit is one of the most popular fruits in Asia. Manual picking can no longer fully meet the requirements of large-scale production in recent years, because the task is labor-intensive. Automated picking of dragon fruit is therefore expected to greatly reduce labor intensity, and the vision system is one of the most important parts of a picking robot. Commonly used recognition methods do not account for the complex growth postures of dragon fruit, whose hard branches and varied postures make automatic picking difficult. There is thus a strong demand to distinguish dragon fruits with different postures and then guide the robotic arm to approach the fruit along an appropriate path. In this study, multi-pose detection of dragon fruit was proposed for automatic picking using an optimal YOLOv7-tiny model. A total of 1 281 images of dragon fruit were taken in the field, including 450, 535, and 296 images under strong, weak, and artificial light conditions, respectively. The dataset was then divided into 1 036 images for training, 116 for validation, and 129 for testing, with all three light levels represented in each subset. Among the factors considered, light condition had the largest influence on detection performance. A series of experiments was conducted on this dataset. Firstly, detection performance was compared across the seven models in the YOLOv7 series, and the optimal models were identified for different devices in terms of the number of model parameters and detection performance. Secondly, the YOLOv7 series models were compared with other target detection models. Finally, the YOLOv7-tiny model was deployed on a mobile device, where a depth camera was combined with a robotic arm for field picking.
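As a minimal sketch of the dataset split described above, the following Python snippet divides images grouped by light condition into train/validation/test subsets while keeping every light level in each subset. The split ratios, file names, and helper function are illustrative assumptions; the paper does not specify its exact splitting procedure, only the resulting counts (1 036/116/129).

```python
import random

def stratified_split(images_by_light, ratios=(0.81, 0.09, 0.10), seed=0):
    """Split images per light condition so each subset covers all light levels.

    images_by_light: dict mapping light condition -> list of image ids.
    ratios: approximate (train, val, test) fractions; hypothetical values.
    """
    rng = random.Random(seed)
    train, val, test = [], [], []
    for light, images in images_by_light.items():
        imgs = list(images)
        rng.shuffle(imgs)
        n_train = round(len(imgs) * ratios[0])
        n_val = round(len(imgs) * ratios[1])
        train += imgs[:n_train]
        val += imgs[n_train:n_train + n_val]
        test += imgs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Counts from the paper: 450 strong-light, 535 weak-light, 296 artificial-light.
images = {
    "strong": [f"strong_{i}.jpg" for i in range(450)],
    "weak": [f"weak_{i}.jpg" for i in range(535)],
    "artificial": [f"artificial_{i}.jpg" for i in range(296)],
}
train, val, test = stratified_split(images)
```

Stratifying by light condition rather than splitting the pooled images at random ensures that the test set probes all three illumination scenarios, which matters given that light condition was the largest influencing factor on detection performance.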
The results showed that, within the YOLOv7 series, the YOLOv7-e6e model achieved the highest precision of 85.0%, the YOLOv7x model achieved the highest recall of 85.4%, and the YOLOv7 model achieved the highest mean average precision (mAP) of 89.3%. The YOLOv7-tiny model had the fewest parameters (6 × 10⁶), the smallest weight file (12 MB), the fewest layers (255), and the shortest inference time (1.8 ms). Its fast inference speed makes it the most suitable model for mobile devices. The detection precision of YOLOv7-tiny was 83.6%, the recall was 79.9%, the mAP was 88.3%, and the classification accuracy for the multi-pose dragon fruits was 80.4%. Furthermore, the precision of YOLOv7-tiny increased by 16.8, 4.3, and 4.8 percentage points, and the mAP increased by 7.3, 21, and 3.9 percentage points, compared with YOLOv3-tiny, YOLOv4-tiny, and YOLOX-tiny, respectively. The precision of YOLOv7-tiny increased by 7.3, 4.2, 7.3, 6.5, 3.5, and 3.9 percentage points compared with YOLOv5s, YOLOXs, YOLOv4G, YOLOv4M, YOLOv5x, and YOLOXx, respectively. In addition, the mAP of YOLOv7-tiny increased by 8.2, 5.8, 4.0, and 42.4 percentage points compared with EfficientDet, SSD, Faster-RCNN, and CenterNet, respectively, indicating the high detection accuracy of the YOLOv7-tiny model. A dragon fruit picking system was then constructed and verified through picking experiments using the trained YOLOv7-tiny model. The experimental results show that the inference time of the vision system accounted for only 22.6% of the whole picking action time. The picking success rate for dragon fruits in the front view was 90%, indicating the high performance of automatic picking. These findings can also provide technical support for automatic fruit picking.
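The 22.6% figure above is the vision system's share of the total picking cycle. As an illustration of how such a share is computed, the sketch below sums per-stage durations of one picking action and divides the vision stage by the total. All stage names and durations here are hypothetical placeholders, not measurements from the paper.

```python
# Hypothetical timing breakdown of one picking cycle (milliseconds).
# Only the structure of the calculation mirrors the paper; the numbers
# below are illustrative, not reported measurements.
stage_times_ms = {
    "vision_inference": 1.8,   # per-frame inference time reported for YOLOv7-tiny
    "pose_decision": 1.0,      # hypothetical pose-classification stage
    "arm_motion": 5.0,         # hypothetical robotic-arm approach and grasp
}

total_ms = sum(stage_times_ms.values())
vision_share = stage_times_ms["vision_inference"] / total_ms
print(f"vision share of picking cycle: {vision_share:.1%}")
```

Because the vision stage is a small fraction of the cycle, overall picking throughput is dominated by the mechanical motion of the arm rather than by detection speed.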