Recognition and localization of Camellia oleifera fruit based on COF-YOLOv5s

王金鹏; 何萌; 甄乾广; 周宏平

doi:10.11975/j.issn.1002-6819.202312112

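Summary: To address the multiple varieties, heavy occlusion, and small targets that characterize Camellia oleifera fruit in natural environments, this study proposes COF-YOLOv5s (camellia oleifera fruit-you only look once), a recognition model based on YOLOv5s, for high-precision detection of Camellia oleifera fruit. YOLOv5s was improved by adding a small target detection layer, embedding Faster Block, the lightweight module from FasterNet, into the C3 module, and adding the Biformer attention mechanism. Experimental results show that the improved network achieved precision, recall, and mean average precision of 97.6%, 97.8%, and 99.1%, respectively, on the test set, which are 5.0, 7.5, and 4.4 percentage points higher than those of YOLOv5s, with an inference time of 10.3 ms. The model was deployed on a Jetson Xavier NX and combined with a ZED mini camera for recognition and localization experiments. In the indoor experiment the recall of COF-YOLOv5s was 91.7%; outdoors, the recall for green Camellia oleifera fruit was 68.8%, and the recall for small red Camellia oleifera fruit under weak light was 64.3%. The results can provide theoretical support for the intelligent and large-scale development of the Camellia oleifera industry.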

Abstract: High-precision recognition of Camellia oleifera fruits in natural environments is hindered by multiple varieties, heavy occlusion, and small target sizes. In this study, COF-YOLOv5s, a recognition model based on YOLOv5s, was proposed to detect and locate Camellia oleifera fruits accurately and rapidly. The model was improved in three respects. First, a small target detection layer was added. Second, the lightweight module Faster Block from FasterNet was embedded into the C3 module to form Faster-C3. Third, the Biformer attention mechanism was added. Experimental results show that adding only Faster-C3 to YOLOv5s increased the mAP, R, and P by 1.8, 5.5, and 1.6 percentage points, respectively, compared with the original YOLOv5s; a decrease of 0.5 percentage points and a 3.6 ms reduction in inference time were also reported, indicating that Faster-C3 balanced detection accuracy and speed. The small target detection layer significantly improved the mAP, R, and P, by 1.8, 4.2, and 3.2 percentage points, respectively, compared with the original network, at the cost of a 2.3 ms increase in inference time. When Faster-C3 was then incorporated into the network containing Biformer and the small target detection layer, both the inference time and the parameter count decreased: the Faster Block embedded in C3 mitigated the increase in parameter count and memory access caused by adding the attention mechanism and the small target detection layer. With all three improvements incorporated, the mAP, R, and P increased by 4.4, 7.5, and 5.0 percentage points, respectively, compared with the original network. The largest gain was in R, meaning that the network reduced the miss rate, while the inference time was only 1.8 ms longer than that of the original network, which indicates the effectiveness of the model. On the test set, the improved network achieved a P, R, and mAP of 97.6%, 97.8%, and 99.1%, respectively.
The inference time was 10.3 ms, and the model weight file was only 16.1 MB. Finally, the improved model was deployed on a Jetson Xavier NX and combined with a ZED mini camera to carry out recognition and localization experiments on Camellia oleifera fruits. In the indoor experiment, the recall of COF-YOLOv5s was 91.7%, which was 47.3 percentage points higher than before the improvement. In the outdoor experiments, the recall for green Camellia oleifera fruits was 68.8%, and the recall for small red Camellia oleifera fruits under weak light was 64.3%. Both the indoor and outdoor results deviated from the detection results on the test set. The main reason is that the images in the data set were captured with the camera only about 0.2-0.4 m from the target, so they are close-up shots. By contrast, during indoor and outdoor harvesting the camera was about 1.2 m from the fruit, which is equivalent to a long shot. The resulting identification errors lowered the recall in the indoor and outdoor experiments relative to the test set. These findings can provide theoretical support for upgrading agricultural equipment and for the intelligent, large-scale development of the Camellia oleifera industry.
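The percentage-point comparisons and the distance argument above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of the study: the function names are hypothetical, it assumes that every "original" comparison refers to the same YOLOv5s baseline, and it assumes a simple pinhole camera model in which the apparent size of a fruit in the image is inversely proportional to its distance from the camera.

```python
# Reported COF-YOLOv5s test-set metrics (%) and gains over YOLOv5s
# (percentage points), taken from the abstract.
improved = {"P": 97.6, "R": 97.8, "mAP": 99.1}
gain_pp = {"P": 5.0, "R": 7.5, "mAP": 4.4}


def implied_baseline(improved, gain_pp):
    """Subtract percentage-point gains to recover the implied baseline values.

    Assumption: the gains are simple differences between two percentages.
    """
    return {name: round(improved[name] - gain_pp[name], 1) for name in improved}


def apparent_size_ratio(near_m, far_m):
    """Ratio of apparent image size at near_m to that at far_m.

    Assumption: pinhole model, image size proportional to 1 / distance,
    so the ratio is far_m / near_m.
    """
    return far_m / near_m


baseline = implied_baseline(improved, gain_pp)

# Inference time implied for the baseline: 10.3 ms is 1.8 ms longer than it.
baseline_time_ms = round(10.3 - 1.8, 1)

# Indoor recall implied before the improvement (91.7% is 47.3 points higher).
indoor_recall_before = round(91.7 - 47.3, 1)

# Data-set images at 0.2-0.4 m versus harvesting at about 1.2 m.
ratio_closest = apparent_size_ratio(0.2, 1.2)
ratio_farthest = apparent_size_ratio(0.4, 1.2)

if __name__ == "__main__":
    print("implied YOLOv5s baseline (%):", baseline)
    print("implied baseline inference time (ms):", baseline_time_ms)
    print("implied indoor recall before improvement (%):", indoor_recall_before)
    print("apparent size ratio, data set vs harvesting:",
          round(ratio_farthest, 2), "to", round(ratio_closest, 2))
```

Under these assumptions, the implied YOLOv5s baseline is about 92.6% P, 90.3% R, and 94.7% mAP, and a fruit imaged at 1.2 m appears roughly three to six times smaller in each linear dimension than in the close-up data-set images, which is consistent with the lower recall reported for the harvesting experiments.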

Camellia oleifera fruit harvesting in complex environment based on COF-YOLOv5s