Abstract:
An improved YOLOv4 model was proposed to increase the speed and accuracy of orange fruit picking. A lighter structure with a smaller weight file was also designed for easy migration to mobile terminals. The two-dimensional coordinates of the fruit centroid were obtained from the color image captured by a RealSense depth camera. After registration of the depth and color maps, the depth value of the centroid point was extracted from the corresponding depth map, in order to realize three-dimensional spatial positioning of the fruit. In the structure of the improved YOLOv4 model, MobileNet v2 was taken as the backbone network, and depthwise separable convolutions were used to replace the ordinary convolutions in the neck, in order to further reduce the model weight for higher detection speed. A comparison experiment was carried out on the detection performance of the improved YOLOv4, the original YOLOv4, YOLOv4-tiny, GhostNet-YOLOv4, and classical convolutional neural networks, such as Faster RCNN and SSD, in order to verify the effectiveness and superiority of the improved model. The results showed that the precision, recall, F1 score, and average precision of the improved model were 97.57%, 92.27%, 94.85%, and 97.24%, respectively, which were close to the high level of the original YOLOv4 model, while the average detection time and the model size were reduced by 11.39 ms and 197.5 MB, respectively. The average precision of detection was improved by 2.69 and 3.11 percentage points, compared with the Faster RCNN and SSD models, while the model size decreased by 474.5 and 44.1 MB, respectively. The recall of the proposed detection model increased by 4.22 percentage points, compared with the YOLOv4-tiny model. Additionally, the average detection time of the improved model was 7.15 ms shorter than that of GhostNet-YOLOv4.
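The lightweighting step above rests on replacing ordinary 3×3 convolutions with depthwise separable ones. As a minimal sketch (not the paper's code; layer names, activation choice, and channel sizes are illustrative assumptions), the parameter saving can be shown in PyTorch:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) 3x3
    convolution followed by a 1x1 pointwise convolution. Hyperparameters
    here are illustrative assumptions, not the paper's configuration."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU(0.1)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Parameter count versus an ordinary 3x3 convolution with the same channels
ordinary = nn.Conv2d(256, 512, kernel_size=3, padding=1, bias=False)
separable = DepthwiseSeparableConv(256, 512)
n_ord = sum(p.numel() for p in ordinary.parameters())
n_sep = sum(p.numel() for p in separable.parameters())
print(n_ord, n_sep)  # the separable version has far fewer parameters
```

For these channel sizes the ordinary convolution holds 256 × 512 × 9 weights, while the separable pair holds only 256 × 9 depthwise plus 256 × 512 pointwise weights, which is the source of the weight and speed reductions reported above.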
Besides, an investigation was made into the influence of occlusion degree on the detection accuracy of the improved model. The average precision of the improved model on the severe-occlusion test set was 95.37%, which was 3.98 percentage points lower than that on the slight-occlusion set, indicating that severe occlusion reduced the detection accuracy. Nevertheless, the improved model still retained high detection performance under the interference of occlusion. Three-dimensional spatial positioning of orange fruit was then performed to verify the proposed location algorithm, which combined the improved YOLOv4 model and the RealSense camera. The improved YOLOv4 model was applied to recognize and locate a total of 78 fruits in an actual orchard environment, in order to verify the effectiveness of the proposed location algorithm. The results showed a 98.72% success rate in two-dimensional fruit recognition. The mean absolute errors in the horizontal and vertical directions were 0.91 and 1.26 pixels, respectively, and the mean absolute percentage errors were both within 1 percentage point. Moreover, the success rate of three-dimensional fruit positioning reached 96.15%. The mean absolute error of the depth information was 3.48 cm, and the mean absolute percentage error was 2.72%. Consequently, the prediction errors in the three directions all remained within a small range, fully meeting the need for accurate positioning of picking manipulators. In conclusion, the findings can provide a target-location approach with strong robustness, excellent real-time performance, and high accuracy for picking operations in complex scenes.
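The three-dimensional positioning described above reads the aligned depth at the detected centroid and back-projects it into camera coordinates. A minimal sketch of that final step, using the standard pinhole camera model (the intrinsics below are placeholder values, not the RealSense calibration from the paper):

```python
def pixel_to_point(u, v, depth_m, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with aligned depth Z (metres) into
    camera-frame 3D coordinates via the pinhole model:
        X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# Illustrative intrinsics for a 640x480 color stream (assumed, not calibrated)
fx = fy = 615.0
cx, cy = 320.0, 240.0

# Hypothetical fruit centroid at pixel (400, 260) with 0.85 m aligned depth
point = pixel_to_point(400.0, 260.0, 0.85, fx, fy, cx, cy)
print(point)
```

In practice the aligned depth would be sampled from the registered depth map at the centroid pixel; the RealSense SDK also provides equivalent deprojection utilities using the camera's own calibrated intrinsics.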