    Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun, Chen Mingyou. Research on vision pre-positioning for litchi picking robot under large field of view[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(23): 48-54. DOI: 10.11975/j.issn.1002-6819.2019.23.006


    Research on vision pre-positioning for litchi picking robot under large field of view

    • Abstract: When picking litchi, the robot needs the spatial positions of multiple target litchi clusters so that it can plan an optimal motion trajectory and improve efficiency. This paper studies a vision pre-positioning method for a litchi picking robot under a large field of view. First, litchi images were captured with a binocular camera; then the original YOLOv3 network was improved into the litchi cluster detection network YOLOv3-DenseNet34; a litchi cluster pairing method constrained by same-row order consistency was proposed; finally, the spatial coordinates of the litchi clusters were computed from the triangulation principle of binocular stereo vision. Experiments show that the YOLOv3-DenseNet34 network improves both the detection accuracy and the detection speed for litchi clusters, reaching a mean average precision (mAP) of 0.943 and an average detection speed of 22.11 frames/s. At a detection distance of 3 m, the pre-positioning method based on binocular stereo vision has a maximum absolute error of 36.602 mm, a mean absolute error of 23.007 mm, and a mean relative error of 0.836%, which meets the vision pre-positioning requirements of a picking robot under a large field of view and can serve as a reference for vision pre-positioning when picking other fruits and vegetables under a large field of view.
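The pairing step constrains matches by same-row order consistency: after epipolar rectification, a litchi cluster and its counterpart lie on (nearly) the same image row, and clusters within a row band keep their left-to-right order across the two views. The sketch below is a minimal reading of that idea; the row tolerance, the data layout, and the greedy matching order are assumptions, not the paper's exact formulation:

```python
def pair_clusters(left_boxes, right_boxes, row_tol=20):
    """Pair detections between rectified left/right images.

    left_boxes, right_boxes: lists of (cx, cy) detection centers in pixels.
    After epipolar rectification, matching clusters share (almost) the same
    row; within a row band, left-to-right order is preserved, so clusters
    are paired by their horizontal rank.  row_tol is an assumed tolerance.
    """
    pairs = []
    used = set()
    # Scan left detections row by row, left to right.
    for lcx, lcy in sorted(left_boxes, key=lambda b: (b[1], b[0])):
        # Candidates in the right image lying in the same row band.
        cands = [i for i, (rcx, rcy) in enumerate(right_boxes)
                 if i not in used and abs(rcy - lcy) <= row_tol]
        if not cands:
            continue  # cluster outside the common field of view
        # Order consistency: pair with the leftmost unused candidate.
        j = min(cands, key=lambda i: right_boxes[i][0])
        used.add(j)
        pairs.append(((lcx, lcy), right_boxes[j]))
    return pairs
```

Clusters that find no candidate in their row band are simply skipped, which mirrors the situation where a cluster appears in only one camera's view.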

       

      Abstract: The litchi picking robot is an important tool for automating litchi harvesting. To pick litchi, the robot must acquire the spatial positions of the litchi clusters. To guide the robot to the picking position and improve picking efficiency, this paper studies a vision pre-positioning method for a litchi picking robot under a large field of view. First, using a binocular stereo vision system composed of two calibrated industrial cameras, 250 pairs of litchi cluster images under a large field of view were taken in a litchi orchard in Guangzhou; the spatial positions of key litchi clusters were recorded with a laser range finder and later compared with the results of the proposed method. To enlarge the sample size, the original images and the epipolar-rectified images were randomly cropped and scaled within a small range, giving a final data set of 4 000 images, which were then labeled to create the data set for the target detection network. Second, starting from the YOLOv3 network and the DenseNet classification network, and exploiting the single-target, single-scene character of the litchi cluster detection task (orchard environments only), the network structure was optimized: a Dense Module with a depth of 34 layers was designed, and on it the litchi cluster detection network YOLOv3-DenseNet34 was built. Third, because the background is complex under a large field of view, dense stereo matching over the whole image scores low in similarity and performs poorly, and some litchi clusters do not appear in the common view of both images at the same time; therefore, a method for calculating sub-pixel parallax was designed. By solving the quadratic curve formed by parallax and similarity, the sub-pixel parallax was used to calculate the spatial positions of the litchi clusters.
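The sub-pixel parallax step fits a quadratic through similarity scores sampled at neighboring integer disparities; the vertex of that parabola gives the refined disparity, which triangulation then converts to a 3-D position. The sketch below uses the common three-point parabola form and a standard rectified-stereo pinhole model; the specific parameter values (focal length in pixels, baseline in mm, principal point) are illustrative assumptions, and the paper's exact quadratic curve may differ:

```python
def subpixel_disparity(d, s_prev, s_best, s_next):
    """Refine an integer disparity d using similarity scores at d-1, d, d+1.

    Fits a parabola through the three (disparity, similarity) samples and
    returns the disparity at its vertex, i.e. where similarity peaks.
    """
    denom = s_prev - 2.0 * s_best + s_next
    if denom == 0:
        return float(d)  # degenerate (flat) curve: keep the integer disparity
    return d + 0.5 * (s_prev - s_next) / denom


def triangulate(u, v, disparity, f, baseline, cx, cy):
    """Rectified-stereo triangulation: pixel (u, v) + disparity -> (X, Y, Z).

    f: focal length in pixels; baseline in mm; (cx, cy): principal point.
    Returns camera-frame coordinates in the same units as the baseline.
    """
    Z = f * baseline / disparity   # depth along the optical axis
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return X, Y, Z
```

With a symmetric similarity profile the refinement leaves the integer disparity unchanged; an asymmetric profile shifts it toward the side with the higher score, and even a shift of a fraction of a pixel changes the triangulated depth noticeably at a 3 m range.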
By comparison with the original YOLOv3 network, the performance of the proposed network was tested. The YOLOv3-DenseNet34 network improved both the detection accuracy and the detection speed for litchi clusters: the mAP (mean average precision) was 0.943, the average detection speed was 22.11 frames/s, and the model size was 9.3 MB, 1/26 that of the original YOLOv3 network. The detection results of the method were then compared with the readings of the laser range finder. At a detection distance of 3 m, the maximum absolute error of the pre-positioning was 36.602 mm, the mean absolute error was 23.007 mm, and the mean relative error was 0.836%. The test results show that the vision pre-positioning method studied in this paper meets the accuracy and speed requirements of vision pre-positioning under a large field of view, and can serve as a reference for other vision pre-positioning methods for fruit and vegetable picking under a large field of view.

       
