Navigation path recognition between tea ridges using improved Unet network

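    • Summary: Current methods for recognizing navigation paths between tea garden ridges suffer from low accuracy, poor real-time performance, and models that are hard to interpret. To address these problems, this study optimized the Unet model and proposed a Unet-ResNet34 model that combines the strengths of Unet and ResNet. Path midpoints were generated from the navigation path extracted by the model, and the inter-ridge navigation line was produced by fitting these midpoints with a multi-segment cubic B-spline curve. The model was trained on a data-augmented training set of tea garden inter-ridge roads, and the trained model was then used to recognize navigation paths on the validation set. Gradient-weighted class activation mapping (Grad-CAM) was used to interpret the recognition process, and the recognition results of different models were compared visually. Under different illumination and weed conditions, the Unet-ResNet34 model achieved a mean intersection over union (MIoU) of 91.89% for navigation path segmentation, enabling pixel-level segmentation of the inter-ridge roads. The inference speed on RGB images was 36.8 frames/s, which meets the real-time requirement of navigation path segmentation. The navigation line deviation test gave a mean pixel deviation of 8.2 pixels and a mean distance deviation of 0.022 m. Given an average inter-ridge road width of 1 m, the mean distance deviation is 2.2% of the road width. The tracked vehicle travels at 0 to 1 m/s in the tea garden, and the average processing time of a single tea ridge image was 0.179 s. The results can provide a technical and theoretical basis for visual navigation equipment in tea gardens.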
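The segmentation accuracy above is reported as mean intersection over union. As a rough illustration of how such a metric is commonly computed for a two-class (background and path) mask, a minimal sketch follows. The function name `mean_iou`, the class encoding (0 for background, 1 for path), and the toy masks are assumptions for illustration and are not taken from the study.

```python
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Average the per-class IoU over classes that occur in either mask.

    Assumed encoding (not from the study): 0 = background, 1 = path.
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both masks, so it is skipped
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious)) if ious else float("nan")

# Toy 2x4 masks: one background pixel is wrongly predicted as path.
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
pred = np.array([[0, 1, 1, 1],
                 [0, 0, 1, 1]])
score = mean_iou(pred, target)  # background IoU 3/4, path IoU 4/5
```

On a real validation set the per-image scores (or accumulated per-class counts) would be aggregated over all images; which of the two the study used is not stated in the abstract.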

       

      Abstract: Navigation path recognition is widely regarded as one of the most important sub-tasks of intelligent agricultural equipment. An intelligent tracked vehicle is expected to navigate automatically along the road between tea garden ridges. However, recognizing the navigation path between tea ridges with deep learning models still faces challenges, including low accuracy, poor real-time performance, and limited model interpretability. In this research, a new Unet-ResNet34 model was proposed to recognize the navigation path between tea ridges accurately and rapidly using semantic segmentation. The midpoints of the navigation path were then generated from the path extracted by the model. Finally, a multi-segment cubic B-spline curve was fitted to the midpoints, which served as control points, to generate the navigation line between the tea garden ridges. The Image Labeler toolbox in the Matlab 2019 platform was used to label the navigation path in the collected images, giving a navigation path dataset of 1 824 images. Of these, 1 568 and 256 images were randomly selected for the training set and the validation set, respectively. The Mean Intersection over Union (MIoU) was used as the accuracy indicator. Under different illumination and weed conditions, the MIoU of the Unet-ResNet34 model was 91.89% for tea road segmentation. The navigation path segmentation mask was also used to generate the navigation information and the keypoints for path fitting. The fitted navigation line was then used to calculate the pixel error and the distance error. The mean pixel error and the mean distance error of the navigation line were 8.2 pixels and 0.022 m, respectively. Given that the tea navigation path is about 1 m wide, the mean distance error is 2.2% of the path width. In terms of real-time performance and model size, the Unet-ResNet34 model had 26.72 M parameters and an inference speed of 36.8 frames per second on RGB images with a size of 960×544. Gradient-weighted class activation mapping (Grad-CAM) was used to visualize the final features extracted by the improved models. More importantly, the optimized Unet-ResNet34 structure highlighted the features of the navigation path between the tea ridges while retaining only the most crucial feature extractors. The tracked vehicle travels at 0-1 m/s in the tea garden, and the average processing time of a single inter-ridge image was 0.179 s. In summary, the improved model can realize real-time and accurate navigation path recognition between tea ridges. The findings can also provide technical and theoretical support for intelligent agricultural equipment in the tea garden environment.
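The path-fitting step described above (midpoints taken from the segmentation mask, then a cubic B-spline with the midpoints as control points) can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the helper names `row_midpoints` and `cubic_bspline` are hypothetical, the midpoint of a row is assumed to be the mean of its leftmost and rightmost path pixels, and the curve is a uniform cubic B-spline evaluated segment by segment.

```python
import numpy as np

def row_midpoints(mask, step=1):
    """Return (x, y) midpoints of the path for every `step`-th mask row.

    Assumption: the midpoint of a row is halfway between its leftmost
    and rightmost path pixels; rows without path pixels are skipped.
    """
    points = []
    for y in range(0, mask.shape[0], step):
        xs = np.flatnonzero(mask[y])
        if xs.size:
            points.append(((xs[0] + xs[-1]) / 2.0, float(y)))
    return np.array(points)

def cubic_bspline(ctrl, samples_per_segment=10):
    """Evaluate a uniform cubic B-spline, one segment per 4 control points."""
    ctrl = np.asarray(ctrl, dtype=float)
    if len(ctrl) < 4:
        raise ValueError("a cubic B-spline needs at least 4 control points")
    # Standard uniform cubic B-spline basis matrix.
    basis = np.array([[-1.0, 3.0, -3.0, 1.0],
                      [3.0, -6.0, 3.0, 0.0],
                      [-3.0, 0.0, 3.0, 0.0],
                      [1.0, 4.0, 1.0, 0.0]]) / 6.0
    t = np.linspace(0.0, 1.0, samples_per_segment, endpoint=False)
    powers = np.stack([t**3, t**2, t, np.ones_like(t)], axis=1)
    segments = [powers @ basis @ ctrl[i:i + 4] for i in range(len(ctrl) - 3)]
    return np.vstack(segments)

# Toy mask (hypothetical): a straight 4-pixel-wide path in an 8x10 image.
mask = np.zeros((8, 10), dtype=np.uint8)
mask[:, 3:7] = 1
mids = row_midpoints(mask)   # one midpoint per row
curve = cubic_bspline(mids)  # smooth navigation line through the ridge gap
```

A B-spline of this kind approximates its control points rather than passing through each of them, which smooths out jitter in the row-wise midpoints. Whether the study sampled every row or only selected rows is not stated in the abstract, so the `step` parameter is left adjustable.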

       
