Abstract:
Navigation path recognition is widely regarded as one of the most important sub-tasks of intelligent agricultural equipment. An intelligent tracked vehicle is also expected to navigate automatically along the road between tea garden ridges. However, deep learning-based recognition of the navigation path between tea ridges still faces several challenges, such as limited accuracy, insufficient real-time performance, and poor model interpretability. In this research, a new Unet-ResNet34 model was proposed to accurately and rapidly recognize the navigation path between tea ridges using semantic segmentation. The midpoints of the navigation path were then generated from the path region extracted by the model. Finally, a multi-segment cubic B-spline curve was used to fit the midpoints, in order to generate the navigation line between the tea garden ridges. The Image Labeler toolbox in MATLAB 2019 was used to label the navigation path in the collected images, yielding a navigation path dataset of 1,824 images. Among them, 1,568 and 256 images were randomly selected as the training and validation sets, respectively. The Mean Intersection over Union (MIoU) was adopted as the accuracy indicator of the Unet-ResNet34 model, which reached 91.89% for the inter-ridge road segmentation under different illumination and weed conditions. The navigation path segmentation mask was then used to generate the navigation information and keypoints for path fitting. Furthermore, the multi-segment cubic B-spline curve was used to calculate the navigation line of the road between tea ridges, with the midpoints serving as the control points. The navigation line was further used to calculate the pixel and distance errors. The mean pixel error and the mean distance error of the predicted navigation line were 8.2 pixels and 0.022 m, respectively. Since the width of the tea navigation path was about 1 m, the average distance error corresponded to 2.2% of the path width. In terms of real-time performance and model size, the Unet-ResNet34 model contained 26.72 M parameters and achieved an inference speed of 36.8 frames per second on RGB images with a size of 960×544 pixels. Gradient-weighted class activation mapping (Grad-CAM) was used to visualize the final extracted features of the improved model. More importantly, the optimized Unet-ResNet34 structure highlighted the salient features of the navigation path between tea ridges while retaining only the most crucial feature extractors. The speed of the tracked vehicle in the tea garden was mostly 0-1 m/s, and the average processing time of a single inter-ridge image was 0.179 s, which was sufficient for real-time operation at this speed. In summary, the improved model can achieve real-time and accurate recognition of the navigation path between tea ridges. The findings can also provide technical and theoretical support for intelligent agricultural equipment in tea garden environments.
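To illustrate the path-fitting step summarized above, the sketch below shows how a navigation line could be generated from the inter-ridge midpoints with a clamped cubic B-spline that uses the midpoints as control points. This is a minimal, assumed implementation rather than the authors' code: it relies on NumPy/SciPy, and the function name, sampling density, and example midpoint coordinates are illustrative only.

```python
# Minimal sketch (assumed, not the authors' code): generating a navigation line
# from segmentation-mask midpoints with a clamped cubic B-spline, using the
# midpoints as control points as described in the abstract.
import numpy as np
from scipy.interpolate import BSpline

def navigation_line_from_midpoints(midpoints, n_samples=200):
    """midpoints: (N, 2) array of (x, y) pixel midpoints, N >= 4."""
    c = np.asarray(midpoints, dtype=float)           # control points
    n, k = len(c), 3                                  # cubic B-spline
    # Clamped knot vector: k+1 repeated knots at each end, uniform inside,
    # so the curve starts and ends at the first and last midpoints.
    t = np.concatenate((np.zeros(k),
                        np.linspace(0.0, 1.0, n - k + 1),
                        np.ones(k)))
    spline = BSpline(t, c, k)                         # vector-valued spline
    u = np.linspace(0.0, 1.0, n_samples)
    return spline(u)                                  # (n_samples, 2) navigation line

# Example: four illustrative row midpoints (pixels) -> smooth centerline
# that a path-tracking controller could follow.
mids = np.array([[480, 540], [475, 400], [470, 260], [468, 120]])
line = navigation_line_from_midpoints(mids)
```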
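Similarly, the MIoU accuracy indicator reported above can be computed from the predicted and labeled masks as in the following sketch. This is an assumed, illustrative implementation rather than the authors' evaluation code; the class count and array names are placeholders.

```python
# Minimal sketch (assumed): mean Intersection over Union (MIoU) for a
# two-class segmentation task (background vs. navigation path).
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """pred, target: integer label maps of equal shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# A reported MIoU of 91.89% corresponds to an average per-class overlap of
# 0.9189 between the predicted and labeled masks.
```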