Extracting navigation line after segmenting the field scene of green Sichuan peppers

    • Abstract: Accurately segmenting green Sichuan pepper field scenes and extracting a navigation path so that agricultural machinery can travel autonomously is a key step toward mechanized, intelligent operations in green Sichuan pepper fields. In this study, field images of green Sichuan peppers at different growth stages were collected from three planting demonstration bases in Jiangjin District, Chongqing Municipality, and augmented to construct a green Sichuan pepper field-scene dataset. To handle the complexity of these scenes, a lightweight network, Mobile-Unet, was proposed for semantic segmentation of five scene classes: road, trunk, canopy, sky, and background. The network takes U-Net as the base semantic segmentation framework, replaces the U-Net feature-extraction backbone with MobileNetV2, matches an improved 8-layer, 5-stage downsampling MobileNetV2 structure to the U-Net architecture, and uses the LeakyReLU activation function to avoid neuron death during training. Based on the segmentation results, a navigation line extraction method using the dual features of roads and tree trunks was proposed. Experiments show that data augmentation and the Dice Loss function markedly improve the model's prediction accuracy. Compared with the lightweight networks Fast-Unet and BiSeNet, Mobile-Unet achieved higher segmentation accuracy on the test set, with a pixel accuracy of 91.15%, a mean class pixel accuracy of 83.34%, and a mean intersection over union of 70.51%. Compared with U-Net, recognition accuracy dropped slightly, but model complexity improved markedly: memory occupation fell by 92.17% and inference speed increased nearly 9-fold. In navigation line extraction tests on 100 test-set images, the success rate rose from 76% with road features alone to 91% with the dual road-and-trunk features; the average yaw-angle deviations of navigation lines extracted from road and trunk features were 2.6° and 6.7°, respectively, meeting the accuracy requirements of field navigation. The results provide a useful reference for research on visual navigation algorithms in green Sichuan pepper fields.
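The abstract credits data augmentation and the Dice Loss with much of the accuracy gain. The paper's implementation is not reproduced here; as an illustration only, a minimal NumPy sketch of a soft multi-class Dice loss (the function name and array layout are assumptions):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft multi-class Dice loss (illustrative sketch, not the paper's code).

    pred:   (C, H, W) array of per-class softmax probabilities
    target: (C, H, W) one-hot ground-truth masks
    Returns 1 - mean per-class Dice coefficient (0 = perfect overlap).
    """
    axes = (1, 2)  # sum over spatial dims, keep the class dim
    inter = np.sum(pred * target, axis=axes)
    union = np.sum(pred, axis=axes) + np.sum(target, axis=axes)
    dice = (2.0 * inter + eps) / (union + eps)
    return 1.0 - dice.mean()
```

Because Dice directly measures region overlap per class, it is a common choice when classes such as "trunk" occupy few pixels and a plain cross-entropy loss would be dominated by the large road/sky regions.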

       

      Abstract: Manual planting and picking can no longer meet the demands of the rapidly expanding green Sichuan pepper industry. Accurately segmenting the field scene of green Sichuan pepper and then extracting a navigation path for agricultural machinery is therefore crucial to intelligent field management. In this study, field images were collected at various growth stages from three planting demonstration bases of green Sichuan pepper in Jiangjin District, Chongqing Municipality. A total of 400 images were gathered and split into training and test sets at a 3:1 ratio. The open-source annotation tool Labelme was used to annotate the images, and a navigation dataset for the inter-row scenes of green Sichuan pepper was constructed, followed by data enhancement. Given the complexity of green Sichuan pepper field scenes, a lightweight network, Mobile-Unet, was proposed for semantic segmentation of five scene types: road, trunk, tree, sky, and background. The U-Net network was taken as the base semantic segmentation framework, with MobileNetV2 as the feature extraction network. To adapt MobileNetV2 for semantic segmentation, the last three layers of the original network were removed, and the resulting 8-layer, 5-stage downsampling structure was aligned with the U-Net architecture. Additionally, the LeakyReLU activation function was employed in the convolutional units to avoid neuron death during training. After segmentation, a navigation line extraction method incorporating the dual characteristics of roads and tree trunks was introduced. Experimental results demonstrate that data enhancement and Dice Loss as the loss function effectively improved the prediction accuracy of the model.
Compared with the two lightweight networks Fast-Unet and BiSeNet, Mobile-Unet achieved higher segmentation accuracy on the test set, with a pixel accuracy of 91.15%, a mean pixel accuracy of 83.34%, and a mean intersection over union of 70.51%. Compared with U-Net, recognition accuracy was slightly reduced, but model complexity was significantly lower, with a 92.17% decrease in memory occupation and an inference speed nearly 10 times faster. Navigation line extraction tests were then conducted on 100 test-set images: the overall success rate rose to 91% with the dual road-and-trunk features, compared with 76% using road features alone. The average yaw-angle deviations of navigation lines extracted from road contour and tree trunk features were 2.6° and 6.7°, respectively, fully meeting the accuracy requirements of field navigation. These findings offer a valuable reference for exploring visual navigation in green Sichuan pepper fields.
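The yaw-angle deviations above are measured against navigation lines fitted to the segmented road and trunk regions. The paper's exact extraction procedure is not given in the abstract; purely as a hedged sketch of one common approach, the following fits a least-squares line through the per-row centroids of a binary road mask and reads off the yaw angle relative to the image vertical (function name and conventions are assumptions):

```python
import numpy as np

def road_navigation_line(mask):
    """Fit a navigation line to a binary road mask of shape (H, W).

    Illustrative sketch only: for each image row containing road pixels,
    take the centroid column, then least-squares fit col = a*row + b.
    Returns (a, b) and the yaw angle in degrees between the fitted line
    and the image vertical (0 deg = road running straight ahead).
    """
    rows, cols = [], []
    for r in range(mask.shape[0]):
        c = np.flatnonzero(mask[r])
        if c.size:
            rows.append(r)
            cols.append(c.mean())
    a, b = np.polyfit(rows, cols, 1)   # col = a*row + b
    yaw = np.degrees(np.arctan(a))     # deviation from the vertical axis
    return a, b, yaw
```

A trunk-based line could be fitted the same way by replacing the per-row road centroids with the base points of detected trunk regions on each side of the row; comparing the two fitted lines is one way to exploit the dual road-and-trunk features the study describes.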

       
