Methods for green Sichuan pepper field scene segmentation and navigation line extraction
-
-
Abstract
The traditional green Sichuan pepper industry relies heavily on manual labor for planting, field management, and picking, yet the aging of the rural population and the steady decline of the young workforce have driven labor costs up year after year. Accurate segmentation of the green Sichuan pepper field scene and extraction of the navigation line are crucial steps toward autonomous operation of agricultural machinery, and thereby toward intelligent machinery for green Sichuan pepper fields. In this paper, field images covering various stages of green Sichuan pepper cultivation were collected at three planting demonstration bases in Jiangjin District, Chongqing Municipality. A total of 400 images were gathered and divided into training and test sets at a 3:1 ratio. The images were annotated with the open-source tool Labelme to construct an inter-row navigation dataset for green Sichuan pepper, which was then augmented. Given the complexity of green Sichuan pepper field scenes, a lightweight network, Mobile-Unet, is proposed for semantic segmentation of five scene classes: road, trunk, tree, sky, and background. The network takes U-Net as its base semantic segmentation framework and adopts MobileNetV2 as the feature extraction backbone. To adapt MobileNetV2 for semantic segmentation, the last three layers of the original network were removed, and its eight-layer structure with five downsampling stages was aligned with the U-Net architecture; in addition, the LeakyReLU activation function was used in the convolutional units to avoid neuron death during training. Based on the segmentation results and the distinctive features of the green Sichuan pepper field scene, a navigation line extraction approach is introduced that combines two cues, roads and tree trunks. Experimental results show that augmenting the dataset and adopting Dice Loss as the loss function effectively improved the model's prediction accuracy. Compared with the two lightweight networks Fast-Unet and BiSeNet, Mobile-Unet achieved higher segmentation accuracy on the test set, with a pixel accuracy (PA) of 91.15%, a mean pixel accuracy (MPA) of 83.34%, and a mean intersection over union (MIoU) of 70.51%. Compared with U-Net, its recognition accuracy is slightly lower, but the model is far less complex: the memory footprint is reduced by 92.17% and inference is nearly 9 times faster. In navigation line extraction tests on 100 test set images, the success rate of the dual-feature method combining road and trunk cues reached 91%, versus 76% for the method relying on road features alone. The average yaw angle deviations of navigation lines extracted from the road contour and from tree trunk features were 2.6° and 6.7°, respectively, which satisfies the accuracy requirements of field navigation. This work offers a useful reference for the study of visual navigation algorithms in green Sichuan pepper fields.
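To make the described architecture concrete, the following is a minimal PyTorch sketch of the Mobile-Unet idea: a truncated MobileNetV2 encoder with five downsampling stages feeding a U-Net-style decoder whose convolutional units use LeakyReLU, trained with a multi-class Dice loss. The stage boundaries, channel widths, and decoder layout follow torchvision's MobileNetV2 and are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (not the authors' code) of a Mobile-Unet: U-Net decoder on a
# truncated MobileNetV2 encoder, LeakyReLU conv units, multi-class Dice loss.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2


def conv_block(in_ch, out_ch):
    """Two 3x3 convs with BatchNorm and LeakyReLU (avoids dying neurons)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )


class MobileUnet(nn.Module):
    def __init__(self, num_classes=5):  # road, trunk, tree, sky, background
        super().__init__()
        feats = mobilenet_v2(weights=None).features
        # Drop the classifier head and final 1x1 conv; keep five downsampling
        # stages (1/2, 1/4, 1/8, 1/16, 1/32) as U-Net encoder levels.
        self.stages = nn.ModuleList([
            feats[:2],     # -> 16 ch,  1/2
            feats[2:4],    # -> 24 ch,  1/4
            feats[4:7],    # -> 32 ch,  1/8
            feats[7:14],   # -> 96 ch,  1/16
            feats[14:18],  # -> 320 ch, 1/32
        ])
        enc_ch = [16, 24, 32, 96, 320]
        dec_ch = [96, 64, 32, 16]
        self.ups, self.decs = nn.ModuleList(), nn.ModuleList()
        in_ch = enc_ch[-1]
        for skip, out in zip(reversed(enc_ch[:-1]), dec_ch):
            self.ups.append(nn.ConvTranspose2d(in_ch, out, 2, stride=2))
            self.decs.append(conv_block(out + skip, out))
            in_ch = out
        self.head = nn.Sequential(
            nn.ConvTranspose2d(dec_ch[-1], dec_ch[-1], 2, stride=2),
            nn.Conv2d(dec_ch[-1], num_classes, 1),
        )

    def forward(self, x):
        skips = []
        for stage in self.stages:
            x = stage(x)
            skips.append(x)
        x = skips.pop()  # deepest feature map (1/32 resolution)
        for up, dec in zip(self.ups, self.decs):
            x = up(x)
            x = dec(torch.cat([x, skips.pop()], dim=1))
        return self.head(x)  # class logits at input resolution


def dice_loss(logits, target, eps=1.0):
    """Soft Dice loss averaged over classes; target is an integer label map."""
    probs = logits.softmax(dim=1)
    onehot = nn.functional.one_hot(target, probs.shape[1])
    onehot = onehot.permute(0, 3, 1, 2).float()
    inter = (probs * onehot).sum(dim=(0, 2, 3))
    union = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    return 1 - ((2 * inter + eps) / (union + eps)).mean()
```

For an input whose height and width are divisible by 32 (e.g., 224x224), `MobileUnet()(x)` yields a five-channel logit map at full resolution, and `dice_loss(logits, labels)` gives the training objective described in the abstract.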
-
-
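The abstract does not spell out the geometry of the dual-feature navigation line extraction, so the sketch below shows one plausible reading under stated assumptions: fit a line through the per-row midpoints of the road class, and fall back to the midline between left and right trunk centroids when the road contour is too fragmented. The class indices, the `min_rows` threshold, and the fallback geometry are hypothetical, introduced only for illustration.

```python
# Illustrative sketch of dual-feature navigation line extraction from a
# predicted label map: road contour first, tree trunks as the fallback.
import numpy as np

ROAD, TRUNK = 0, 1  # assumed class indices in the predicted label map


def line_from_road(label_map, min_rows=50):
    """Fit x = a*y + b through the midpoint of road pixels in each row."""
    ys, mids = [], []
    for y in range(label_map.shape[0]):
        xs = np.flatnonzero(label_map[y] == ROAD)
        if xs.size:
            ys.append(y)
            mids.append(0.5 * (xs[0] + xs[-1]))
    if len(ys) < min_rows:  # road contour too fragmented to trust
        return None
    a, b = np.polyfit(ys, mids, 1)  # least-squares line through midpoints
    return a, b


def line_from_trunks(label_map):
    """Vertical midline between trunk centroids left/right of image center."""
    ys, xs = np.nonzero(label_map == TRUNK)
    if xs.size == 0:
        return None
    cx = label_map.shape[1] / 2
    left, right = xs < cx, xs >= cx
    if not left.any() or not right.any():
        return None
    return 0.0, 0.5 * (xs[left].mean() + xs[right].mean())


def navigation_line(label_map):
    """Dual-feature extraction: prefer the road line, fall back to trunks."""
    return line_from_road(label_map) or line_from_trunks(label_map)


def yaw_angle_deg(a):
    """Yaw between the fitted line x = a*y + b and the image vertical axis."""
    return float(np.degrees(np.arctan(a)))
```

Comparing `yaw_angle_deg` of the extracted line against a hand-labeled reference line is one way to obtain the average yaw angle deviations (2.6° for the road contour, 6.7° for trunk features) reported in the abstract.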