Abstract
Potato is a typical shallow-rooted crop, and its root distribution largely determines how effectively water and nutrients can be managed. Accurate segmentation of the root system is an essential prerequisite for obtaining the key parameters of root system structure. Taking potato root images as the research object, this study aims to achieve non-contact, low-cost, fast, and accurate segmentation of potato root images. A potato root image segmentation method based on an improved DeepLabv3+ semantic segmentation network was proposed to monitor the growth state of potato. The root length was then calculated from the output images. The spatial and temporal dynamic distribution characteristics of potato roots were determined in the northern foothills of Yinshan Mountain in Inner Mongolia, China. The test results show that the training time with the improved MobileNetv2 backbone was only 10.05 h, which was 2.27 and 4.1 h less than with ResNet50 and Xception, respectively. In terms of image segmentation performance, the mean intersection over union (MIoU) with MobileNetv2 reached 92.26%, which was 1.84 and 2.68 percentage points higher than with ResNet50 and Xception, respectively. The mean pixel accuracy (MPA) reached 94.15%, which was 1.78 and 2.69 percentage points higher than with ResNet50 and Xception, respectively. Relative to the standard DeepLabv3+, the MIoU and MPA increased by 1.43 and 1.62 percentage points, respectively, after the introduction of the improved MobileNetv2 backbone network. The MIoU and MPA increased by 1.61 and 1.80 percentage points, respectively, after the introduction of CARAFE upsampling, and by 2.92 and 2.68 percentage points, respectively, after the introduction of the CBAM attention mechanism. Furthermore, CARAFE upsampling and the CBAM attention mechanism were each combined with the improved MobileNetv2 backbone network to evaluate combinations of different modules.
In these two combinations, the MIoU and MPA increased by 2.30 and 2.61 percentage points with CARAFE upsampling, and by 3.02 and 2.82 percentage points with the CBAM attention mechanism. Combining CARAFE upsampling with the CBAM attention mechanism increased the MIoU and MPA by 2.98 and 2.23 percentage points, respectively. Combining all three improvement strategies performed best, increasing the MIoU and MPA by 4.18 and 4.28 percentage points, respectively. The MIoU and MPA of the improved DeepLabv3+ model were 94.05% and 95.72%, respectively. Compared with SegNet, PSPNet, U-Net, and the standard DeepLabv3+, the MIoU increased by 6.67, 4.92, 8.80, and 4.21 percentage points, respectively, and the MPA increased by 6.7, 4.86, 8.25, and 4.53 percentage points, respectively. The training time was 9.52 h, which was 6.8, 3.99, 4.56, and 3.94 h shorter than that of SegNet, PSPNet, U-Net, and the standard DeepLabv3+, respectively. The FLOPs were reduced by 45×10⁹, 34×10⁹, 29×10⁹, and 18×10⁹, respectively, compared with SegNet, PSPNet, U-Net, and the standard DeepLabv3+. The frame rate of image detection increased by 15.3, 11.7, 11.4, and 9 fps, respectively. In the regression analysis against the manually measured root length, the coefficient of determination reached 0.981. During the seedling, tuber formation, tuber bulking, and starch accumulation stages, 80% of the potato roots were distributed in the 0-20, 0-30, 0-40, and 0-30 cm soil layers, respectively. The findings can provide a theoretical basis for high-yield and high-efficiency cultivation techniques of potato in the northern foothills of Yinshan Mountain in Inner Mongolia, China.
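For readers unfamiliar with the two segmentation metrics reported above, the following minimal sketch shows how MIoU and MPA are commonly computed from a pixel-level confusion matrix. It assumes the usual definitions (per-class intersection over union and per-class pixel accuracy, each averaged over classes); the abstract does not state the exact formulas used in the study. The function names and the toy two-class masks (0 = background, 1 = root) are hypothetical and serve only as an illustration.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels: rows index the true class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t_row, p_row in zip(truth, pred):
        for t, p in zip(t_row, p_row):
            cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN).

    Assumes every class occurs in the ground truth or the prediction,
    so no denominator is zero.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        ious.append(tp / (tp + fp + fn))
    return sum(ious) / n


def mean_pixel_accuracy(cm):
    """Average over classes of TP / (number of pixels whose true class is c).

    Assumes every class occurs at least once in the ground truth.
    """
    n = len(cm)
    return sum(cm[c][c] / sum(cm[c]) for c in range(n)) / n


# Hypothetical 2x4 label masks (0 = background, 1 = root), for illustration only.
truth = [[0, 0, 1, 1],
         [0, 1, 1, 0]]
pred = [[0, 0, 1, 0],
        [0, 1, 1, 1]]

cm = confusion_matrix(truth, pred, num_classes=2)
miou = mean_iou(cm)
mpa = mean_pixel_accuracy(cm)
print(f"MIoU = {miou:.2f}, MPA = {mpa:.2f}")  # MIoU = 0.60, MPA = 0.75
```

In this toy case each class has 3 correctly labelled pixels out of 4, with one false positive and one false negative, so the per-class IoU is 3/5 and the per-class pixel accuracy is 3/4. Because MIoU also penalizes false positives, it is lower than MPA for the same prediction, which matches the pattern of the values reported above.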