Instance segmentation method of seedling maize plant images based on weak supervised learning

赵亚楠; 邓寒冰; 刘婷; 赵露露; 赵凯; 杨景; 张羽丰

doi:10.11975/j.issn.1002-6819.2022.19.016

Instance segmentation method of seedling maize plant images based on weak supervised learning

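Abstract

Abstract (Chinese version): Image segmentation based on supervised deep learning usually relies on pixel-level labels to ensure training and testing accuracy, but because of the complex morphology of plants, obtaining accurate pixel-level labels significantly increases the time cost. To reduce the cost of training deep models while keeping segmentation accuracy high, this study proposes a Bounding-box Mask Deep Convolutional Neural Network (BM-DCNN), which integrates a pseudo-label generation module into a supervised deep learning model and trains the network with pseudo-labels instead of ground truth labels. The experimental results show that the mean intersection over union between the pseudo-labels and the ground truth labels was 81.83% and the mean cosine similarity was 86.14%, higher than for pseudo-labels generated by GrabCut-type methods (mean intersection over union of 40.49% and mean cosine similarity of 61.84% against the ground truth labels). For maize seedling images (top view), the time cost of three manual annotation methods was measured: 2.5 min/image for bounding-box labels, 15.8 min/image for scribble labels, and 32.4 min/image for pixel-level labels. After training on pseudo-labeled samples, for both backbone networks of BM-DCNN the instance segmentation accuracy at IoU thresholds above 0.7 (AP70) was already higher than that of the supervised model. The average accuracy of the two backbone networks of BM-DCNN was 67.57% and 75.37%, respectively, close to the supervised instance segmentation results under the same conditions (67.95% and 78.52%, respectively), and at best reached 99.44% of the supervised result. The experiments show that BM-DCNN can achieve high-accuracy instance segmentation of maize seedling plant images using low-cost weak labels, providing a low-cost solution and technical support for image-based maize emergence rate statistics and seedling-stage canopy coverage calculation.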

Abstract: Deep learning has become one of the most important technologies in agriculture in recent years. However, the quality and cost of labeling training samples for supervised deep learning have become a bottleneck restricting its development. To reduce the cost of training deep models while keeping image segmentation accuracy high, this study proposes a model named Bounding-box Mask Deep Convolutional Neural Network (BM-DCNN) for automatic training and segmentation of maize plants. First, a DJI Phantom 4 RTK drone was used to collect top-view images of maize seedlings. The flight followed an automatically planned route that covered the entire test field. Second, the open-source labeling tool Labelme was used to label the top-view images of the maize seedlings. Each original top-view image was labeled twice. Bounding boxes were used as the basic shape for the weakly supervised labels: pixels within a bounding box were marked as foreground (i.e., the possible effective pixels of a maize plant), and pixels outside the bounding boxes were marked as background. Finally, the bounding-box information was used to generate primary pseudo-labels on the images, the images were converted from the RGB color model to the HSV (Hue-Saturation-Value) color model, and a fully connected conditional random field (DenseCRF) was used to eliminate the influence of plant shadows and image noise on pseudo-label accuracy. The pseudo-labels, instead of the ground truth labels, were then used to train an optimized YOLACT model, which can be used for instance segmentation of plants at the maize seedling stage. An experiment was designed to verify and test BM-DCNN.
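
The pseudo-label step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the bounding-box prior and a per-pixel HSV test, omits the DenseCRF refinement entirely, and the function name `box_pseudo_label` and all thresholds are illustrative assumptions rather than values from the paper.

```python
import colorsys

def box_pseudo_label(image, box, hue_range=(0.17, 0.45), min_sat=0.25, min_val=0.20):
    """Binary mask: 1 for pixels inside `box` whose HSV colour looks like green tissue.

    image: list of rows, each row a list of (r, g, b) tuples with 0-255 channels.
    box:   (x_min, y_min, x_max, y_max) in inclusive pixel coordinates.
    All thresholds are assumed for illustration, not taken from the paper.
    """
    x_min, y_min, x_max, y_max = box
    mask = [[0] * len(row) for row in image]          # everything starts as background
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):             # only box pixels can be foreground
            r, g, b = image[y][x]
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            is_green = hue_range[0] <= h <= hue_range[1]
            # A low V value rejects dark shadow even when its hue is greenish.
            if is_green and s >= min_sat and v >= min_val:
                mask[y][x] = 1
    return mask

SOIL, LEAF, SHADOW = (120, 80, 40), (40, 160, 50), (20, 30, 20)
image = [
    [SOIL, SOIL, SOIL,   LEAF],   # the leaf pixel here lies outside the box
    [SOIL, LEAF, SHADOW, SOIL],
    [SOIL, LEAF, LEAF,   SOIL],
    [SOIL, SOIL, SOIL,   SOIL],
]
mask = box_pseudo_label(image, box=(1, 1, 2, 2))
for row in mask:
    print(row)
```

In this toy image the shadow pixel inside the box and the green pixel outside the box are both left as background, which is the behaviour the bounding-box prior and the HSV conversion are meant to provide.
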
Comparing the pseudo-label masks with the ground truth showed a mean intersection over union (mIoU) of 81.83% and a mean cosine similarity (mcos(α)) of 86.14%, higher than for pseudo-labels generated by GrabCut (mIoU of 40.49% and mean cosine similarity of 61.84%). For the maize seedling images (top view), the time cost of three manual annotation methods was measured: 2.5 min/image for bounding-box labels, 15.8 min/image for scribble labels, and 32.4 min/image for pixel-level labels. Considering that the ground truth labels themselves contain errors in the handling of maize plant details, pseudo-labels of this accuracy can be used to train a deep convolutional neural network. Comparing the instance segmentation accuracy of BM-DCNN with that of a fully supervised instance segmentation model showed that, when the IoU threshold was greater than 0.7 (AP70), the instance segmentation accuracy of BM-DCNN was higher than that of the supervised model. The average accuracy of the two backbone networks of BM-DCNN was 67.57% and 75.37%, respectively, close to the supervised instance segmentation results under the same conditions (67.95% and 78.52%, respectively), and at best reached 99.44% of the supervised segmentation result. Therefore, for instance segmentation of top-view images of maize seedling plants, BM-DCNN can almost match the supervised instance segmentation model under the same conditions.
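
The two similarity measures reported above can be sketched as follows. The abstract does not give the exact definitions, so this assumes each mask is flattened into a 0/1 vector, with IoU computed as intersection over union of foreground pixels and cosine similarity as the normalized dot product of the two vectors. The function names and the toy masks are hypothetical.

```python
import math

def mask_iou(pred, truth):
    """Intersection over union of two flattened 0/1 masks of equal length."""
    inter = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return inter / union if union else 1.0   # two empty masks count as identical

def mask_cosine(pred, truth):
    """Cosine similarity of two flattened 0/1 masks of equal length."""
    dot = sum(p * t for p, t in zip(pred, truth))
    # For 0/1 vectors the squared norm equals the number of foreground pixels.
    norm = math.sqrt(sum(pred)) * math.sqrt(sum(truth))
    return dot / norm if norm else 0.0

def mean(values):
    """Average of per-image scores, giving mIoU or mean cosine similarity."""
    values = list(values)
    return sum(values) / len(values)

pseudo = [1, 1, 1, 0, 0, 0, 1, 0]   # toy pseudo-label mask
truth  = [1, 1, 0, 0, 0, 0, 1, 1]   # toy ground truth mask
iou = mask_iou(pseudo, truth)        # 3 shared pixels / 5 pixels in the union
cos = mask_cosine(pseudo, truth)     # 3 / (2 * 2)
print(f"IoU = {iou:.2f}, cosine = {cos:.2f}")
```

Averaging these per-image scores over a dataset with `mean` would give figures comparable in kind to the reported mIoU and mean cosine similarity, under the stated assumptions.
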
These results indicate that, for large-area UAV operations, it is feasible to replace ground truth labels with bounding-box labels when training deep learning models. This greatly reduces the time cost of manually labeling samples and provides theoretical support for rapidly realizing applications such as counting maize plants at the seedling stage and calculating canopy coverage.
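
As a quick check on the figures quoted in the abstract, the short calculation below derives the relative labeling-time saving and the ratio of BM-DCNN accuracy to supervised accuracy for each backbone. Only the input numbers come from the abstract; the variable names are illustrative.

```python
# Manual annotation time per image (minutes), as reported in the abstract.
box_min, scribble_min, pixel_min = 2.5, 15.8, 32.4

speedup = pixel_min / box_min        # how many times faster box labeling is
saving = 1 - box_min / pixel_min     # fraction of pixel-level labeling time saved

# Average accuracy (%) for the two backbones: BM-DCNN vs. supervised model.
weak = (67.57, 75.37)
supervised = (67.95, 78.52)
ratios = [w / s for w, s in zip(weak, supervised)]

print(f"box labels are {speedup:.2f}x faster, saving {saving:.2%} of the time")
for i, r in enumerate(ratios, start=1):
    print(f"backbone {i}: {r:.2%} of the supervised accuracy")
```

The first backbone gives the 99.44% quoted in the abstract. The second backbone comes out at about 95.99%, which is why 99.44% is the best case rather than a figure that holds for both.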
