Deconvolution-guided model for tomato leaf disease recognition and lesion segmentation

任守纲; 贾馥玮; 顾兴健; 袁培森; 薛卫; 徐焕良

doi:10.11975/j.issn.1002-6819.2020.12.023

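Summary: Current plant leaf disease recognition models are easily disturbed by shadows, occluding objects, and light intensity, and their feature extraction is blind and uncertain. To address these problems, this study constructed a deconvolution-guided VGG network (Deconvolution-Guided VGGNet, DGVGGNet) that performs plant leaf disease classification and lesion segmentation simultaneously. First, VGGNet was trained for disease classification with a multi-class cross-entropy loss to obtain the classification result. Second, reverse fully connected layers were designed to restore the classification result to feature-map form. Third, deconvolution was implemented by combining upsampling with convolution, and skip connections fused multiple features to recover image details. Finally, a small number of lesion annotations were used as supervision, and every pixel was trained with a binary cross-entropy loss to guide the encoder to attend to the true lesion regions. The experimental results show that the model reached a disease recognition accuracy of 99.19%, and that the pixel accuracy and mean intersection over union of lesion segmentation reached 94.66% and 75.36%, respectively, with good robustness under occlusion, weak light, and similar conditions.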
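To illustrate the two training objectives named above, the following minimal Python sketch computes a multi-class cross-entropy for one classification output and a mean per-pixel binary cross-entropy for one lesion mask. The function names, the toy probabilities, and the 2x2 mask are assumptions made for illustration. The source does not state how the two losses are weighted or combined, so they are computed separately here.

```python
import math


def categorical_cross_entropy(probs, label, eps=1e-12):
    """Multi-class cross-entropy for one sample.

    probs: list of class probabilities (for example a softmax output).
    label: integer index of the true class.
    """
    return -math.log(max(probs[label], eps))


def binary_cross_entropy(pred, target, eps=1e-12):
    """Mean per-pixel binary cross-entropy for one image.

    pred:   2-D list of predicted lesion probabilities.
    target: 2-D list of 0/1 ground-truth labels (1 = lesion).
    """
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count


# Toy inputs (assumed): 10 disease classes and one 2x2 lesion mask.
class_probs = [0.91] + [0.01] * 9
cls_loss = categorical_cross_entropy(class_probs, label=0)

pred_mask = [[0.9, 0.2], [0.1, 0.8]]
true_mask = [[1, 0], [0, 1]]
seg_loss = binary_cross_entropy(pred_mask, true_mask)

print(f"classification loss: {cls_loss:.4f}")
print(f"segmentation loss:   {seg_loss:.4f}")
```

In a setting where only a few training images carry pixel-level labels, the segmentation term would be evaluated only for those images, while the classification term applies to every image.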

Abstract: Deep learning has been widely applied to the recognition and segmentation of plant leaf diseases. However, conventional recognition models usually lack transparency and, as end-to-end deep classifiers, are susceptible to shadow, occlusion, and light intensity. To address this drawback, we propose a deconvolution-guided recognition and segmentation model for plant leaf diseases, called Deconvolution-Guided VGGNet (DGVGGNet). DGVGGNet adopts an encoder-decoder architecture with symmetric convolutional and deconvolutional layers, so that disease recognition and lesion segmentation can be carried out simultaneously. The model consists of three main phases: recognition, inversion, and deconvolution. In the recognition phase, VGGNet was fed with a large number of plant leaf disease images and trained with the categorical cross-entropy loss. VGGNet was made up of the convolution layers of VGG16 and 2 fully connected layers, and the weights of the convolution layers were pre-trained on ImageNet. Ten kinds of tomato leaf disease images from the PlantVillage dataset were used in this paper; 30% of the images were used for training and the remaining 70% for testing. On-the-fly data augmentation was also applied during training, including flipping the images and perturbing the original images in brightness and saturation and with salt noise and Gaussian noise. In the inversion phase, two fully connected layers were used to reverse the category prediction vector, and two skip connections reinforced the decoder by adding the vectors from VGGNet's fully connected layers. The resulting feature vector was then reshaped into feature maps and passed to the deconvolution module.
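The inversion phase described above can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the layer sizes, the random weights, the ReLU activation, the helper names, and the exact points at which the two skip vectors are added are all hypothetical, since the abstract does not give them.

```python
import random


def dense(vec, weights, bias):
    """Fully connected layer followed by ReLU (the activation is an assumption)."""
    return [max(0.0, sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: small random weights and zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def add(a, b):
    """Element-wise sum, used here as the skip connection."""
    return [x + y for x, y in zip(a, b)]


def reshape_to_maps(vec, channels, height, width):
    """Reshape a flat vector into channels x height x width nested lists."""
    assert len(vec) == channels * height * width
    it = iter(vec)
    return [[[next(it) for _ in range(width)] for _ in range(height)]
            for _ in range(channels)]


rng = random.Random(0)
NUM_CLASSES, HIDDEN = 10, 16      # toy sizes, not the paper's
C, H, W = 4, 2, 2                 # toy feature-map shape
FLAT = C * H * W

# Stand-ins for vectors produced on the encoder side.
enc_hidden = [rng.random() for _ in range(HIDDEN)]
enc_flat = [rng.random() for _ in range(FLAT)]
class_probs = [0.91] + [0.01] * 9

w1, b1 = random_layer(NUM_CLASSES, HIDDEN, rng)
w2, b2 = random_layer(HIDDEN, FLAT, rng)

hidden = add(dense(class_probs, w1, b1), enc_hidden)   # reverse FC 1 + skip
restored = add(dense(hidden, w2, b2), enc_flat)        # reverse FC 2 + skip
feature_maps = reshape_to_maps(restored, C, H, W)

print(len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0]))
```

The point of the sketch is the data flow: a 10-element class vector is expanded back to a flat feature vector by two dense layers, each reinforced by an added encoder vector, and the result is reshaped so that a convolutional decoder can consume it.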
In the deconvolution phase, the feature maps were fed into the deconvolution module to obtain the segmentation result of the diseased area, with each pixel trained using the binary cross-entropy loss. The deconvolution module consists of upsampling and convolution operations. Five skip connections were used to fuse multiple features and refine the segmentation results. Only a few samples in the training set were given pixel-level lesion labels to supervise the output of the deconvolution module. At the end of the deconvolution module, a reconstruction layer was used to smooth the segmentation edges. To explore the influence of the number of pixel-level labels, 9 and 45 pixel-level labels were used to supervise the segmentation results, yielding the variants referred to as DGVGGNet-9 and DGVGGNet-45, respectively. To simulate natural conditions, different kinds of interference were added to the test data, such as translation, fruit occlusion, soil occlusion, leaf occlusion, and brightness reduction by different percentages. We evaluated the recognition module by comparing the performance of VGGNet, DGVGGNet-9, and DGVGGNet-45 on the different interference datasets. We also evaluated the deconvolution module with 4 metrics, i.e., pixel accuracy (PA), mean pixel accuracy (MPA), mean intersection over union (MIoU), and frequency-weighted intersection over union (FWIoU), against 3 popular semantic segmentation models, i.e., FCN-8s, U-Net, and SegNet. The results show that DGVGGNet-45 achieved the highest recognition accuracy as well as the highest PA, MIoU, and FWIoU among the four segmentation metrics, at 94.66%, 75.36%, and 90.46%, respectively. Compared with VGGNet, the deconvolution module of DGVGGNet-45 guides the recognition module to pay more attention to the actual diseased area, which is effective in improving the segmentation accuracy. The recognition results demonstrate that DGVGGNet is strongly robust under adverse conditions.
Furthermore, DGVGGNet took only 12 ms to identify a single image on a GPU, which can meet real-time requirements.
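The four segmentation metrics reported above have standard definitions in terms of a confusion matrix, and the following Python sketch computes them for a binary lesion mask. The function names and the toy masks are assumptions for illustration. Averaging only over classes that occur in the ground truth is a choice of this sketch, not something stated in the source.

```python
def confusion_matrix(pred, target, num_classes=2):
    """cm[t][p] counts pixels of true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return PA, MPA, MIoU and FWIoU computed from a confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    true_count = [sum(cm[i]) for i in range(n)]                       # ground-truth pixels per class
    pred_count = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted pixels per class
    present = [i for i in range(n) if true_count[i] > 0]              # classes seen in ground truth

    acc = {i: cm[i][i] / true_count[i] for i in present}
    iou = {i: cm[i][i] / (true_count[i] + pred_count[i] - cm[i][i])
           for i in present}

    return {
        "PA": sum(cm[i][i] for i in range(n)) / total,
        "MPA": sum(acc.values()) / len(present),
        "MIoU": sum(iou.values()) / len(present),
        "FWIoU": sum(true_count[i] / total * iou[i] for i in present),
    }


# Toy 2x4 masks (assumed): 1 = lesion pixel, 0 = background.
target_mask = [[0, 0, 0, 1],
               [0, 0, 1, 1]]
pred_mask = [[0, 0, 0, 1],
             [0, 1, 1, 0]]

cm = confusion_matrix(pred_mask, target_mask)
metrics = segmentation_metrics(cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because lesion pixels are usually far fewer than background pixels, PA and FWIoU are dominated by the background class, whereas MPA and MIoU weight both classes equally. This is one reason the reported MIoU can be much lower than PA for the same model.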

Recognition and segmentation model of tomato leaf diseases based on deconvolution-guiding