Abstract:
Crop diseases seriously affect grain yield and quality. Identifying crop diseases promptly and accurately can effectively prevent their spread and reduce losses. Existing crop disease identification methods face two main challenges: 1) noise introduced during image acquisition easily reduces a model's identification accuracy, and 2) the complex backgrounds of disease images captured in natural environments, together with the subtle distinctions between disease areas and those backgrounds, seriously degrade a model's accuracy and generalizability. To address these issues, this paper proposes AFSF-DCT, a crop disease identification method for complex backgrounds that combines adaptive BayesShrink denoising with frequency-spatial domain feature fusion and comprises two parts: disease image denoising and disease identification.

Traditional denoising algorithms tend to lose most image details while removing noise, making them poorly suited to denoising images with complex backgrounds. To solve this problem, we designed an adaptive BayesShrink denoising algorithm (Ad-BayesShrink) based on adaptive global and local thresholds. Ad-BayesShrink minimizes noise interference while retaining more detailed information, reducing the difficulty of extracting disease features. It uses the Daubechies-8 discrete wavelet transform (Db8 DWT) to capture the texture and color details of different disease regions in the frequency domain: large-scale low-frequency sub-bands are denoised with a global threshold, while high-frequency sub-bands are processed with local thresholds to retain image details and edge information. The denoised low-frequency and high-frequency components are then reconstructed by the inverse discrete wavelet transform.

For identification, AFSF-DCT includes a crop disease identification model (FSF-DCT) that uses frequency-spatial feature fusion and dynamic cross-self-attention to identify crop leaf diseases against complex backgrounds. FSF-DCT takes MobileNetV3 as its backbone and comprises frequency-spatial feature mapping and fusion stages. A frequency-spatial domain feature mapping branch (DWT-Bneck), based on the DWT and the inverted residual structure (Bneck), captures multi-scale features from both the frequency and spatial domains. In the frequency-domain branch, a frequency decomposition module (DWFD) uses the 2D DWT and Bottleneck modules to capture the details and texture features of disease images, compensating for the limited ability of spatial-domain information to express global features. The spatial-domain branch, built on the Bneck structure and CBAM (Bneck-CBAM), strengthens FSF-DCT's feature representation in both the channel and spatial dimensions, enabling it to capture long-range dependencies along spatial directions and more precise positional information of disease features, and thus realizing comprehensive spatial feature mapping. To further improve FSF-DCT's nonlinearity and its ability to learn disease features, the Dynamic Shift Max activation function replaces ReLU6 in Bneck-CBAM.
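To make the two-branch design concrete, the following PyTorch sketch pairs a one-level 2D DWT frequency branch with a strided convolutional spatial branch and fuses them with a 1x1 convolution. It is a minimal illustration of the frequency-spatial idea under stated assumptions, not the paper's implementation: a Haar wavelet stands in for Db8 to keep the sketch dependency-free, and the class name, channel widths, and fusion rule are hypothetical.

```python
import torch
import torch.nn as nn

def dwt2d(x):
    """One-level 2D DWT via 2x2 Haar filters (the paper uses Db8; Haar keeps
    this sketch dependency-free). Expects even spatial dimensions."""
    a, b = x[..., 0::2, 0::2], x[..., 0::2, 1::2]
    c, d = x[..., 1::2, 0::2], x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2                       # low-frequency approximation
    highs = torch.cat([(a - b + c - d) / 2,
                       (a + b - c - d) / 2,
                       (a - b - c + d) / 2], 1)    # three high-frequency detail orientations
    return ll, highs

class FreqSpatialBlock(nn.Module):
    """Hypothetical two-branch block: a DWT frequency branch alongside a
    strided convolutional spatial branch, fused by 1x1 convolution."""
    def __init__(self, ch):
        super().__init__()
        self.freq = nn.Conv2d(4 * ch, ch, 1)       # mixes LL + 3 detail bands
        self.spat = nn.Sequential(nn.Conv2d(ch, ch, 3, stride=2, padding=1),
                                  nn.BatchNorm2d(ch), nn.Hardswish())
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x):
        ll, highs = dwt2d(x)                            # frequency-domain features
        f = self.freq(torch.cat([ll, highs], 1))
        s = self.spat(x)                                # spatial-domain features
        return self.fuse(torch.cat([f, s], 1))          # aligned to one scale, then fused

# A 16-channel 64x64 input yields a fused 16-channel 32x32 feature map:
# FreqSpatialBlock(16)(torch.randn(1, 16, 64, 64)).shape  # (1, 16, 32, 32)
```

Both branches reduce the spatial resolution by a factor of two, so their outputs align to a common scale before fusion, mirroring the role of the alignment convolutions described next.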
Finally, a dynamic cross-self-attention feature fusion network (MDCS-DF) fuses the multi-scale frequency-spatial domain features and sharpens FSF-DCT's focus on disease features. MDCS-DF uses a series of horizontal and vertical convolutions to align the frequency-spatial features to a uniform scale, and its dynamic cross-self-attention assigns varying weights to disease features according to their attributes, efficiently fusing multi-scale frequency-spatial features while strengthening FSF-DCT's attention to disease regions and reducing the influence of complex backgrounds on identification.

Denoising experiments showed that Ad-BayesShrink outperformed VisuShrink and SUREShrink, with a higher PSNR (32.26 dB), a higher SSIM (0.98), and a lower MSE (125.80). These results demonstrate that Ad-BayesShrink effectively removes image noise while preserving as much detail as possible, such as texture and edges. Ad-BayesShrink still achieved a PSNR of 31.39 dB under low-light conditions, indicating that it can cope with the effect of low light on disease images in practical applications. Experiments on a self-built dataset showed that FSF-DCT achieved an identification accuracy of 99.20% and a precision of 98.89%, outperforming most classical and state-of-the-art models. These results highlight FSF-DCT's ability to accurately identify crop leaf diseases against the complex backgrounds of natural environments. Moreover, FSF-DCT's inference time was only 6 ms, making it suitable for deployment on resource-constrained devices. Generalizability experiments further showed that FSF-DCT achieved the highest identification accuracies on the PlantVillage (99.90%) and AI Challenger 2018 (90.75%) datasets compared with three mainstream models (MobileNetV3, Swin Transformer, and Vision Transformer). Its precision (99.88% and 90.77%), recall (99.85% and 90.79%), and F1-score (99.81% and 90.88%) were also the best, showing a clear advantage over similarly sized models. FSF-DCT improved accuracy by more than 3.28 percentage points over the original MobileNetV3 and, compared with Swin Transformer and Vision Transformer, improved identification accuracy by at least 4 percentage points with fewer FLOPs and parameters. Results on the open-source datasets confirm FSF-DCT's strong generalizability in identifying multiple crop diseases with complex feature distributions. AFSF-DCT can therefore be expected to reduce image noise and to identify crop leaf diseases in complex backgrounds quickly and accurately.
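As a rough illustration of the wavelet-thresholding scheme that Ad-BayesShrink builds on, and of the PSNR/SSIM metrics reported above, the sketch below applies standard BayesShrink soft thresholding to the high-frequency Db8 sub-bands and scores the result. Ad-BayesShrink's adaptive global and local threshold rules are not reproduced here; the function names, the fixed three-level decomposition, and leaving the low-frequency band untouched are simplifying assumptions.

```python
import numpy as np
import pywt
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def bayes_threshold(subband, sigma_noise):
    """Classic BayesShrink threshold sigma_n^2 / sigma_x for one sub-band."""
    sigma_y2 = np.mean(subband ** 2)                             # variance of noisy coefficients
    sigma_x = np.sqrt(max(sigma_y2 - sigma_noise ** 2, 1e-12))   # signal std estimate
    return sigma_noise ** 2 / sigma_x

def denoise_db8(img, levels=3):
    """Soft-threshold the high-frequency sub-bands of a three-level Db8 DWT."""
    coeffs = pywt.wavedec2(img, "db8", level=levels)
    # Robust noise estimate from the finest diagonal sub-band (median rule).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    out = [coeffs[0]]   # low-frequency band kept as-is here (Ad-BayesShrink would
                        # instead apply its adaptive global threshold to this band)
    for details in coeffs[1:]:
        out.append(tuple(pywt.threshold(d, bayes_threshold(d, sigma), mode="soft")
                         for d in details))
    # Inverse DWT reconstruction, cropped back to the input size.
    return pywt.waverec2(out, "db8")[: img.shape[0], : img.shape[1]]

# Example scoring against a clean reference, for float images in [0, 255]:
# denoised = denoise_db8(noisy)
# psnr = peak_signal_noise_ratio(clean, denoised, data_range=255)
# ssim = structural_similarity(clean, denoised, data_range=255)
```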