A crop disease identification method based on adaptive BayesShrink and frequency-spatial feature fusion

    • Abstract: Two factors limit the performance of existing crop disease identification methods: 1) substantial noise introduced during image acquisition, which easily reduces the accuracy of the identification network; and 2) the complex backgrounds of disease images captured in real environments, where disease regions differ only subtly from the background, which seriously affects the model's accuracy and generalizability. To address these issues, this study proposes a crop disease identification method based on adaptive BayesShrink and frequency-spatial feature fusion (adaptive BayesShrink and frequency-spatial domain features fusion, AFSF-DCT). First, an adaptive BayesShrink algorithm (Ad-BayesShrink) was designed to reduce noise interference while retaining more detail, lowering the difficulty of extracting disease features for the identification network. Then, a crop disease identification model based on frequency-spatial feature fusion and a dynamic cross-self-attention mechanism (crop leaf disease identification model based on frequency-spatial features fusion and dynamic cross-self-attention, FSF-DCT) was proposed. To achieve comprehensive frequency-spatial feature mapping, a frequency-spatial feature mapping (DWT-Bneck) branch based on the discrete wavelet transform (DWT) and the inverted residual structure (bneck) was designed to capture multi-scale frequency-spatial features of disease images. In the frequency-domain branch, a 2D DWT-based frequency-feature decomposition module (2D DWT-based frequency-features decomposition module, DWFD) was designed to capture the detail and texture features of disease images, compensating for the insufficiency of spatial-domain features in expressing global information. In the spatial-domain branch, CBAM (convolutional block attention module) and the Dynamic Shift Max activation function were introduced into the bneck to achieve comprehensive spatial feature mapping. Finally, a dynamic cross-self-attention feature fusion network (multi-scale features fusion network based on dynamic cross-self-attention, MDCS-DF) was designed to fuse frequency-spatial features and enhance the network's attention to disease features. Results show that Ad-BayesShrink achieved the highest peak signal-to-noise ratio of 35.78, outperforming VisuShrink and SUREShrink. FSF-DCT achieved identification accuracies of 99.20%, 99.90%, and 90.75% on a self-built dataset and two open-source datasets, respectively, with a small parameter count (7.48 M) and few floating-point operations (4.62 G), outperforming most current mainstream identification models. AFSF-DCT can serve as a model reference for the rapid and accurate detection of crop leaf diseases in complex backgrounds.

       

      Abstract: Crop diseases seriously affect grain yield and quality. Timely and accurate identification of crop diseases can effectively prevent disease spread and reduce losses. Existing crop disease identification methods face two main challenges: 1) substantial noise introduced during image acquisition easily reduces the model's identification accuracy; 2) the complex backgrounds of disease images captured in natural environments, together with the subtle distinctions between disease areas and backgrounds, seriously affect the model's accuracy and generalizability. To address these issues, this paper proposes a disease identification method based on adaptive BayesShrink and frequency-spatial domain feature fusion (AFSF-DCT) for crop disease identification in complex backgrounds, comprising two parts: disease image denoising and disease identification. Traditional denoising algorithms tend to lose most image details while removing noise and are therefore poorly suited to denoising images with complex backgrounds. To solve this problem, an adaptive BayesShrink denoising algorithm (Ad-BayesShrink) based on adaptive global and local thresholds was designed. Ad-BayesShrink minimizes noise interference while retaining more detailed information, reducing the model's difficulty in extracting disease features. It uses the Daubechies-8 discrete wavelet transform (Db8 DWT) to capture the texture and color details of different disease regions in the frequency domain. Large-scale low-frequency sub-bands are denoised with a global threshold, while high-frequency sub-bands are processed with local thresholds to retain image details and edge information. The denoised low-frequency and high-frequency components are then reconstructed with the inverse discrete wavelet transform. On top of this, AFSF-DCT includes a crop disease identification model (FSF-DCT) that uses frequency-spatial feature fusion and dynamic cross-self-attention to identify crop leaf diseases against complex backgrounds.
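The wavelet-shrinkage step described above can be sketched as follows. This is a minimal, NumPy-only illustration of standard BayesShrink with a single-level Haar transform; the wavelet choice (the paper uses Db8), the single decomposition level, and the uniform per-sub-band thresholding are all simplifying assumptions, whereas Ad-BayesShrink applies adaptive global/local thresholds.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT; x must have even height and width."""
    lo = (x[0::2] + x[1::2]) / 2.0        # row-pair averages
    hi = (x[0::2] - x[1::2]) / 2.0        # row-pair differences
    ll = (lo[:, 0::2] + lo[:, 1::2]) / 2.0
    lh = (lo[:, 0::2] - lo[:, 1::2]) / 2.0
    hl = (hi[:, 0::2] + hi[:, 1::2]) / 2.0
    hh = (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (perfect reconstruction)."""
    lo = np.empty((ll.shape[0], ll.shape[1] * 2))
    hi = np.empty_like(lo)
    lo[:, 0::2], lo[:, 1::2] = ll + lh, ll - lh
    hi[:, 0::2], hi[:, 1::2] = hl + hh, hl - hh
    x = np.empty((lo.shape[0] * 2, lo.shape[1]))
    x[0::2], x[1::2] = lo + hi, lo - hi
    return x

def bayes_threshold(band, sigma_n):
    """BayesShrink threshold T = sigma_n^2 / sigma_x for one sub-band."""
    sigma_x = np.sqrt(max(band.var() - sigma_n ** 2, 1e-12))
    return sigma_n ** 2 / sigma_x

def soft(band, t):
    """Soft thresholding: shrink coefficients toward zero by t."""
    return np.sign(band) * np.maximum(np.abs(band) - t, 0.0)

def denoise(img):
    ll, lh, hl, hh = haar_dwt2(img)
    # robust noise estimate from the diagonal detail sub-band
    sigma_n = np.median(np.abs(hh)) / 0.6745
    # threshold only the high-frequency sub-bands; LL keeps coarse structure
    lh, hl, hh = (soft(b, bayes_threshold(b, sigma_n)) for b in (lh, hl, hh))
    return haar_idwt2(ll, lh, hl, hh)
```

Leaving the LL sub-band untouched is what preserves the coarse leaf structure, while shrinkage in LH/HL/HH suppresses noise that concentrates in the detail coefficients.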
FSF-DCT takes MobileNetV3 as its backbone and comprises frequency-spatial feature mapping and fusion stages. A frequency-spatial domain feature mapping (DWT-Bneck) branch based on the DWT and the inverted residual structure (Bneck) was proposed to capture multi-scale features from both the frequency and spatial domains. In the frequency-domain feature mapping branch, a frequency decomposition module (DWFD) uses 2D DWT and Bottleneck modules to capture the details and texture features of disease images, compensating for the insufficiency of spatial-domain information in expressing global features. Built on the Bneck structure and CBAM (Bneck-CBAM), the spatial-domain branch enhances FSF-DCT's capacity for feature representation in both the channel and spatial dimensions. It enables FSF-DCT to capture long-range dependencies along spatial directions and more precise positional information of disease features, thus realizing comprehensive spatial feature mapping. To further improve FSF-DCT's nonlinearity and ability to learn disease features, the Dynamic Shift Max activation function was embedded in Bneck-CBAM in place of ReLU6. Finally, a dynamic cross-self-attention feature fusion network (MDCS-DF) was designed to fuse multi-scale frequency-spatial domain features and strengthen FSF-DCT's focus on disease features. MDCS-DF uses a series of horizontal and vertical convolutions to align the frequency- and spatial-domain features to a uniform scale. Dynamic cross-self-attention assigns different weights to disease features according to their attributes, efficiently fusing multi-scale frequency-spatial features while enhancing FSF-DCT's focus on disease regions and reducing the influence of complex backgrounds on disease identification. Denoising experiments showed that Ad-BayesShrink outperformed VisuShrink and SUREShrink with a higher PSNR (32.26), a higher SSIM (0.98), and a lower MSE (125.80).
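The cross-attention fusion idea can be illustrated with a toy scaled-dot-product example in NumPy, where queries come from spatial-domain tokens and keys/values from frequency-domain tokens. The projection matrices `wq`/`wk`/`wv` and the token shapes here are hypothetical; the actual MDCS-DF additionally includes the alignment convolutions and dynamic weighting described above.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(spatial_tokens, freq_tokens, wq, wk, wv):
    """Spatial queries attend over frequency-domain keys/values."""
    q = spatial_tokens @ wq                      # (n_spatial, d)
    k = freq_tokens @ wk                         # (n_freq, d)
    v = freq_tokens @ wv                         # (n_freq, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])      # scaled dot products
    attn = softmax(scores, axis=-1)              # weights over freq tokens
    return attn @ v, attn

# toy usage with random tokens and small random projections
rng = np.random.default_rng(0)
spatial = rng.standard_normal((16, 32))
freq = rng.standard_normal((16, 32))
wq, wk, wv = (0.1 * rng.standard_normal((32, 32)) for _ in range(3))
fused, attn = cross_attention(spatial, freq, wq, wk, wv)
```

Each spatial token thus receives a weighted mixture of frequency-domain features, which is the mechanism by which the frequency branch's texture detail supplements the spatial branch.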
These results demonstrate that Ad-BayesShrink effectively removes image noise while preserving as much detailed information as possible, such as image texture and edges. Ad-BayesShrink still achieved a high PSNR (31.39) under low-light conditions, indicating that it can effectively handle the effect of low light on disease images in practical applications. Experiments on the self-built dataset showed that FSF-DCT achieved an identification accuracy of 99.20% and a precision of 98.89%, outperforming most classical and state-of-the-art models. These results highlight FSF-DCT's superior ability to accurately identify crop leaf diseases against the complex backgrounds of natural environments. Meanwhile, FSF-DCT's inference time was only 6 ms, making it suitable for deployment on resource-constrained devices. Generalizability experiments further showed that FSF-DCT achieved the highest identification accuracies on the PlantVillage (99.90%) and AI Challenger 2018 (90.75%) datasets compared with three mainstream models (MobileNetV3, Swin Transformer, and Vision Transformer). Its precision (99.88% and 90.77%), recall (99.85% and 90.79%), and F1-score (99.81% and 90.88%) were also the best, showing a significant advantage over similarly sized models. FSF-DCT improved accuracy by more than 3.28 percentage points over the original MobileNetV3. Compared with Swin Transformer and Vision Transformer, FSF-DCT improved identification accuracy by at least 4 percentage points with fewer FLOPs and parameters. Results on the open-source datasets confirm FSF-DCT's strong generalizability in identifying multiple crop diseases with complex feature distributions. AFSF-DCT is expected to reduce image noise and to identify crop leaf diseases in complex backgrounds quickly and accurately.
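For reference, the PSNR and MSE figures quoted above are presumably the standard definitions, related by PSNR = 10·log10(MAX² / MSE). A minimal sketch, assuming 8-bit images (peak value 255):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB for images with maximum value `peak`."""
    m = mse(a, b)
    return float("inf") if m == 0.0 else 10.0 * np.log10(peak ** 2 / m)
```

Note that PSNR depends on the assumed peak value (255 for 8-bit, 1.0 for normalized images), so reported figures are only comparable under the same convention.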

       
