Abstract
Diseases and pests have caused huge losses to agricultural production in recent years. According to the latest statistics from the Food and Agriculture Organization of the United Nations (FAO), food losses resulting from pests and diseases can exceed 10% worldwide, and even reach 30% in local areas. Early identification of pests and diseases is therefore in high demand for early warning, prevention, and control. However, accurate identification of large-scale pests and diseases remains a great challenge, due to the wide variety of pests and diseases and their trait characteristics, such as the high inter-class similarity and high intra-class variability of external morphology. This study aims to effectively extract and characterize the subtle features between categories for the large-scale, multi-category pest classification and recognition task. First, a typical fine-grained classification network was established using the Convolutional Block Attention Module (CBAM), a visual attention module with strong performance in benchmark networks. Key features were extracted and represented in both the channel and spatial dimensions by blending across the feature channel and spatial domains: the feature-dimension information with a high contribution to the pest classification task was extracted in the channel domain, while the location-dimension information with a high contribution was extracted in the spatial domain, so that the benchmark network achieved fine-grained discriminative enhancement and representation. Secondly, multi-scale feature extraction and representation are particularly important for fine-grained classification tasks. Since the individuals to be recognized in such tasks vary in size and shape, whereas the receptive field of the convolutional kernels in a given layer is fixed, feature extraction suffers when the receptive field does not match the individual.
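The channel- and spatial-attention steps described above can be sketched as follows. This is a minimal NumPy illustration of the CBAM-style mechanism, not the paper's implementation: the random weights stand in for learned parameters, and the layer sizes and reduction ratio are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, reduction=4):
    """Weight each channel by its contribution (channel domain).
    feat: (C, H, W) feature map."""
    c = feat.shape[0]
    avg = feat.mean(axis=(1, 2))           # global average pooling, (C,)
    mx = feat.max(axis=(1, 2))             # global max pooling, (C,)
    # Shared two-layer MLP; random weights stand in for learned ones.
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)
    scale = sigmoid(mlp(avg) + mlp(mx))    # per-channel weights, (C,)
    return feat * scale[:, None, None]

def spatial_attention(feat):
    """Weight each spatial location by its contribution (spatial domain)."""
    avg = feat.mean(axis=0, keepdims=True)  # channel-wise mean map, (1, H, W)
    mx = feat.max(axis=0, keepdims=True)    # channel-wise max map, (1, H, W)
    # A learned convolution would fuse the two maps; a fixed average
    # stands in for it in this sketch.
    scale = sigmoid(0.5 * (avg + mx))       # per-location weights, (1, H, W)
    return feat * scale

x = rng.standard_normal((8, 16, 16))        # toy (C, H, W) feature map
y = spatial_attention(channel_attention(x)) # channel domain, then spatial
print(y.shape)                              # (8, 16, 16)
```

The refined feature map keeps the input shape, so the module can be dropped between existing convolutional stages of the benchmark network.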
Therefore, a cross-layer non-local module was introduced into the benchmark model to select one deeper layer and multiple shallow layers among the feature extraction layers of the benchmark network. Spatial response relationships were then established between them to learn more multi-scale features, improving the feature extraction and representation capability of the benchmark model. Thirdly, a pest and disease network (PD-Net) model was built for large-scale, multi-category fine-grained pest and disease recognition. A total of 800 images were collected, covering general and severe cases of cherry powdery mildew, grape black rot, tomato spotted wilt, and citrus Huanglongbing (yellow dragon disease). Finally, a pest and disease identification APP was developed using the PD-Net model on a dataset of 61 disease categories and 102 pest categories. Accuracies of 88.617% and 98.922% were achieved for Top1 and Top2 recognition on the disease dataset, respectively, and 74.668% and 83.298% for Top1 and Top2 recognition on the pest dataset. Compared with the AlexNet, VGG16, GoogleNet, Inception-v3, and DenseNet121 deep learning models, the Top1 and Top2 recognition accuracies were improved by 1.748 to 4.331 percentage points and 1.469 to 6.076 percentage points on the disease dataset, respectively, and by 1.906 to 8.122 percentage points and 1.869 to 6.644 percentage points on the pest dataset, respectively. Meanwhile, the loss value, accuracy, precision, recall, and F1 score were all improved over the compared models. The experimental data verified the effectiveness of the PD-Net model.
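The cross-layer spatial response relationship can be sketched as a non-local attention step between a deep feature map (queries) and a shallow feature map (keys and values). This NumPy sketch is illustrative only: the random projection matrices stand in for the learned 1x1 convolutions, and the embedding size and feature shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def cross_layer_nonlocal(deep, shallow, d_embed=8):
    """Fuse shallow-layer context into a deep feature map.
    deep: (C1, H1, W1) query features from a deeper layer.
    shallow: (C2, H2, W2) key/value features from a shallower layer."""
    c1, h1, w1 = deep.shape
    c2, h2, w2 = shallow.shape
    q = deep.reshape(c1, -1).T           # deep positions as rows, (H1*W1, C1)
    kv = shallow.reshape(c2, -1).T       # shallow positions as rows, (H2*W2, C2)
    # Random projections stand in for learned 1x1 convolutions.
    wq = rng.standard_normal((c1, d_embed)) * 0.1
    wk = rng.standard_normal((c2, d_embed)) * 0.1
    wv = rng.standard_normal((c2, c1)) * 0.1
    # Pairwise spatial responses between every deep and shallow position.
    attn = (q @ wq) @ (kv @ wk).T        # (H1*W1, H2*W2)
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)  # softmax over shallow positions
    out = attn @ (kv @ wv)               # aggregated shallow context, (H1*W1, C1)
    return deep + out.T.reshape(c1, h1, w1)  # residual connection

deep = rng.standard_normal((16, 8, 8))       # toy deeper-layer features
shallow = rng.standard_normal((8, 32, 32))   # toy shallower-layer features
fused = cross_layer_nonlocal(deep, shallow)
print(fused.shape)                           # (16, 8, 8)
```

Because the attention spans all shallow positions regardless of resolution, the deep layer can pick up multi-scale detail that its own fixed receptive field would miss; repeating the step against several shallow layers gives the multi-layer variant described above.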
Consequently, the fine-grained recognition model can be expected to find wide application in large-scale, multi-category pest and disease identification.