Fine-grained identification of crop pests using an enhanced ConvNeXt model
Graphical Abstract
Abstract
Precise and rapid identification of diverse pest species can greatly contribute to crop disease prevention and control in modern agriculture. However, identification accuracy is frequently limited by the morphological variation that insects undergo across growth stages: the same pest can display distinctly different morphological features at different developmental stages, while different pests can exhibit similar morphologies within the same stage. Both manual identification and conventional machine learning approaches often struggle to meet the demands of such complex recognition. In this study, fine-grained identification of crop pests was performed using an enhanced ConvNeXt model. A series of experiments was carried out on a large-scale pest dataset covering the morphological diversity of insects. The dataset contains 102 pest categories and 51,670 images representing 369 classes of pests at different stages, with a focus on the full life cycle of each insect. Each image was precisely labelled with the pest species and its developmental stage, providing a robust foundation for subsequent morphological studies. Furthermore, ConvNeXt V2 was adopted as the baseline model, and a multi-stage co-supervision strategy was introduced to optimize the structure for both the feature variability of the same species across different stages and the significant inter-species differences. Two independent neural network streams were constructed during optimization: species-specific features were learned by the feature extraction module within the ConvNeXt Block, while shared features were derived through the first residual block of ResNet50 and propagated to the subsequent layers. A feature fusion module was then employed to effectively integrate these shared and species-specific features.
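The abstract does not give implementation details for the two streams or the fusion module. As an illustrative sketch only, the overall data flow might resemble the following NumPy pseudocode, where the embedding sizes, the random-projection stand-ins for the ConvNeXt Block and ResNet50 extractors, the concatenation-plus-projection fusion, and all names are assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def species_stream(x):
    # Stand-in for the ConvNeXt-Block feature extractor that learns
    # species-specific features; a random projection plus nonlinearity.
    W = rng.standard_normal((x.shape[-1], 256)) * 0.01
    return np.maximum(x @ W, 0.0)

def shared_stream(x):
    # Stand-in for the first residual block of ResNet50 that produces
    # features shared across developmental stages.
    W = rng.standard_normal((x.shape[-1], 256)) * 0.01
    return np.maximum(x @ W, 0.0)

def fuse(f_specific, f_shared):
    # Assumed fusion design: concatenate the two streams and project
    # back to a joint embedding used for classification.
    f = np.concatenate([f_specific, f_shared], axis=-1)
    W = rng.standard_normal((f.shape[-1], 256)) * 0.01
    return f @ W

x = rng.standard_normal((4, 512))  # batch of 4 pooled image features
fused = fuse(species_stream(x), shared_stream(x))
print(fused.shape)  # (4, 256)
```

In practice both streams would be trained convolutional backbones and the fusion projection would be learned; the sketch only shows how a shared and a species-specific representation of the same input can be merged into one feature vector.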
A deep feature fusion was also designed to enhance overall recognition performance. Moreover, pronounced morphological differences among pest species lead to varying spatial locations of the discriminative regions, so a spatial attention module was further introduced to improve the model's sensitivity to the spatial distribution of the target. Comparison experiments were conducted on the large public dataset IP102. The results demonstrate that the accuracy and F1 score of the model were improved by 3.67 and 2.49 percentage points, respectively, compared with state-of-the-art models. The corresponding metrics were improved by 5.07 and 5.48 percentage points on the Age AP dataset, and by 2.06 and 0.59 percentage points on the CPB dataset. Ablation experiments show that accuracy increased by 3.81 percentage points over the original baseline when only the multi-stage co-supervision was adopted, the spatial attention module alone raised accuracy by an additional 2.76 percentage points, and the combined strategies ultimately improved accuracy by 5.77 percentage points over the original model. Significant improvements were achieved in feature extraction and spatial information capture across multiple pest stages, and the model outperformed similar models, presenting a reliable scheme for crop pest identification in smart agriculture.
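The spatial attention module is not specified in the abstract. A common design that matches the stated goal (reweighting locations on the feature map) is CBAM-style spatial attention; the following NumPy sketch assumes that design, with a fixed equal-weight sum standing in for the learned convolution that would normally mix the pooled maps:

```python
import numpy as np

def spatial_attention(feat):
    """CBAM-style spatial attention sketch (an assumed design):
    pool across channels, combine the maps, gate each location."""
    # feat: (C, H, W) feature map
    avg_map = feat.mean(axis=0, keepdims=True)  # (1, H, W)
    max_map = feat.max(axis=0, keepdims=True)   # (1, H, W)
    # A learned 7x7 convolution would normally mix these two maps;
    # a fixed equal-weight sum stands in for it here.
    mixed = 0.5 * (avg_map + max_map)
    attn = 1.0 / (1.0 + np.exp(-mixed))         # sigmoid gate in (0, 1)
    return feat * attn                          # broadcast over channels

rng = np.random.default_rng(1)
fmap = rng.standard_normal((8, 14, 14))
out = spatial_attention(fmap)
print(out.shape)  # (8, 14, 14)
```

Because the gate lies strictly in (0, 1), the module can only attenuate responses, emphasizing locations where the pooled channel statistics are strong, which is the behavior the abstract attributes to the spatial attention branch.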