LIANG Weijian, GUO Qingwen, WANG Chuntao, et al. Few-shot pest classification using spatial-attention-enhanced ResNeSt-101 network and transfer-based meta-learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2024, 40(6): 285-297. DOI: 10.11975/j.issn.1002-6819.202304013

    Few-shot pest classification using spatial-attention-enhanced ResNeSt-101 network and transfer-based meta-learning

    • Pest recognition is a key foundation of pest management. Previous studies have used image classification to recognize pests automatically. Because sufficient images are difficult to obtain for newly emerged pest classes, developing a pest classifier from only a few labeled images is an interesting and challenging problem. Some existing works address this problem with the matching network framework, which uses meta-learning to avoid retraining deep networks. In these works, however, the backbone networks have limited feature extraction ability, and the meta-learning algorithms either lack a good weight initialization strategy or may cause network collapse. To close this gap, this study proposes a few-shot pest classifier that combines a spatial-attention-enhanced version of ResNeSt-101 with a transfer-based meta-learning algorithm. First, ResNeSt-101 is enhanced with a spatial attention block to better extract image features. The candidate locations for the spatial attention block are before the max pooling layer in the first stage of ResNeSt-101 and/or at the end of stages 2-4; numerical simulation results identify the first stage as the optimal location. Subsequently, the network weights are initialized by transfer learning and then optimized by meta-learning. To avoid network collapse, the meta-learning algorithm uses the normalized temperature-scaled cross-entropy loss function instead of the triplet loss function. Finally, pest classification is achieved by computing similarities between the deep features of query and support images. The proposed method is evaluated on two elaborately constructed pest image datasets, AD0 and MIP50, in terms of N-way K-shot accuracy and the time of per-image processing (TPIP). 
These two pest image datasets are constructed as follows. Images in the public pest image datasets IP102 and D0 are first cleaned by eliminating those with class ambiguities caused by categorization based on English pest names; images of the egg, larva, and pupa stages are removed, while those of adults are retained. Given limited human resources and time, only 50 classes are then selected from the cleaned IP102 dataset to construct the MIP50 pest image dataset. Finally, pest images are searched on the Internet by their Latin pest names, yielding the AD0 pest image dataset. MIP50 includes 16424 adult pest images from 50 categories of IP102, and AD0 consists of 17112 adult pest images from all 40 categories of D0. Extensive experimental results show that when the test set contains only a few unseen pest categories, the proposed method achieves a 5-way 10-shot accuracy of 96.37% on the AD0 dataset and 76.91% on the MIP50 dataset. When the test set contains several unseen and seen pest classes, the proposed method under the 5-way 10-shot setting achieves an accuracy of 93.73% on the AD0 dataset and 90.60% on the MIP50 dataset. The TPIP of the proposed method is approximately 0.44 ms, which satisfies the real-time pest recognition requirement in most scenarios. In addition, a series of comparative and ablation experiments support the effectiveness of the proposed method in few-shot pest classification, indicating that it is promising in practical applications. Nevertheless, several issues need further investigation in future work. 
For example, increasing the way number would probably lower the classification accuracy; the similarity metric used in this work could be replaced by one that better characterizes the complex relationships between samples in the support set and those in the query set; and the proposed scheme remains to be applied to practical pest recognition in the field.
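
The similarity-based classification step and the normalized temperature-scaled cross-entropy (NT-Xent) loss described above can be illustrated with a small sketch. This is not the authors' implementation. It assumes cosine similarity as the similarity measure, averages the K support features of each class into a prototype (one possible aggregation, not confirmed by the abstract), and uses an arbitrary temperature of 0.5. All function names and the toy 2-D features are hypothetical; in the paper the features would come from the enhanced ResNeSt-101 backbone.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def class_prototypes(support):
    """Average the K support features of each class (assumed aggregation)."""
    prototypes = {}
    for label, features in support.items():
        k = len(features)
        prototypes[label] = [sum(column) / k for column in zip(*features)]
    return prototypes


def classify(query, support):
    """Return the label whose prototype is most similar to the query feature."""
    prototypes = class_prototypes(support)
    return max(prototypes,
               key=lambda label: cosine_similarity(query, prototypes[label]))


def nt_xent_loss(anchor, positive, negatives, temperature=0.5):
    """NT-Xent loss for a single anchor.

    Computes -log of the softmax probability assigned to the positive,
    where the logits are cosine similarities divided by the temperature.
    The temperature value here is an illustrative assumption.
    """
    pos = math.exp(cosine_similarity(anchor, positive) / temperature)
    neg = sum(math.exp(cosine_similarity(anchor, n) / temperature)
              for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy 2-way 2-shot episode with made-up 2-D features.
support = {
    "aphid": [[1.0, 0.1], [0.9, 0.2]],
    "moth": [[0.1, 1.0], [0.2, 0.8]],
}
query = [0.8, 0.3]
predicted = classify(query, support)
print(predicted)  # aphid
```

In this toy episode the query lies much closer in angle to the "aphid" prototype than to the "moth" prototype, so it is assigned to "aphid". The loss is small when the positive sample is the more similar one and grows when a negative sample is more similar to the anchor, which is the behavior the meta-learning stage would rely on.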