Abstract:
Pak choi (Chinese cabbage) is one of the most popular vegetables in China, with a wide planting area. Rapid detection and accurate identification of Pak choi insect pests are of great significance for ensuring a safe vegetable supply. However, in real vegetable field environments, insect pests often appear in specific areas of leaves, and strong illumination and background interference pose a great challenge to detection efficiency and accuracy. In this study, an improved model, YOLOPC, was proposed to identify Pak choi pests based on the YOLOv5s network framework. Firstly, a CBAM (Convolutional Block Attention Module) was placed at the input end of each CBS block (Convolution layer + Batch Normalization layer + SiLU activation layer). The resulting CBAM-CBS structure dynamically adjusts the weight of each channel and spatial position in the feature map. Upsampling and 1×1 convolution operations were used to adjust the size and channel number of the feature maps, in order to fuse features at different levels and, at the same time, enhance the feature representation of the model. The loss function was improved to better suit the accuracy requirements of bounding box regression. Atrous (dilated) convolution was used to enlarge the receptive field of the network, in order to better learn the contextual information of the image. Specifically, the improvement strategy included the following three aspects: 1) Spatial and channel attention mechanisms were added to feature extraction, so that the network could better learn the diamondback moth and leafminer pests of Pak choi; 2) The alpha-IoU loss function was used to replace the CIoU loss in YOLOv5s.
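The abstract does not give the paper's exact loss formulation or alpha value, so the following is only a minimal sketch of the core alpha-IoU idea: raising the IoU term to a power α > 1 (α = 3 is the commonly cited default) to up-weight well-localized boxes. The box format and the simple `L = 1 − IoU^α` form are assumptions for illustration, not the authors' full implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def alpha_iou_loss(box_a, box_b, alpha=3.0):
    # Basic alpha-IoU form: L = 1 - IoU**alpha. With alpha > 1 the gradient
    # is relatively larger for high-IoU boxes, sharpening regression accuracy
    # for targets at varied scales and aspect ratios.
    return 1.0 - iou(box_a, box_b) ** alpha
```

A perfectly overlapping prediction gives a loss of 0; as overlap shrinks, the loss approaches 1 faster than the plain IoU loss would.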
Different levels of bounding box regression accuracy were thus achieved for the diamondback moth and leafminer targets at different scales and aspect ratios; 3) Atrous Spatial Pyramid Pooling (ASPP) was introduced to enlarge the receptive field of the network, in order to better learn the contextual information of the image. The test results showed that the mean average precision (mAP) of the YOLOPC model reached 91.4%, an increase of 12.9 percentage points over the original YOLOv5s model. The frame rate was 58.82 frames/s, an increase of 11.2 frames/s, or 23.53%. The number of parameters was only 14.4 M, a decrease of 5 M, or 25.78%. Furthermore, compared with the SSD, Faster R-CNN, YOLOv3, YOLOv7, and YOLOv8 detection models, the mean average precision of the YOLOPC model was 20.1, 24.6, 14, 13.4, and 13.3 percentage points higher, respectively. Significant advantages were achieved in precision, recall, frame rate, and number of parameters. This improved model can provide technical support for the rapid and accurate detection of Pak choi pests under complex backgrounds. In short, both the accuracy and the real-time performance of the improved model were significantly optimized for Pak choi pest detection, and the YOLOPC model showed significantly improved parameter count, detection speed, and accuracy compared with current mainstream models. Therefore, camera equipment can be deployed to identify Pak choi pests and issue warnings in field planting. The improved model can also be applied on plant protection drones to realize precise variable-rate spraying and targeted pest control. The findings provide a strong reference for saving pesticides and effectively reducing environmental pollution.
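The receptive-field enlargement that atrous (dilated) convolution and ASPP provide can be illustrated with a toy 1-D sketch: a kernel of size k with dilation d spans (k − 1)·d + 1 inputs while keeping the same number of weights. ASPP runs several such branches with different dilation rates in parallel and fuses them; the function below is only an assumed, simplified illustration of one branch, not the paper's network code.

```python
def dilated_conv1d(x, kernel, dilation=1):
    """'Valid' 1-D convolution whose kernel taps are spaced `dilation` apart.

    With kernel size k and dilation d, each output sees (k - 1) * d + 1
    inputs, so the receptive field grows without adding parameters.
    """
    span = (len(kernel) - 1) * dilation  # receptive field minus one
    out = []
    for i in range(len(x) - span):
        out.append(sum(kernel[k] * x[i + k * dilation] for k in range(len(kernel))))
    return out
```

With a size-3 kernel, dilation 1 covers 3 neighboring inputs, while dilation 3 covers a 7-input window, which is the mechanism ASPP exploits at multiple rates to capture image context.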