Abstract:
Pepper is one of the most widely planted vegetables in China. Currently, fresh pepper production tasks such as field management and harvesting are labor-intensive and inefficient, and the pepper industry is therefore transitioning towards mechanized and intelligent production. Fast and accurate detection of pepper fruits in natural environments is of great significance for automatic pepper picking. To address the poor adaptability and low detection accuracy of detection models under different light and occlusion conditions, this study proposed YOLOX_Pepper, an improved pepper fruit detection model based on YOLOX. Firstly, a CA (coordinate attention) mechanism with efficient channel fusion was added to the YOLOX feature fusion network, which enhanced the model's ability to capture key features of pepper fruits. Secondly, the convolution module in the feature fusion module of the backbone network was replaced with the deformable convolution DCNv2 (Deformable ConvNets v2), which improved the model's perception of pepper fruits whose geometric features (length, width, and aspect ratio) vary because of branch and fruit occlusion. The experimental results showed that the improved YOLOX_Pepper model achieved an mAP (mean average precision) of 93.30%, which was 3.99, 1.58, 3.19, and 2.84 percentage points higher than that of Faster R-CNN, YOLOv5, YOLOv7, and YOLOX, respectively, with an F1 score of 96% and an average single-image detection time of 0.026 s. Under strong light conditions, the mAP of the YOLOX_Pepper model for green and red pepper fruits was 69.16% and 89.67%, respectively, and the numbers of correctly detected green and red peppers were 83 and 304, respectively. Under shadow conditions, the mAP of the YOLOX_Pepper model for green and red peppers was 77.21% and 90.42%, respectively, and the numbers of correctly detected green and red peppers were 86 and 314, respectively.
Under insufficient light conditions, the mAP of the YOLOX_Pepper model for green and red peppers was 77.38% and 75.47%, respectively, and the numbers of correctly detected green and red peppers were 119 and 255, respectively. Across the light conditions tested, the YOLOX_Pepper model outperformed the YOLOv5, YOLOv7, and YOLOX models, especially in the number of detections and detection accuracy. Under fruit occlusion conditions, the mAP of YOLOX_Pepper for green and red peppers was 71.15% and 94.87%, respectively, and the numbers of correctly detected green and red peppers were 79 and 650, respectively. Under branch and foliage occlusion conditions, the mAP of YOLOX_Pepper for green and red peppers was 83.98% and 87.10%, respectively, and the numbers of correctly detected green and red peppers were 88 and 394, respectively. Under the different occlusion conditions, the improved YOLOX_Pepper model also detected pepper fruits better than YOLOv5, YOLOv7, and YOLOX. The YOLOX_Pepper model thus showed excellent detection performance in complex environments, which demonstrates the effectiveness of the improved modules and provides reliable technical support for the intelligent production of peppers.
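
The reported figures can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the study's code: it takes only the numbers quoted in this abstract, derives the baseline mAP values implied by the stated percentage-point gains, and sums the correctly detected fruits per condition. The names `implied_baseline_map` and `total_correct` are hypothetical helpers introduced here for illustration.

```python
# All numbers below are copied from the abstract; nothing is measured here.
YOLOX_PEPPER_MAP = 93.30  # overall mAP in percent

# Stated gains of YOLOX_Pepper over each baseline, in percentage points.
GAIN_OVER_BASELINE = {
    "Faster R-CNN": 3.99,
    "YOLOv5": 1.58,
    "YOLOv7": 3.19,
    "YOLOX": 2.84,
}

# Correct detections per condition as (green, red), as reported.
CORRECT_DETECTIONS = {
    "strong light": (83, 304),
    "shadow": (86, 314),
    "insufficient light": (119, 255),
    "fruit occlusion": (79, 650),
    "branch and foliage occlusion": (88, 394),
}


def implied_baseline_map(model_map, gains):
    """Hypothetical helper: baseline mAP = model mAP minus the stated gain."""
    return {name: round(model_map - gain, 2) for name, gain in gains.items()}


def total_correct(detections):
    """Hypothetical helper: green + red correct detections per condition."""
    return {cond: green + red for cond, (green, red) in detections.items()}


baselines = implied_baseline_map(YOLOX_PEPPER_MAP, GAIN_OVER_BASELINE)
totals = total_correct(CORRECT_DETECTIONS)

for name, value in baselines.items():
    print(f"{name}: implied mAP {value:.2f}%")
for cond, count in totals.items():
    print(f"{cond}: {count} correctly detected fruits")
```

The implied baseline values follow only from subtraction and are not separately reported in the abstract, so they should be read as a consistency check rather than as measured results.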