Detecting pepper clusters using an improved YOLOv5s
-
Abstract
Efficient detection of pepper clusters is crucial for automated harvesting in the pepper industry. However, existing pepper cluster detection methods cannot fully meet the demands of large-scale production because of their low efficiency, weak generalization, and poor real-time performance. In this study, an improved YOLOv5s model was proposed to detect pepper clusters accurately and in real time. A total of 400 images were collected from the Xuande Pepper Industrial Park, Qinglong County, Qianxinan Buyi and Miao Autonomous Prefecture, Guizhou Province, China. Another 148 images were downloaded from the Internet and screened to enhance data diversity. The resulting 548 images were labeled using the LabelImg tool. Data augmentation was then applied to each image together with its labeled bounding boxes: two new images were generated from each original image using random rotation and brightness adjustment, giving a dataset of 1644 images in total. Firstly, the EfficientViT network, which combines the lightweight MBConv module (the mobile inverted bottleneck convolution block used in MobileNetV3) with a lightweight ReLU-based self-attention mechanism, was used to replace the CSPDarkNet53 backbone of YOLOv5s. This enlarged the global receptive field of the model and improved its extraction of important features, making it better suited to targets with different scales, shapes, and backgrounds. Secondly, optimal transport assignment (OTA) was selected as the label assignment strategy. OTA seeks the optimal label assignment plan from a global perspective, which improved the robustness and accuracy of the model under complex scenes and target distributions. In addition, the WIoU loss function was used instead of CIoU as the bounding box regression loss of the detector, so that high-quality anchor boxes generate beneficial gradients and the overall performance of the detector improves. To verify the effectiveness of the improved model, object detection experiments were conducted on the self-built dataset. The experimental results show that the improved model achieved a mean average precision (mAP) of 97.3%, with 5.9 M parameters and a detection speed of 131.6 frames per second (FPS). Compared with the original YOLOv5s, the mAP and detection speed increased by 1.9 percentage points and 14.5%, respectively, whereas the number of parameters decreased by 15.7%. In addition, the mAP of the improved YOLOv5s model was 8, 16.9, 8.6, and 1.5 percentage points higher than that of EfficientDet-D1, SSD512, RetinaNet-R50, and YOLOX-s, respectively. The improved YOLOv5s achieved better overall performance in terms of detection accuracy, detection speed, and number of model parameters. Therefore, the improved YOLOv5s model can be expected to detect pepper clusters accurately and rapidly.
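The figures reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only. The baseline and comparison-model values it prints are derived from the stated differences and are not reported in the abstract. The variable names are hypothetical, and the 14.5% and 15.7% changes are assumed to be relative to the original YOLOv5s.

```python
# Figures taken from the abstract.
collected, downloaded = 400, 148
augmented_per_image = 2  # one rotated and one brightness-adjusted copy (as stated)

labeled = collected + downloaded
dataset_size = labeled * (1 + augmented_per_image)

improved = {"map_pct": 97.3, "params_m": 5.9, "fps": 131.6}

# Implied values for the original YOLOv5s (derived, not reported).
baseline_map = round(improved["map_pct"] - 1.9, 1)            # +1.9 percentage points
baseline_fps = improved["fps"] / (1 + 0.145)                   # +14.5 %, assumed relative
baseline_params = improved["params_m"] / (1 - 0.157)           # -15.7 %, assumed relative

# Implied mAP of the comparison detectors (derived from the stated gaps).
gaps = {"EfficientDet-D1": 8.0, "SSD512": 16.9, "RetinaNet-R50": 8.6, "YOLOX-s": 1.5}
implied_map = {name: round(improved["map_pct"] - gap, 1) for name, gap in gaps.items()}

print(f"labeled images: {labeled}, dataset size: {dataset_size}")
print(f"implied baseline: mAP {baseline_map:.1f} %, "
      f"{baseline_fps:.1f} FPS, {baseline_params:.2f} M parameters")
for name, value in implied_map.items():
    print(f"implied mAP of {name}: {value:.1f} %")
```

Under these assumptions the original YOLOv5s would sit at about 95.4% mAP, roughly 115 FPS, and roughly 7.0 M parameters, and the dataset size of 548 × 3 = 1644 images matches the stated total.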