Improved YOLOv8n object detection of fragrant pears
Graphical Abstract
Abstract
Fragrant pears are among the most popular cultivars in the pear industry. However, manual picking is too labor-intensive to fully meet the demands of large-scale production. Mechanized picking can substantially reduce labor demands while sustaining high productivity, and visual recognition is one of the key components of the various picking robots. The complex orchard environment, with variable lighting, changing weather, foliage occlusion, and overlapping fruit, poses significant challenges to both conventional image processing and machine learning-based object detection, impeding the accurate identification of fragrant pears. This study aims to improve detection accuracy and speed in such unstructured settings. An optimized YOLOv8n-based object detection model was introduced specifically for fragrant pears. A dataset of 4 500 fragrant pear images was constructed, with image collection and data augmentation covering a variety of conditions, such as different lighting and foliage occlusion within orchard environments. Min-Max normalization was employed to score the candidate models in line with the requirements of fragrant pear harvesting. A comprehensive evaluation was conducted on five lightweight YOLO versions: YOLOv3-tiny, YOLOv5n, YOLOv6n, YOLOv7-tiny, and YOLOv8n, with different weights assigned to mAP, precision, recall, inference time, parameter count, and model size to guide model selection. With YOLOv8n as the baseline, the network structure was streamlined for fragrant pear detection, focusing on four key areas. First, the network backbone was redesigned to accelerate inference: the impact of the C2f structure on speed was assessed, and redundant C2f components were replaced. Second, an efficient PConv module was integrated into the detection head of the network to better recognize occluded fruit. Third, a weight-sharing strategy was applied to the detection head to reduce the number of parameters. Fourth, the combination of SimSPPF and Inner-IoU further improved inference speed and bounding box regression performance. Comparative trials were then conducted on fragrant pear detection. The results revealed that YOLOv8n achieved the best overall performance, with a Min-Max-normalized weighted score of 83.4%, higher than those of YOLOv3-tiny, YOLOv5n, YOLOv6n, and YOLOv7-tiny; it was therefore selected as the baseline network for subsequent research. Optimization experiments on the C2f structures showed that replacing the C2f modules in the 15th and 2nd layers of the network with Conv and Conv_Res modules enhanced computational efficiency, and this scheme was adopted for the backbone network. Ablation experiments were conducted to verify the efficacy of each improved module. The refined YOLOv8n achieved superior accuracy and detection speed on the fragrant pear dataset, with increases of 0.4 and 0.5 percentage points in F0.5 score and mean average precision, respectively, compared with the original YOLOv8n model. Detection speeds on GPU and CPU devices increased by 34.0% and 24.4%, reaching 99.4 and 15.3 frames per second, respectively.
This high-precision, rapid detection can provide valuable technical support for the real-time detection of fragrant pears in natural orchard environments. The improved model is also expected to be deployed on fragrant pear picking robots.
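As a minimal sketch of the model-selection procedure described above, the snippet below shows how per-model metrics can be Min-Max normalized and combined into a weighted score; the metric values, weights, and variable names are hypothetical placeholders rather than the figures reported in this study.

import numpy as np

# Candidate lightweight YOLO models compared in the study
models = ["YOLOv3-tiny", "YOLOv5n", "YOLOv6n", "YOLOv7-tiny", "YOLOv8n"]

# Columns: mAP, precision, recall, inference time (ms), parameters (M), model size (MB)
# All numbers below are illustrative placeholders, not measured results.
metrics = np.array([
    [0.85, 0.84, 0.80, 9.0, 8.7, 17.4],
    [0.88, 0.86, 0.83, 8.0, 1.9, 3.9],
    [0.89, 0.87, 0.84, 8.5, 4.2, 8.7],
    [0.90, 0.88, 0.85, 10.0, 6.0, 12.3],
    [0.91, 0.89, 0.86, 7.5, 3.0, 6.2],
])
higher_is_better = np.array([True, True, True, False, False, False])
weights = np.array([0.3, 0.15, 0.15, 0.2, 0.1, 0.1])  # hypothetical weights summing to 1

# Min-Max normalize each metric column to [0, 1], then flip lower-is-better columns
norm = (metrics - metrics.min(axis=0)) / (metrics.max(axis=0) - metrics.min(axis=0))
norm[:, ~higher_is_better] = 1.0 - norm[:, ~higher_is_better]

# Weighted score per model; the highest score indicates the preferred baseline
scores = norm @ weights
for name, score in zip(models, scores):
    print(f"{name}: weighted score = {score:.3f}")
print("Selected baseline:", models[int(np.argmax(scores))])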
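The PConv module mentioned in the abstract follows the partial-convolution idea of applying a regular convolution to only part of the channels while passing the rest through unchanged. The PyTorch sketch below illustrates that idea under assumed settings (a 3x3 kernel and a 1/4 channel split); it is not the exact detection-head design used in the paper.

import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution: convolve a fraction of the channels and keep the rest
    as identity, reducing FLOPs and memory access compared with a full convolution."""
    def __init__(self, channels: int, partial_ratio: float = 0.25):
        super().__init__()
        self.conv_channels = max(1, int(channels * partial_ratio))
        self.conv = nn.Conv2d(self.conv_channels, self.conv_channels,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split the feature map along channels, convolve the first slice, concatenate back
        x1, x2 = torch.split(x, [self.conv_channels, x.size(1) - self.conv_channels], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)

# Example: a 64-channel feature map keeps its spatial and channel shape after PConv
feature_map = torch.randn(1, 64, 80, 80)
print(PConv(64)(feature_map).shape)  # torch.Size([1, 64, 80, 80])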