Detecting plum fruits in orchards using a lightweight improved YOLOv8s
-
Graphical Abstract
-
Abstract
Plum fruits often require timely harvesting during specific seasons because of their significant nutritional and culinary value. However, accurately detecting plums in real-world orchard environments remains challenging owing to factors such as shading from foliage and overlapping fruits, which also place higher demands on intelligent harvesting in terms of speed, accuracy, and real-time performance. In this study, an efficient and reliable fruit detection model based on an enhanced, lightweight version of YOLOv8s was proposed to meet the specific needs of plum detection in complex orchard environments. Firstly, a backbone network (named Faster-EMA) was developed to reduce the overall complexity of the model while maintaining high detection accuracy. Its architecture was optimized to extract critical features from input images more effectively, even when plums were occluded by branches or surrounded by other fruits, yielding a network structure that detects accurately with few computational resources. Secondly, Focal Modulation was introduced to replace the Spatial Pyramid Pooling-Fast (SPPF) module of the original YOLOv8s. Multi-scale features were integrated to detect plums of different sizes under varying environmental conditions, and richer semantic information was captured during feature extraction, so that the model focused on the key attributes of the plums (such as shape and color) rather than on less relevant background elements. This enhanced feature fusion significantly improved the overall performance of the model. Thirdly, a parameter-sharing strategy (LDetect) was introduced to implement a lightweight detection head. Parameters were shared across the branches of the detection head, maintaining high detection performance while significantly reducing the number of parameters and the computational complexity. The lightweight head was designed specifically to meet the deployment requirements of low-power embedded devices, such as edge computing platforms, making the model particularly advantageous for real-time applications that require rapid detection and decision making. Experiments validated the effectiveness of the improved model. The mean Average Precision (mAP) reached 97.2%, 7.4 percentage points higher than the baseline YOLOv8s model. The computational load was also reduced, with a 44.8% decrease in Floating Point Operations (FLOPs) and a 25.8% reduction in the number of model parameters, lowering both processing time and memory usage. When deployed on a Jetson Nano 4GB, a low-power edge computing device, the improved model reached a detection frame rate of 48.3 fps, enabling real-time detection even in resource-constrained environments. In conclusion, by incorporating Faster-EMA, Focal Modulation, and the lightweight detection head (LDetect), the proposed model achieved both high accuracy and computational efficiency and can be expected to detect plums effectively under complex orchard backgrounds, environmental variations, and occlusions. Its successful deployment on edge computing devices represents a significant step toward intelligent plum harvesting.
The findings provide a viable solution for automated fruit picking in dynamic orchard environments and offer valuable insights for real-time detection in agricultural robotics and precision farming.
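To make the Faster-EMA backbone concrete, the following minimal PyTorch sketch combines the two components the name suggests: the partial convolution (PConv) of FasterNet and the Efficient Multi-scale Attention (EMA) mechanism. The abstract does not give the exact block layout, so the composition in `FasterEMABlock`, the channel split ratio `n_div`, and the grouping `factor` are illustrative assumptions, not the paper's definitive design.

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """FasterNet-style partial convolution: only the first 1/n_div of the
    channels are convolved; the rest pass through untouched, which cuts
    FLOPs and memory accesses."""
    def __init__(self, dim, n_div=4):
        super().__init__()
        self.dim_conv = dim // n_div
        self.dim_rest = dim - self.dim_conv
        self.conv = nn.Conv2d(self.dim_conv, self.dim_conv, 3, 1, 1, bias=False)

    def forward(self, x):
        x1, x2 = torch.split(x, [self.dim_conv, self.dim_rest], dim=1)
        return torch.cat([self.conv(x1), x2], dim=1)

class EMA(nn.Module):
    """Efficient Multi-scale Attention: channels are split into groups,
    then a 1x1 branch (with directional H/W pooling) and a 3x3 branch
    exchange spatial attention weights. `channels` must be divisible by
    `factor`."""
    def __init__(self, channels, factor=8):
        super().__init__()
        self.groups = factor
        c = channels // factor
        self.softmax = nn.Softmax(-1)
        self.agp = nn.AdaptiveAvgPool2d(1)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))
        self.gn = nn.GroupNorm(c, c)
        self.conv1 = nn.Conv2d(c, c, 1)
        self.conv3 = nn.Conv2d(c, c, 3, padding=1)

    def forward(self, x):
        b, ch, h, w = x.size()
        g = x.reshape(b * self.groups, -1, h, w)
        x_h = self.pool_h(g)                           # (B*g, c, H, 1)
        x_w = self.pool_w(g).permute(0, 1, 3, 2)       # (B*g, c, W, 1)
        y = self.conv1(torch.cat([x_h, x_w], dim=2))
        x_h, x_w = torch.split(y, [h, w], dim=2)
        x1 = self.gn(g * x_h.sigmoid() * x_w.permute(0, 1, 3, 2).sigmoid())
        x2 = self.conv3(g)
        # cross-branch spatial attention: each branch's channel descriptor
        # weights the other branch's spatial map
        a1 = self.softmax(self.agp(x1).flatten(1).unsqueeze(1))  # (B*g, 1, c)
        a2 = self.softmax(self.agp(x2).flatten(1).unsqueeze(1))
        w1 = torch.bmm(a1, x2.flatten(2))              # (B*g, 1, H*W)
        w2 = torch.bmm(a2, x1.flatten(2))
        weights = (w1 + w2).reshape(b * self.groups, 1, h, w).sigmoid()
        return (g * weights).reshape(b, ch, h, w)

class FasterEMABlock(nn.Module):
    """Assumed composition: PConv token mixer -> pointwise MLP -> EMA."""
    def __init__(self, dim):
        super().__init__()
        self.pconv = PConv(dim)
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, 2 * dim, 1, bias=False),
            nn.BatchNorm2d(2 * dim), nn.SiLU(),
            nn.Conv2d(2 * dim, dim, 1, bias=False))
        self.ema = EMA(dim)

    def forward(self, x):
        return self.ema(x + self.mlp(self.pconv(x)))
```

A quick shape check: `FasterEMABlock(64)(torch.randn(1, 64, 80, 80))` returns a tensor of the same shape, so a block of this form can stand in for a standard backbone stage without touching neighbouring layers.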
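The Focal Modulation module substituting for SPPF can be sketched in the same spirit. The code below follows the published FocalNets formulation (a query projection, hierarchical depthwise-convolution context aggregation gated per level, and elementwise modulation of the query), adapted here to NCHW feature maps so it slots into the position SPPF occupies; `focal_level` and `focal_window` are assumed hyperparameters.

```python
import torch
import torch.nn as nn

class FocalModulation(nn.Module):
    """Focal modulation adapted to NCHW feature maps, used as a
    drop-in for YOLOv8's SPPF slot in this sketch."""
    def __init__(self, dim, focal_level=3, focal_window=3):
        super().__init__()
        self.focal_level = focal_level
        # joint 1x1 projection to query, context, and per-level gates
        self.f = nn.Conv2d(dim, 2 * dim + (focal_level + 1), 1)
        self.h = nn.Conv2d(dim, dim, 1)      # modulator projection
        self.proj = nn.Conv2d(dim, dim, 1)   # output projection
        self.act = nn.GELU()
        self.focal_layers = nn.ModuleList()
        for k in range(focal_level):
            ksize = focal_window + 2 * k     # growing receptive field per level
            self.focal_layers.append(nn.Sequential(
                nn.Conv2d(dim, dim, ksize, padding=ksize // 2,
                          groups=dim, bias=False),
                nn.GELU()))

    def forward(self, x):
        c = x.shape[1]
        q, ctx, gates = torch.split(self.f(x), [c, c, self.focal_level + 1], dim=1)
        ctx_all = 0
        for level, layer in enumerate(self.focal_layers):
            ctx = layer(ctx)  # hierarchical depthwise context aggregation
            ctx_all = ctx_all + ctx * gates[:, level:level + 1]
        # global average context acts as the final focal level
        ctx_global = self.act(ctx.mean(dim=(2, 3), keepdim=True))
        ctx_all = ctx_all + ctx_global * gates[:, self.focal_level:]
        return self.proj(q * self.h(ctx_all))  # modulate query, then project
```

Unlike SPPF's fixed max-pooling pyramid, the learned gates weight each context scale per location, which is consistent with the abstract's claim that the module focuses on fruit-relevant attributes such as shape and color rather than background.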
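Finally, the parameter-sharing idea behind LDetect can be illustrated as follows: per-level 1x1 projections align channel widths, after which a single shared stem and shared box/class predictors serve all pyramid levels, so head parameters no longer scale with the number of levels. Everything here (the common width `hidden`, the GroupNorm choice, the DFL-style box branch, `nc=1` for a single plum class) is an assumption of the sketch, not the paper's exact head.

```python
import torch
import torch.nn as nn

class LDetect(nn.Module):
    """Hypothetical parameter-shared detection head: one shared stem and
    shared box/class predictors are applied to every pyramid level
    instead of maintaining separate per-level branches."""
    def __init__(self, nc=1, ch=(128, 256, 512), reg_max=16, hidden=128):
        super().__init__()
        self.nc, self.reg_max = nc, reg_max
        # per-level 1x1 convs align each level to a common channel width
        self.align = nn.ModuleList(nn.Conv2d(c, hidden, 1, bias=False) for c in ch)
        # the stem and both predictors are shared across all levels
        self.stem = nn.Sequential(
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.GroupNorm(16, hidden), nn.SiLU(),
            nn.Conv2d(hidden, hidden, 1, bias=False),
            nn.GroupNorm(16, hidden), nn.SiLU())
        self.box = nn.Conv2d(hidden, 4 * reg_max, 1)  # DFL-style box distribution
        self.cls = nn.Conv2d(hidden, nc, 1)

    def forward(self, feats):
        # feats: list of (B, C_i, H_i, W_i) maps, e.g. P3/P4/P5 from the neck
        outs = []
        for f, align in zip(feats, self.align):
            x = self.stem(align(f))
            outs.append(torch.cat([self.box(x), self.cls(x)], dim=1))
        return outs
```

GroupNorm rather than BatchNorm is used in this sketch because a head shared across pyramid levels sees feature statistics that differ per level, and GroupNorm is independent of both batch and level statistics; that design choice is likewise an assumption, not a detail stated in the abstract.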