Abstract:
The chestnut industry has long been constrained by low productivity, high labor costs, and operational hazards. Information technology is expected to address these obstacles in chestnut production, and fruit object detection can offer essential technical support for intelligent chestnut harvesting. In this study, a chestnut fruit object detection model (YOLOv8-PBi) was proposed based on a lightweight, improved YOLOv8s. Images were collected in the natural environment of Hubei Province, China. Taking YOLOv8s as the base network, the approach introduced several improvements. First, a C2f-PConv module was constructed using partial convolution (PConv) to reduce floating-point operations and computation while improving the utilization of computational capacity. Second, a weighted bidirectional feature pyramid network (BiFPN) was employed to strengthen cross-scale connections and feature fusion. Third, the bounding box loss function was replaced with Wise intersection over union (WIoU), a dynamic non-monotonic focusing technique, to improve convergence speed and detection performance. Finally, transfer learning was applied to raise the accuracy and generalizability of the model. The improved YOLOv8-PBi model achieved 89.4% accuracy, 74.9% recall, and 84.2% average precision, with a model weight 46.22% smaller than that of the original base network YOLOv8s. Accuracy, recall, and average precision improved by 1.3, 1.5, and 1.8 percentage points, respectively, and the PC inference speed increased from 81.9 to 108 frames per second. Replacing the original bounding box loss function, CIoU, with WIoU raised the average detection accuracy by 0.8 percentage points, while both gradient descent and fitting speeds increased significantly.
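The computational saving from PConv can be illustrated with a back-of-the-envelope FLOP count: PConv applies the convolution kernel to only a fraction r of the input channels and leaves the rest untouched. The sketch below is illustrative only; the channel count, feature-map size, and partial ratio are assumed values, not the paper's exact configuration.

```python
# Hedged sketch: FLOP comparison between a dense k x k convolution and
# partial convolution (PConv), which convolves only a fraction r of the
# channels. Shapes below are illustrative assumptions.

def conv_flops(c_in, c_out, k, h, w):
    """Multiply-accumulate count of a dense k x k convolution."""
    return c_in * c_out * k * k * h * w

def pconv_flops(c, k, h, w, r=0.25):
    """PConv applies the k x k convolution to only c_p = r * c channels."""
    c_p = int(c * r)
    return conv_flops(c_p, c_p, k, h, w)

if __name__ == "__main__":
    c, k, h, w = 256, 3, 40, 40          # assumed layer shape
    dense = conv_flops(c, c, k, h, w)
    partial = pconv_flops(c, k, h, w, r=0.25)
    # With r = 1/4, PConv needs (1/4)^2 = 1/16 of the dense conv's FLOPs.
    print(f"dense:   {dense}")
    print(f"partial: {partial} ({partial / dense:.4f} of dense)")
```

With a partial ratio of 1/4, the convolution cost drops quadratically to 1/16 of the dense case, which is why PConv-based blocks shrink both the FLOP count and the weight file.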
Replacing the original feature fusion network with BiFPN increased the model recall by 0.9 percentage points, allowing the model to concentrate more precisely on the attributes specific to chestnut fruit objects. The C2f-PConv module also performed well, with an average accuracy 3.5, 9.0, 7.5, and 9.9 percentage points higher than that of the standard lightweight feature extraction networks MobileNetV3, MobileNetV2, GhostNet, and ShuffleNetV2, respectively. Among the compared loss functions, WIoU had the lowest loss value and the fastest iterative convergence, compared with CIoU, EIoU, and SIoU. Compared with mainstream object detection models, such as SSD, Fast-RCNN, YOLOv5s, YOLOv5m, YOLOv7-tiny, and YOLOv8s, the model size was the smallest, and the average accuracy was 21.33, 47.82, 4.4, 1.9, 6.2, and 1.8 percentage points higher than those of the other models, respectively. The model was finally deployed on edge-embedded devices using TensorRT acceleration, achieving a detection frame rate of 43 frames per second and a detection time of 23.26 ms, fully meeting the device deployment requirements. Compared with YOLOv8s, YOLOv8-PBi reduced false and missed detections under backlight, overcast, and occlusion conditions. These findings can provide a technical foundation for identifying chestnut fruits during intelligent chestnut harvesting.
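The core idea behind the WIoU comparison above can be sketched in a few lines: the plain IoU loss is scaled by a distance-based attention factor computed from the smallest enclosing box of the predicted and ground-truth boxes (the WIoU v1 form). This is an illustrative re-derivation assuming (x1, y1, x2, y2) box coordinates, not the paper's exact training implementation.

```python
import math

# Hedged sketch of the Wise-IoU (v1) bounding-box loss idea: the IoU loss
# is amplified by a distance-based factor built from the smallest enclosing
# box. Boxes are (x1, y1, x2, y2); an illustrative sketch, not the paper's code.

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

def wiou_v1_loss(pred, target):
    """WIoU v1: distance-attention factor times the plain IoU loss."""
    # Centers of both boxes
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tcx, tcy = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    # Smallest enclosing box dimensions
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    # Distance-based focusing factor (detached from gradients in training)
    r = math.exp(((pcx - tcx) ** 2 + (pcy - tcy) ** 2) / (wg ** 2 + hg ** 2))
    return r * (1.0 - iou(pred, target))
```

For a perfect prediction the factor is exp(0) = 1 and the loss is 0; as the box centers drift apart, the factor grows and penalizes poorly localized boxes more strongly, which is one intuition for the faster convergence reported above.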