Abstract:
Visually guided grasping is now widely used for picking camellia oleifera fruits. However, picking robots cannot directly reach fruits that are severely occluded by stems, and attempting to do so can damage the fruit tree or the clamping device. In this study, a classification and recognition method for camellia oleifera fruits was proposed using transfer learning and YOLOv8n, in order to reduce the manual annotation workload while maintaining the detection accuracy of the model. The fruits were divided into two categories: unoccluded and occluded. Firstly, a transfer learning scheme was designed with the COCO128 dataset as the source domain, a publicly available apple dataset as the auxiliary domain, and a self-made camellia oleifera fruit dataset as the target domain, and the YOLOv8n model was trained to recognize the camellia oleifera fruits. Secondly, a total of 52 ablation experiments were performed across two training modes (transfer learning and training from random weight initialization), three training set sizes (200, 400, and 732 images), three learning rates (0.001, 0.005, and 0.01), and three training durations (50, 100, and 200 epochs), in order to systematically explore the impact of these four factors on the detection performance of the YOLOv8n model. Finally, YOLOv8n was compared with other lightweight models, namely YOLOv3 tiny, YOLOv5n, and YOLOv7 tiny. The experimental results show that training from randomly initialized weights was very sensitive to the amount of training data and to the learning rate. The larger the learning rate, the faster the model converged, with an optimal value of 0.01. The larger the dataset, the higher the mean average precision of the model, although with diminishing marginal returns. Transfer learning significantly accelerated the convergence of the model.
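The four experimental factors listed above can be enumerated as a simple grid. The following is a minimal sketch rather than the authors' experiment script; the dictionary keys and the function name `ablation_grid` are hypothetical. A full cross of the stated levels yields 54 combinations, whereas the abstract reports 52 experiments, so the study presumably did not run every combination; which ones were omitted is not stated here.

```python
from itertools import product

# Factor levels as stated in the abstract.
TRAINING_MODES = ("transfer", "random_init")
TRAIN_SET_SIZES = (200, 400, 732)        # number of training images
LEARNING_RATES = (0.001, 0.005, 0.01)
TRAINING_ROUNDS = (50, 100, 200)         # training rounds (epochs)


def ablation_grid():
    """Return every combination of the four factors (full factorial)."""
    return [
        {"mode": mode, "n_train": n_train, "lr": lr, "rounds": rounds}
        for mode, n_train, lr, rounds in product(
            TRAINING_MODES, TRAIN_SET_SIZES, LEARNING_RATES, TRAINING_ROUNDS
        )
    ]


grid = ablation_grid()
print(len(grid))  # full factorial size: 2 * 3 * 3 * 3 = 54
```

Each entry of `grid` could then be passed to a training routine, with the resulting detection metric logged per configuration so that the effect of each factor can be compared.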
With transfer learning, peak detection accuracy was reached within the first 50 epochs of training, and the mean average precision (mAP) was 2.0-24.0 percentage points higher than with random weight initialization. Transfer learning reached an mAP equivalent to that of random weight initialization with only half of the training data, which greatly reduced the manual annotation time. Under transfer learning, the mAP of the YOLOv8n model reached a maximum of 92.7%, which was 1.4 percentage points higher than with random weight initialization. Compared with the lightweight YOLO series models YOLOv3 tiny, YOLOv5n, and YOLOv7 tiny, the mAP of the YOLOv8n model was 24.0, 1.7, and 0.4 percentage points higher, respectively, indicating the best detection performance. The experiments verified the performance of the YOLOv8n model and the effectiveness of transfer learning. The findings can provide a reference for optimizing the training parameters of the YOLOv8n model for the classification and recognition of camellia oleifera fruits.
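The source, auxiliary, and target domain chain described in the abstract could be sketched roughly as follows, assuming the Ultralytics `YOLO` Python API. This is an illustrative sketch, not the authors' training code. The dataset files `apple.yaml` and `camellia.yaml`, the stage names, and the `runs/transfer` output folder are hypothetical placeholders, and the abstract does not state how the source stage was actually initialized. The default of 50 epochs and a learning rate of 0.01 simply echo values mentioned in the abstract.

```python
from pathlib import Path

# (stage name, dataset config) in training order.
# "coco128.yaml" ships with Ultralytics; the other two files are hypothetical.
STAGES = [
    ("source_coco128", "coco128.yaml"),
    ("auxiliary_apple", "apple.yaml"),
    ("target_camellia", "camellia.yaml"),
]


def best_weights(project, stage):
    # Assumes the usual Ultralytics layout: <project>/<name>/weights/best.pt
    return Path(project) / stage / "weights" / "best.pt"


def plan_chain(project="runs/transfer", init="yolov8n.yaml"):
    """List the stages; each one starts from the previous stage's best weights."""
    plan, start = [], init
    for name, data in STAGES:
        plan.append({"stage": name, "data": data, "init": start})
        start = str(best_weights(project, name))
    return plan


def run_chain(project="runs/transfer", epochs=50, lr0=0.01):
    """Train the stages in order (requires the ultralytics package and the datasets)."""
    from ultralytics import YOLO  # imported lazily so the plan can be inspected without it

    for step in plan_chain(project):
        model = YOLO(step["init"])  # a .yaml builds an untrained model, a .pt loads weights
        model.train(data=step["data"], epochs=epochs, lr0=lr0,
                    project=project, name=step["stage"], exist_ok=True)


for step in plan_chain():
    print(step["stage"], "starts from", step["init"])
```

Calling `run_chain()` would execute the three stages in sequence. The random weight initialization baseline from the abstract would correspond to training the target stage alone, starting directly from `yolov8n.yaml`.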