Abstract:
The ripeness of Camellia oleifera fruits is closely related to their oil yield and tea oil quality. At present, manual one-time harvesting is the primary harvesting method for Camellia oleifera, but the uneven ripeness of fruits harvested in the same batch can significantly reduce their overall quality. Furthermore, manual harvesting cannot meet the needs of the large-scale Camellia oleifera industry because of its low efficiency and high cost. It is therefore necessary to implement intelligent harvesting of Camellia oleifera fruits, in which deep learning is used to detect fruit maturity and determine the best harvesting stage. The purpose of this study was to establish the ideal ripeness of Camellia oleifera fruits in the natural environment and to estimate the harvesting period for intelligent harvesting, so as to improve oil yield and fruit quality. A dataset was constructed for ripeness detection.
Photographs of Camellia oleifera fruits at different ripening stages were captured in natural environments with a smartphone, and the phenotypic characteristics of the fruits were measured in accordance with industry standards. Ripeness was categorized into three stages: immature, mature, and over-mature. Data augmentation techniques were applied to the dataset, including brightness adjustment, salt-and-pepper noise, and simulated artificial occlusion, after which the dataset was divided into training, validation, and test sets at a ratio of 7:1:2.
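A minimal sketch of such an augmentation and splitting pipeline, assuming NumPy only; the brightness factor, noise density, occlusion patch size, and random seed are illustrative values, not the parameters used in the study:

```python
import random
import numpy as np

def adjust_brightness(img, factor=1.3):
    """Scale pixel intensities; factor > 1 brightens, < 1 darkens."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def add_salt_pepper(img, density=0.02):
    """Flip a random fraction of pixels to pure black or white."""
    noisy = img.copy()
    mask = np.random.rand(*img.shape[:2])
    noisy[mask < density / 2] = 0        # pepper
    noisy[mask > 1 - density / 2] = 255  # salt
    return noisy

def add_occlusion(img, patch_frac=0.2):
    """Paste a grey rectangle at a random position to mimic leaf/branch occlusion."""
    h, w = img.shape[:2]
    ph, pw = int(h * patch_frac), int(w * patch_frac)
    y, x = random.randint(0, h - ph), random.randint(0, w - pw)
    occluded = img.copy()
    occluded[y:y + ph, x:x + pw] = 128
    return occluded

def split_dataset(paths, ratios=(0.7, 0.1, 0.2), seed=42):
    """Shuffle image paths and split them into train/val/test lists (7:1:2)."""
    rng = random.Random(seed)
    paths = list(paths)
    rng.shuffle(paths)
    n_train = int(len(paths) * ratios[0])
    n_val = int(len(paths) * ratios[1])
    return paths[:n_train], paths[n_train:n_train + n_val], paths[n_train + n_val:]
```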
An improved YOLOv7 model was constructed to cope with occlusion in the natural environment. A cross-attention module was added to the YOLOv7 feature extraction network, in which two attention-weighting operations compute the vertical and horizontal information for each pixel of a Camellia oleifera image. In this way, the key features for determining fruit ripeness are emphasized, effectively suppressing interference from complex backgrounds such as branches and leaves.
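A PyTorch sketch of one way such a cross-attention block can be realized, pooling features along the vertical and horizontal directions and applying two attention weightings per pixel; the module name, channel-reduction factor, and layer choices are assumptions for illustration, not the authors' exact implementation:

```python
import torch
import torch.nn as nn

class HVAttention(nn.Module):
    """Illustrative cross-attention block: each pixel is re-weighted with
    context pooled along its column (vertical) and its row (horizontal)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # keep H, squeeze W -> vertical context
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # keep W, squeeze H -> horizontal context
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.SiLU(),
        )
        self.attn_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.attn_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Directional pooling: (B, C, H, 1) and (B, C, 1, W)
        fh = self.pool_h(x)
        fw = self.pool_w(x).permute(0, 1, 3, 2)      # (B, C, W, 1) so both can be concatenated
        y = self.reduce(torch.cat([fh, fw], dim=2))  # shared 1x1 conv over both directions
        yh, yw = torch.split(y, [h, w], dim=2)
        yw = yw.permute(0, 1, 3, 2)                  # back to (B, mid, 1, W)
        # Two attention-weighting steps, one per direction
        ah = torch.sigmoid(self.attn_h(yh))          # (B, C, H, 1)
        aw = torch.sigmoid(self.attn_w(yw))          # (B, C, 1, W)
        return x * ah * aw
```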
Additionally, the traditional non-maximum suppression (NMS) was replaced with a distance- and intersection-based NMS, in which the normalized distance between the center points of two candidate boxes is incorporated into the overlap criterion. This specifically addresses detections missed because of the mutual occlusion of Camellia oleifera fruits and allows overlapping fruits to be detected.
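A NumPy sketch of distance- and intersection-based (DIoU-style) NMS, in which the squared, normalized distance between box centers is subtracted from the IoU before suppression; the function name and threshold value are illustrative:

```python
import numpy as np

def diou_nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS with a center-distance penalty, so overlapping but clearly
    separate fruits are not suppressed.
    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # Plain IoU between the current box and the remaining candidates
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        # Squared center distance, normalized by the diagonal of the smallest
        # box enclosing both candidates (the DIoU penalty term)
        cxi, cyi = (boxes[i, 0] + boxes[i, 2]) / 2, (boxes[i, 1] + boxes[i, 3]) / 2
        cxr, cyr = (boxes[rest, 0] + boxes[rest, 2]) / 2, (boxes[rest, 1] + boxes[rest, 3]) / 2
        ex1 = np.minimum(boxes[i, 0], boxes[rest, 0])
        ey1 = np.minimum(boxes[i, 1], boxes[rest, 1])
        ex2 = np.maximum(boxes[i, 2], boxes[rest, 2])
        ey2 = np.maximum(boxes[i, 3], boxes[rest, 3])
        diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9
        dist2 = (cxi - cxr) ** 2 + (cyi - cyr) ** 2
        diou = iou - dist2 / diag2
        order = rest[diou <= iou_thresh]  # suppress only boxes whose DIoU exceeds the threshold
    return keep
```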
The improved YOLOv7 model was trained on 3098 images from the training set, evaluated on 442 images from the validation set, and finally tested on 885 images from the test set. On the test set it achieved a precision of 93.52%, a recall of 90.25%, an F1 score of 91.86%, an average precision of 94.60%, an average detection time of 0.77 s, and a model size of 82.6 MB.
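For reference, the conventional definitions behind these metrics (standard formulas, not restated from the paper), where TP, FP, and FN denote true positives, false positives, and false negatives:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F1 = \frac{2PR}{P + R}, \qquad AP = \int_0^1 P(R)\,\mathrm{d}R$$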
Ablation experiments demonstrated that the improved model effectively detected the ripeness of Camellia oleifera fruit. Compared with the original YOLOv7 model, the two improvements individually increased the mean average precision by 1.10 and 1.81 percentage points, and together by 2.91 percentage points, while the detection time and model size increased by only 0.015 s and 11.3 MB, respectively. Compared with the Faster R-CNN, EfficientDet, YOLOv3, and YOLOv5l models, the improved YOLOv7 model increased the average precision by 7.51, 5.89, 4.21, and 4.21 percentage points and reduced the detection time by 1.06, 1.12, 0.10, and 0.03 s, respectively, discriminating the maturity grade of Camellia oleifera fruits more accurately than these previous models. In summary, the improved YOLOv7 model achieved higher accuracy with only a slight sacrifice in detection time and model size. These findings can provide a theoretical basis for estimating the optimal harvesting period of Camellia oleifera fruits and for intelligent picking under natural conditions.