Abstract:
The ripeness of Camellia oleifera fruits is closely related to their oil yield and tea oil quality. At present, manual one-time harvesting is the primary harvesting method for Camellia oleifera, but the uneven ripeness of fruits harvested in the same batch can significantly reduce their overall quality. Furthermore, manual harvesting cannot meet the needs of the large-scale Camellia oleifera industry because of its low efficiency and high cost. It is therefore necessary to implement intelligent harvesting of Camellia oleifera fruits, in which deep learning is used to detect fruit maturity and determine the best harvesting stage. The purpose of this study was to establish the ideal ripeness of Camellia oleifera fruits in the natural environment and to estimate the harvesting period for intelligent harvesting, so as to improve oil yield and fruit quality. A dataset was constructed for ripeness detection.
Photographs of Camellia oleifera fruits at different ripening stages were captured in natural environments with a smartphone, and the phenotypic characteristics of the fruits were measured in accordance with industry standards. Ripeness was categorized into three stages: immature, mature, and over-mature. Data augmentation techniques were applied to the dataset, including brightness adjustment, salt-and-pepper noise, and simulated artificial occlusion, after which the dataset was divided into training, validation, and test sets at a ratio of 7:1:2.
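A minimal sketch of such an augmentation and splitting pipeline, assuming NumPy only; the brightness factor, noise density, occlusion patch size, and random seed are illustrative values, not the parameters used in the study:

```python
import random
import numpy as np

def adjust_brightness(img, factor=1.3):
    """Scale pixel intensities; factor > 1 brightens, < 1 darkens."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def add_salt_pepper(img, density=0.02):
    """Flip a random fraction of pixels to pure black or white."""
    noisy = img.copy()
    mask = np.random.rand(*img.shape[:2])
    noisy[mask < density / 2] = 0        # pepper
    noisy[mask > 1 - density / 2] = 255  # salt
    return noisy

def add_occlusion(img, patch_frac=0.2):
    """Paste a grey rectangle at a random position to mimic leaf/branch occlusion."""
    h, w = img.shape[:2]
    ph, pw = int(h * patch_frac), int(w * patch_frac)
    y, x = random.randint(0, h - ph), random.randint(0, w - pw)
    occluded = img.copy()
    occluded[y:y + ph, x:x + pw] = 128
    return occluded

def split_dataset(paths, ratios=(0.7, 0.1, 0.2), seed=42):
    """Shuffle image paths and split them into train/val/test lists (7:1:2)."""
    rng = random.Random(seed)
    paths = list(paths)
    rng.shuffle(paths)
    n_train = int(len(paths) * ratios[0])
    n_val = int(len(paths) * ratios[1])
    return paths[:n_train], paths[n_train:n_train + n_val], paths[n_train + n_val:]
```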
An improved YOLOv7 model was constructed to cope with occlusion in the natural environment. A cross-attention module was added to the YOLOv7 feature extraction network, in which two attention-weighting operations compute the vertical and horizontal information for each pixel of a Camellia oleifera image. In this way, the key features for determining fruit ripeness are emphasized, effectively suppressing interference from complex backgrounds such as branches and leaves.
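A PyTorch sketch of one way such a cross-attention block can be realized, pooling features along the vertical and horizontal directions and applying two attention weightings per pixel; the module name, channel-reduction factor, and layer choices are assumptions for illustration, not the authors' exact implementation:

```python
import torch
import torch.nn as nn

class HVAttention(nn.Module):
    """Illustrative cross-attention block: each pixel is re-weighted with
    context pooled along its column (vertical) and its row (horizontal)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # keep H, squeeze W -> vertical context
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # keep W, squeeze H -> horizontal context
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.SiLU(),
        )
        self.attn_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.attn_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Directional pooling: (B, C, H, 1) and (B, C, 1, W)
        fh = self.pool_h(x)
        fw = self.pool_w(x).permute(0, 1, 3, 2)      # (B, C, W, 1) so both can be concatenated
        y = self.reduce(torch.cat([fh, fw], dim=2))  # shared 1x1 conv over both directions
        yh, yw = torch.split(y, [h, w], dim=2)
        yw = yw.permute(0, 1, 3, 2)                  # back to (B, mid, 1, W)
        # Two attention-weighting steps, one per direction
        ah = torch.sigmoid(self.attn_h(yh))          # (B, C, H, 1)
        aw = torch.sigmoid(self.attn_w(yw))          # (B, C, 1, W)
        return x * ah * aw
```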
Additionally, the traditional non-maximum suppression (NMS) was replaced with a distance- and intersection-based NMS, in which the normalized distance between the center points of two candidate boxes is incorporated into the overlap criterion. This specifically addresses detections missed because of the mutual occlusion of Camellia oleifera fruits and allows overlapping fruits to be detected.
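A NumPy sketch of distance- and intersection-based (DIoU-style) NMS, in which the squared, normalized distance between box centers is subtracted from the IoU before suppression; the function name and threshold value are illustrative:

```python
import numpy as np

def diou_nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS with a center-distance penalty, so overlapping but clearly
    separate fruits are not suppressed.
    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # Plain IoU between the current box and the remaining candidates
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        # Squared center distance, normalized by the diagonal of the smallest
        # box enclosing both candidates (the DIoU penalty term)
        cxi, cyi = (boxes[i, 0] + boxes[i, 2]) / 2, (boxes[i, 1] + boxes[i, 3]) / 2
        cxr, cyr = (boxes[rest, 0] + boxes[rest, 2]) / 2, (boxes[rest, 1] + boxes[rest, 3]) / 2
        ex1 = np.minimum(boxes[i, 0], boxes[rest, 0])
        ey1 = np.minimum(boxes[i, 1], boxes[rest, 1])
        ex2 = np.maximum(boxes[i, 2], boxes[rest, 2])
        ey2 = np.maximum(boxes[i, 3], boxes[rest, 3])
        diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9
        dist2 = (cxi - cxr) ** 2 + (cyi - cyr) ** 2
        diou = iou - dist2 / diag2
        order = rest[diou <= iou_thresh]  # suppress only boxes whose DIoU exceeds the threshold
    return keep
```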
The improved YOLOv7 model was trained on 3098 images from the training set, evaluated on 442 images from the validation set, and finally tested on 885 images from the test set. On the test set it achieved a precision of 93.52%, a recall of 90.25%, an F1 score of 91.86%, an average precision of 94.60%, an average detection time of 0.77 s, and a model size of 82.6 MB.
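For reference, the conventional definitions behind these metrics (standard formulas, not restated from the paper), where TP, FP, and FN denote true positives, false positives, and false negatives:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F1 = \frac{2PR}{P + R}, \qquad AP = \int_0^1 P(R)\,\mathrm{d}R$$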
Ablation experiments demonstrated that the improved model effectively detected the ripeness of Camellia oleifera fruit. Compared with the original YOLOv7 model, the two improvements individually increased the mean average precision by 1.10 and 1.81 percentage points, and together by 2.91 percentage points, while the detection time and model size increased by only 0.015 s and 11.3 MB, respectively. Compared with the Faster R-CNN, EfficientDet, YOLOv3, and YOLOv5l models, the improved YOLOv7 model increased the average precision by 7.51, 5.89, 4.21, and 4.21 percentage points and reduced the detection time by 1.06, 1.12, 0.10, and 0.03 s, respectively, discriminating the maturity grade of Camellia oleifera fruits more accurately than these previous models. In summary, the improved YOLOv7 model achieved higher accuracy with only a slight sacrifice in detection time and model size. These findings can provide a theoretical basis for estimating the optimal harvesting period of Camellia oleifera fruits and for intelligent picking under natural conditions.