Recognizing pear blossoms in the natural environment using improved YOLOv7

    • Abstract: Pear blossoms in natural environments are easily occluded, backgrounds are cluttered, and lighting conditions and target distances vary continuously, which makes pear blossoms difficult to recognize with high accuracy. To address this problem, this study proposes a pear blossom recognition algorithm based on an improved YOLOv7 model. First, a P2 small-object layer is added to strengthen feature extraction and multi-scale feature fusion, so that occluded pear blossom targets are captured more reliably. Second, a CBAM (convolutional block attention module) attention mechanism is appended to the end of the detection head to improve the contextual understanding of the model and the performance of YOLOv7 in various scenarios (different lighting conditions, complex backgrounds, etc.). Finally, the CIoU (complete intersection over union) loss function is replaced with the NWD (normalized Wasserstein distance) loss function, which performs accurate bounding box regression for targets of different shapes and improves detection accuracy for pear blossom targets against complex backgrounds and at long distances. Experimental results show that, compared with the original model, the improved model increases precision, recall, mAP, and F1-score by 2.1, 1.2, 1.9, and 0.6 percentage points, reaching 99.4%, 99.6%, 96.4%, and 89.8%, respectively, and outperforms other mainstream algorithms on all evaluation metrics. The results can support accurate pear blossom recognition in the natural environment of pear orchards.
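The CBAM module described above combines channel attention with spatial attention, but the abstract does not give its implementation. The sketch below illustrates only the channel-attention half on a plain-Python feature map, with a hypothetical single shared weight `w` standing in for CBAM's shared MLP; it is a minimal illustration, not the authors' implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat, w=1.0):
    """Simplified CBAM-style channel attention.

    feat: list of channels, each a 2-D list (H x W) of activations.
    For each channel, average-pooled and max-pooled descriptors are passed
    through a shared transform (here a single hypothetical weight `w`,
    where CBAM uses a shared two-layer MLP), summed, and squashed with a
    sigmoid to produce a per-channel gate in (0, 1). The feature map is
    then re-weighted channel by channel.
    """
    gated = []
    for ch in feat:
        flat = [v for row in ch for v in row]
        avg = sum(flat) / len(flat)          # global average pooling
        mx = max(flat)                       # global max pooling
        gate = sigmoid(w * avg + w * mx)     # per-channel attention weight
        gated.append([[gate * v for v in row] for row in ch])
    return gated
```

In the full CBAM, a spatial-attention stage (channel-wise average/max maps followed by a 7x7 convolution and sigmoid) is applied after this channel gate.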

       

      Abstract: This study aims to improve the accuracy of pear blossom detection under the complex conditions of natural environments, where blossoms are easily occluded and backgrounds, lighting, and target distances vary. An improved recognition algorithm for pear blossoms was proposed based on the YOLOv7 model. Firstly, a P2 small-object layer was added to strengthen the feature extraction and multi-scale fusion capability of the model, so that occluded blossom targets could be captured more reliably. Secondly, a CBAM (convolutional block attention module) attention mechanism was introduced at the end of the detection head. CBAM improved the contextual understanding of the model and the performance of YOLOv7 in various scenarios (different lighting conditions and complex backgrounds). Lastly, the CIoU (complete intersection over union) loss function was replaced with the NWD (normalized Wasserstein distance) loss function. NWD performed more accurate bounding box regression for targets of different shapes, improving the detection accuracy of the model for targets against complex backgrounds and at long distances. Additionally, a dataset was created by photographing pear blossoms from different angles, backgrounds, lighting conditions, and distances during the peak blooming period. In total, 3,240 photos of Ya pear blossoms, 1,582 photos of Golden pear blossoms, and 2,184 photos of New pear No. 7 blossoms were collected. Because of the large number of Ya pear samples and their rich background elements, the Ya pear images were used to build a complex-environment pear blossom dataset, while images of all three varieties were used to build a multi-variety dataset. The improved YOLOv7 model was trained and tested on the complex-environment dataset.
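The abstract does not spell out the NWD formulation used. A common construction of the normalized Wasserstein distance for bounding boxes models each box as a 2-D Gaussian and uses the closed-form Wasserstein distance between Gaussians; the sketch below follows that construction, with the normalizing constant `c` an assumed hyperparameter rather than a value from the paper.

```python
import math

def nwd_loss(box_p, box_t, c=12.8):
    """NWD-style loss between a predicted and a target box.

    Boxes are (cx, cy, w, h). Each box is modeled as a 2-D Gaussian
    N((cx, cy), diag(w^2/4, h^2/4)); between two such Gaussians the
    squared 2-Wasserstein distance has the closed form below. `c` is a
    dataset-dependent normalizing constant (an assumption here).
    """
    cx1, cy1, w1, h1 = box_p
    cx2, cy2, w2, h2 = box_t
    # Closed-form squared Wasserstein distance for axis-aligned Gaussians.
    w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
             + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)
    # Normalize into (0, 1]; identical boxes give NWD = 1, loss = 0.
    nwd = math.exp(-math.sqrt(w2_sq) / c)
    return 1.0 - nwd
```

Unlike IoU-based losses, this distance stays informative even when the predicted and target boxes do not overlap, which is why it is often preferred for small or distant targets.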
Images of Ya pear blossoms under different lighting conditions, occlusions, backgrounds, and distances were randomly selected, and the original and improved YOLOv7 models were compared on them. The results showed that the improved YOLOv7 model achieved better detection performance and higher confidence. Ablation experiments validated the effectiveness of the three improvements: each one enhanced the original YOLOv7 model and increased the detection accuracy for pear blossom targets. Detection heatmaps of the two models showed that the responses of the improved YOLOv7 model were closer to the actual pear blossom regions and focused on extracting blossom edge features. Training and testing the improved YOLOv7 model on the different pear blossom datasets indicated excellent adaptability and robustness. Comparative experiments with mainstream algorithms were conducted to evaluate the performance of the improved model. Compared with the original model, the precision, recall, mAP, and F1-score of the improved YOLOv7 increased by 2.1, 1.2, 1.9, and 0.6 percentage points, reaching 99.4%, 99.6%, 96.4%, and 89.8%, respectively. The improved YOLOv7 model also outperformed the Faster R-CNN, SSD, YOLOv3, YOLOv4, YOLOv5, YOLOv8, YOLOv9, and YOLOv10 models on all evaluation metrics. A further series of experiments evaluated the applicability of the improved model to specific subsets of the data.
On the close-range and long-range datasets, the mAP of the improved YOLOv7 was 3.9 and 3.7 percentage points higher than that of YOLOv7, respectively; on the front-lit and back-lit datasets, it was 4.4 and 1.6 percentage points higher; and against ground, sky, and pear blossom backgrounds, it increased by 1.8, 1.4, and 1.5 percentage points, respectively. The model thus achieved high detection accuracy for pear blossoms in complex environments with varying backgrounds, distances, occlusions, and lighting conditions. The findings can provide support for accurately identifying pear blossoms in the natural environment of pear orchards.
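The precision, recall, and F1-score reported above follow the standard definitions computed from detection counts at a fixed confidence/IoU threshold; a minimal sketch:

```python
def detection_metrics(tp, fp, fn):
    """Standard detection metrics from raw counts.

    tp: correctly detected blossoms (true positives)
    fp: spurious detections (false positives)
    fn: missed blossoms (false negatives)
    """
    precision = tp / (tp + fp)                       # fraction of detections that are correct
    recall = tp / (tp + fn)                          # fraction of blossoms that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1
```

mAP additionally averages precision over recall levels (and over classes), so it cannot be reduced to single counts like this, but the three point metrics above are exactly these ratios.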

       

