Fruit and leaf organ segmentation method based on improved YOLOv8s

许楠; 苑迎春; 耿俊; 何振学

doi:10.11975/j.issn.1002-6819.202404002

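Summary: Multi-organ feature recognition is hampered by the scarcity of multi-organ datasets and by the dense small targets and multi-scale targets in the data to be processed. To address these problems, this study proposes a fruit and leaf organ segmentation method based on an improved YOLOv8s. Built on YOLOv8s, the method introduces a cross-stage local residual (residual CSPLayer 2Conv, RC2) module in the Backbone to widen the receptive field of each network layer, so that the network can fully extract dense small-target features. In the Neck, a scale spatial pyramid pooling (SSPP) module fuses the network's high-order multi-scale feature information and strengthens the detection of multi-scale targets. In the Head, an asymmetric decoupling detection head (ADDH) module makes classification focus on central content and regression focus on edge information. Experiments on a dataset of 17 kinds of fruit trees selected from the PlantCLEF2022 public dataset show that the improved YOLOv8s model achieves a mean average precision of 90.2% for recognizing fruit and leaf organs, 6.7 percentage points higher than the YOLOv8s model. The model was also applied to a self-built jujube dataset, where it reached a recognition accuracy of 99.1%, 6.6 percentage points higher than the original model. This demonstrates the generality of the proposed method, which can serve as a reference for organ segmentation of common fruit trees and for fine-grained variety classification based on multi-organ features.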

Abstract: Distinguishing different varieties of the same fruit tree is a fine-grained classification task. A single organ, such as the fruit, cannot fully capture the overall features of the plant, which makes it difficult to further improve recognition accuracy. Recognition based on multi-organ features is therefore a promising direction in current research. However, the features of different organs can interfere with each other in multi-organ feature recognition, and it remains challenging to accurately locate and distinguish them. In this study, a fruit and leaf organ segmentation method (RSA-YOLOv8s) was proposed on the basis of YOLOv8s object detection. Three parts of the model were improved: the Backbone, the Neck, and the Head. In the Backbone, a cross-stage local residual (Residual CSPLayer 2Conv, RC2) module was designed. It builds hierarchical residual-like connections within each network layer to widen the layer's receptive field, so that dense small-target features are fully extracted. In the Neck, a Scale Spatial Pyramid Pooling (SSPP) module was designed. It introduces 3D convolution to fuse the spatial and scale information of the network, and extracts higher-order multi-scale features to improve the detection of multi-scale targets. In the Head, an Asymmetric Decoupling Detection Head (ADDH) module was designed. It separates classification from regression with an asymmetric structure, so that classification focuses on the center content while regression focuses on the edge information. A total of 950 images of 17 kinds of fruit trees were selected from the PlantCLEF2022 public dataset.
The experimental results show that, for joint recognition of fruit and leaf organs, the precision, recall, F1 score, and mean average precision of the RSA-YOLOv8s model were 83.2%, 87.9%, 85.5%, and 90.2%, respectively, which were 5.6, 6.6, 6.1, and 6.7 percentage points higher than those of the original YOLOv8s model. For recognition of the fruit as a single organ, the four metrics were 88.2%, 91.7%, 89.9%, and 94.3%, respectively, which were 6.4, 6.6, 6.5, and 7.2 percentage points higher than those of the original model. For recognition of the leaf as a single organ, they were 78.2%, 84.1%, 81.0%, and 86.1%, respectively, which were 4.8, 6.6, 5.6, and 6.2 percentage points higher than those of the original model. In addition, the improved model was applied to a self-built jujube dataset of 20 varieties with a total of 3 881 images, where it achieved a recognition accuracy of 99.1%, 6.6 percentage points higher than that of the original model. These findings can provide a reference for organ segmentation and for fine-grained variety recognition based on multi-organ features, particularly for common fruit trees.
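The reported F1 values can be cross-checked against the reported precision and recall. The sketch below assumes the usual definition of F1 as the harmonic mean of precision and recall; the abstract does not state the formula, and the names `reported` and `f1_score` are illustrative rather than taken from the paper.

```python
# Precision, recall and F1 (in %) as reported in the abstract for RSA-YOLOv8s.
# Assumption: the second reported group is taken to be the fruit-only setting.
reported = {
    "fruit_and_leaf": {"precision": 83.2, "recall": 87.9, "f1": 85.5},
    "fruit": {"precision": 88.2, "recall": 91.7, "f1": 89.9},
    "leaf": {"precision": 78.2, "recall": 84.1, "f1": 81.0},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recomputed_f1(table: dict) -> dict:
    """Recompute F1 for every setting, rounded to one decimal like the abstract."""
    return {
        name: round(f1_score(row["precision"], row["recall"]), 1)
        for name, row in table.items()
    }


if __name__ == "__main__":
    for name, value in recomputed_f1(reported).items():
        print(f"{name}: recomputed F1 = {value}, reported F1 = {reported[name]['f1']}")
```

Under this assumption, the recomputed values agree with the reported F1 scores to one decimal place for all three settings.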

Segmenting fruit and leaf organ using improved YOLOv8s