
基于YOLOv5m和CBAM-CPN的单分蘖水稻表型参数提取

陈慧颖, 宋青峰, 常天根, 郑立华, 朱新广, 张漫, 王敏娟

陈慧颖,宋青峰,常天根,等. 基于YOLOv5m和CBAM-CPN的单分蘖水稻表型参数提取[J]. 农业工程学报,2024,40(2):307-314. DOI: 10.11975/j.issn.1002-6819.202304126
CHEN Huiying, SONG Qingfeng, CHANG Tiangen, et al. Extraction of the single-tiller rice phenotypic parameters based on YOLOv5m and CBAM-CPN[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2024, 40(2): 307-314. DOI: 10.11975/j.issn.1002-6819.202304126


基金项目: 国家自然科学基金项目(32201654);国家重点研发计划项目(2022YFD1900701)
详细信息
    作者简介:

    陈慧颖,研究方向为计算机视觉技术。Email:chenhuiying3171@163.com

    通讯作者:

    王敏娟,副教授,博士生导师,研究方向为计算机视觉技术。Email:minjuan@cau.edu.cn

  • 中图分类号: S511

Extraction of the single-tiller rice phenotypic parameters based on YOLOv5m and CBAM-CPN

  • 摘要:

    为快速获取单分蘖水稻植株的形态结构和表型参数,该研究提出了一种基于目标检测和关键点检测模型相结合的骨架提取和表型参数获取方法。该方法基于目标检测模型生成穗、茎秆、叶片的边界框和类别,将所得数据分别输入到关键点检测模型检测各部位关键点,按照语义信息依次连接关键点形成植株骨架,依据关键点坐标计算穗长度、茎秆长度、叶片长度、叶片-茎秆夹角4种表型参数。首先,构建单分蘖水稻的关键点检测和目标检测数据集;其次,训练Faster R-CNN、YOLOv3、YOLOv5s、YOLOv5m目标检测模型,经过对比,YOLOv5m的检测效果最好,平均精度均值(mean average precision,mAP)达到91.17%;然后,应用人体姿态估计的级联金字塔网络(cascaded pyramid network,CPN)提取植株骨架,并引入注意力机制CBAM(convolutional block attention module)进行改进,与沙漏网络(hourglass networks,HN)、堆叠沙漏网络模型(stacked hourglass networks,SHN)和CPN模型相比,CBAM-CPN模型的预测准确率分别提高了9.68、8.83和1.06个百分点,达到94.75%,4种表型参数的均方根误差分别为1.06 cm、0.81 cm、1.25 cm和2.94°。最后,结合YOLOv5m和CBAM-CPN进行预测,4种表型参数的均方根误差分别为1.48、1.05、1.74 cm和2.39°,与SHN模型相比,误差分别减小1.65 cm、3.43 cm、2.65 cm和4.75°,生成的骨架基本能够拟合单分蘖水稻植株的形态结构。所提方法可以提高单分蘖水稻植株的关键点检测准确率,更准确地获取植株骨架和表型参数,有助于加快水稻的育种和改良。

    Abstract:

    Rice is one of the most essential grain crops in China, providing an important guarantee for the food supply. Demand for the taste and nutritional quality of rice is ever-increasing with the development of society and the improvement of living standards. Therefore, it is necessary to accelerate breeding and improvement to ensure the quantity and quality of rice. Skeleton and phenotypic parameters can be used to represent the growth and health status of rice for better breeding and improvement. In this study, object detection and key point detection models were combined to extract the skeleton and the phenotypic parameters. Images of single-tiller rice were taken as the research object. The bounding boxes and categories of spikes, stems, and leaves were detected by the object detection model, and the key points of each part were then detected by the key point detection model. The predicted key points were connected to form the rice skeleton, according to the semantic information. Four phenotypic parameters were calculated, including spike length, stem length, leaf length, and leaf-stem angle, according to the key point coordinates. Firstly, 1 081 RGB images of single-tiller rice were collected in total. The datasets of single-tiller rice were created for object detection and key point detection. Secondly, four current mainstream object detection models were trained, namely Faster R-CNN, YOLOv3, YOLOv5s, and YOLOv5m. The best detection was achieved by YOLOv5m, with mean average precision (mAP) reaching 91.17%; compared with Faster R-CNN, YOLOv3, and YOLOv5s, the mAP of YOLOv5m was higher by 49.55, 36.38, and 2.69 percentage points, respectively. The predicted bounding boxes and categories were drawn on the original images to observe the predictions of the model. The visualization results showed that YOLOv5m basically detected the bounding boxes and categories of spikes, stems, and leaves correctly. Then, the cascaded pyramid network (CPN) model, originally developed for human pose estimation, was applied to plant skeleton extraction.
The attention mechanisms squeeze and excitation networks (SENet) and convolutional block attention module (CBAM) were integrated into the backbone to improve the feature extraction ability of the model. The key point prediction accuracies of SE-CPN and CBAM-CPN were both higher than that of CPN. Furthermore, CBAM-CPN achieved the highest prediction accuracy, with accuracies of 95.24%, 95.74%, and 93.27% for spike, stem, and leaf, respectively. The average accuracy reached 94.75%. The prediction accuracy of the CBAM-CPN model was improved by 9.68, 8.83, and 1.06 percentage points, respectively, compared with hourglass networks (HN), stacked hourglass networks (SHN), and CPN. The root mean square errors (RMSE) of the phenotypic parameters were 1.06 cm, 0.81 cm, 1.25 cm, and 2.94°, respectively. Lastly, when YOLOv5m and CBAM-CPN were combined, the RMSE of the four phenotypic parameters were 1.48 cm, 1.05 cm, 1.74 cm, and 2.39°. The errors were reduced by 1.65 cm, 3.43 cm, 2.65 cm, and 4.75°, respectively, compared with SHN. Moreover, the formed skeleton can be expected to better fit the morphological structure of single-tiller rice. These results further verified the feasibility of combining object detection and key point detection models to extract the skeleton and phenotypic parameters of single-tiller rice. In conclusion, the proposed method achieved higher detection accuracy for the key points of single-tiller rice plants, and the skeleton and phenotypic parameters were extracted more efficiently and accurately. The findings can provide a strong reference to accelerate the breeding and improvement of rice.

  • 水稻是重要的粮食作物之一, 是全国粮食供给的重要保障[1]。随着社会进步和居民生活水平的提高,人们更加注重水稻的口感和营养[2]。因此需要加快水稻的育种和改良,以保障稻谷的产量和质量。

    骨架提取可以去除背景复杂冗余信息,一定程度上排除背景的影响,直观反映植株的形态特征[3];植株表型大体分为整体表型和组成成分表型[4],通过研究表型,可以了解植株的基因型与环境相互作用产生的植株物理、生理、生化特征和性状[5-6],可以反映植株生长及健康状况,是精准设计育种和作物生产精准管理的重要技术支撑[7]。刁智华等[8]通过实现玉米作物行骨架提取,满足了玉米对行精准管理的需求。宋晨旭等[9]通过提取大豆籽粒个数、籽粒面积、籽粒残缺情况等表型参数,为大豆籽粒自动化考种提供了参考依据。因此,快速准确地获取骨架和表型数据对于水稻的育种和改良有重要的指导意义。

    传统骨架提取算法对高分辨率图片的提取效率低、适用性差。而手工测量获取水稻表型参数的方式成本高、效率低、易受主观因素影响。随着计算机技术的发展,深度学习发展迅速,为设计较优的骨架提取和表型参数计算方法提供了技术支撑。

    关键点可以表征物体的重要特征,其位置和数量因具体任务和应用场景而有所不同。通过检测和分析关键点,可以实现特定的任务。从图像处理到引入深度学习模型,关键点检测算法不断创新[10]。于乃功等[11]以双目立体视觉测量为研究背景,提出了基于色彩空间阈值分割的关键点检测定位算法,该算法通过颜色阈值分割、霍夫变换提取线段、最小二乘法等计算像素坐标。而基于深度学习的关键点检测主要应用于人体姿态估计。TOSHEV等[12]首次提出使用深度学习网络检测人体关键点。NEWELL等[13]提出了经典单人关键点检测模型堆叠沙漏网络模型(stacked hourglass networks,SHN)。CHEN等[14]提出了多人检测模型级联金字塔网络模型(cascaded pyramid network,CPN)。本文将人体姿态估计的关键点检测模型应用到植株关键点检测。虽然人体姿态估计和植株骨架获取有一定的相似性,但是水稻植株存在形态特征不统一的问题,比如水稻是否有穗以及叶片数目无法统一,这导致了关键点个数的不一致。针对这一问题,王敏娟等[15]引用沙漏网络(hourglass networks,HN)和堆叠沙漏网络模型(stacked hourglass networks,SHN),设置缺失点为不可见点,这一方法可能会导致某一位置的关键点特征提取不充分。为解决该问题,本文将引入目标检测模型识别水稻的穗、茎秆和叶片,对三者分别进行关键点识别。基于深度学习的目标检测算法可以分为两类:两阶段和一阶段。两阶段的经典模型有R-CNN[16]、Fast R-CNN[17]、Faster R-CNN[18]、Mask RCNN[19]等。一阶段的代表模型有YOLO系列[20-22]等。目前,目标检测已在作物表型获取[23-24]和病虫害检测[25-26]等领域得到了广泛的应用。

    注意力机制因其即插即用的特性,在深度学习中得到了广泛的应用。在特征提取网络中添加注意力机制可以使模型自适应的以加权的方式强化重要信息同时抑制非重要信息,突出重要特征以提升网络的特征提取能力。朱德利等[27]在YOLOX-s的基础上嵌入坐标注意力机制,提升了对玉米花丝的检测准确度。胡广锐等[28]将高效通道注意力和混洗注意力机制引入到YOLOX-Tiny模型,有效增强网络对苹果的定位能力。

    综上,本文提出一种基于目标检测和关键点检测相结合的水稻单分蘖骨架和表型参数获取方法,基于目标检测模型生成穗、茎秆、叶片的边界框和类别,将生成的数据分别输入到关键点检测模型检测目标的关键点,按照特定的语义信息依次连接关键点形成植株骨架,依据关键点坐标计算穗长度、茎秆长度、叶片长度和叶片-茎秆夹角4种表型参数。通过对比多种目标检测模型的试验结果,选择准确率最高的模型展开效果试验,对比添加不同注意力机制的关键点检测模型,实现单分蘖水稻植株骨架和表型参数的获取。

    单分蘖水稻图像的采集地点为中科院分子植物科学卓越中心松江实验基地(121°8′E,30°56′N)。共采集1081张抽穗期RGB图像。数据采集分3个批次,分别拍摄394、331、356张水稻图像:以150 cm长的黑色背景板为参照物,将水稻自然平铺在背景板上,3个批次数据集的像素点和实际尺寸的比例系数分别为35.185、34.043、32.132像素/cm。该比例系数用于计算表型参数。
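    比例系数的用法可用如下代码示意(假设性示例,函数名与取值仅为演示,非原文实现):

```python
def pixels_to_cm(pixel_length: float, scale: float) -> float:
    """按比例系数scale(像素/cm)将像素长度换算为厘米。"""
    return pixel_length / scale

# 以第1批次图像为例:703.7像素的线段约对应20 cm
length_cm = pixels_to_cm(703.7, 35.185)
```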

    借助Python脚本对原始图像进行裁剪,使得水稻植株四周各保留20像素,进一步排除背景中复杂环境的影响。裁剪效果如图1所示。

    图  1  图像预处理
    Figure  1.  Image preprocessing
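    裁剪步骤可参考如下示意实现(假设性代码,以NumPy数组表示图像,边界框接口为示例,非原文脚本):

```python
import numpy as np

def crop_with_margin(image: np.ndarray, bbox, margin: int = 20) -> np.ndarray:
    """按边界框向四周外扩margin像素后裁剪图像,越界处截断到图像边缘。
    bbox为(x_min, y_min, x_max, y_max)像素坐标。"""
    h, w = image.shape[:2]
    x0, y0, x1, y1 = bbox
    x0 = max(0, x0 - margin)
    y0 = max(0, y0 - margin)
    x1 = min(w, x1 + margin)
    y1 = min(h, y1 + margin)
    return image[y0:y1, x0:x1]
```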

    将数据集按照8:2分为训练集865张图片,测试集216张图片。使用labelme工具标注穗、茎秆、叶片的bbox边界框,若图像中有多个叶片,则以L1、L2等标号区分。叶片在茎秆上的生长位置距离茎秆根部最近的记为L1,次之的记为L2,以此类推。经过统计,训练集共标注765个穗、865个茎秆、4210个叶片;测试集共标注189个穗、216个茎秆、1071个叶片。

    已有研究[15]分析了关键点个数为2、3、5时的骨架提取效果,选取2或3个关键点时虽然关键点识别的准确率较高,但是针对叶片弯曲的情况,骨架拟合效果不佳,选取5个关键点时基本能够很好地拟合出水稻的骨架,减小了由于叶片弯曲而产生的误差,而过多的关键点个数可能会导致模型的实时性下降,因此本文选取5个关键点展开研究。

    数据集的训练集和测试集划分方法与目标检测数据集一致,在已框选出目标边界框的基础上,使用labelme工具分别在穗、茎秆、叶片上标注5个从首到尾、均匀分布的关键点。由于穗、茎秆、叶片的类别不同,提取出的特征也存在差异,因此,训练模型时将分别制作穗、茎秆、叶片的数据集。

    1)目标检测数据增强方法

    为增强网络模型的鲁棒性和泛化能力,4种目标检测模型均采用水平翻转、随机缩放、色域变换、随机模糊等数据增强方法。特别地,YOLOv5模型采用了Mosaic数据增强:随机选取4张图片进行拼接,再对拼接后的图片随机缩放或平移,得到新的图片用于模型训练。
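    Mosaic拼接的核心流程可示意如下(假设性简化实现:仅拼接图像,用最近邻缩放代替常用的cv2.resize,并省略了边界框坐标的同步变换,实际训练中必须同步变换):

```python
import numpy as np

def mosaic4(images, out_size: int = 640) -> np.ndarray:
    """将4张图随机确定拼接中心后缩放拼成一张out_size×out_size的图。"""
    rng = np.random.default_rng()
    # 随机选取拼接中心点,划分4个象限
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)  # 灰色填充
    regions = [(0, 0, cx, cy), (cx, 0, out_size, cy),
               (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    for img, (x0, y0, x1, y1) in zip(images, regions):
        h, w = y1 - y0, x1 - x0
        # 最近邻缩放,将子图填入对应象限
        ys = np.arange(h) * img.shape[0] // max(h, 1)
        xs = np.arange(w) * img.shape[1] // max(w, 1)
        canvas[y0:y1, x0:x1] = img[ys[:, None], xs[None, :]]
    return canvas
```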

    2)关键点检测数据增强方法

    模型的输入尺寸为384×288(高×宽),为统一输入尺寸,图像在输入模型前需要进行裁剪,调整大小到384×288。

    为丰富样本量,在开始训练网络前,增加基础的数据增强操作,包括水平翻转、随机角度的旋转等。为避免人为标定边界框不完整所造成的预测误差,按照一定的比例扩大边界框。考虑实际应用中拍摄模糊和光照的问题,增加调整对比度和高斯模糊2种数据增强方法。数据增强效果如图2所示。

    图  2  数据增强效果
    Figure  2.  Effects of data enhancement
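    其中按比例扩大边界框的操作可示意如下(假设性代码,扩大比例ratio=0.1仅为演示取值,原文未给出具体数值):

```python
def expand_bbox(bbox, img_w: int, img_h: int, ratio: float = 0.1):
    """按宽、高各向外扩ratio比例放大边界框,避免人工标注框不完整造成的截断;
    越界处截断到图像边缘。bbox为(x_min, y_min, x_max, y_max)。"""
    x0, y0, x1, y1 = bbox
    dw = (x1 - x0) * ratio
    dh = (y1 - y0) * ratio
    return (max(0.0, x0 - dw), max(0.0, y0 - dh),
            min(float(img_w), x1 + dw), min(float(img_h), y1 + dh))
```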

    级联金字塔网络模型[14]是一种自上而下的关键点检测模型,需要先检测到边界框,然后再对目标进行关键点检测。

    CPN模型包括GlobalNet和RefineNet两个模块。GlobalNet采用ResNet网络提取特征,ResNet每一层特征图经过卷积生成关键点概率分布图(Heatmap),采用FPN(feature pyramid networks)网络实现不同层次特征图的信息融合。RefineNet接收GlobalNet的多层次特征图,加入不同数量的Bottleneck模块以提取更深层的特征,通过concat层把所有信息融合,得到预测结果。为了增强ResNet特征提取网络的表达能力,在ResNet中嵌入轻量级注意力机制,使模型以加权的方式强化重要信息,如图3中主干网络部分中的CBAM(convolutional block attention module)[29]模块。

    图  3  CBAM-CPN整体结构
    注:Conv1、Conv2_x、Conv3_x、Conv4_x、Conv5_x表示ResNet的层名称,Bottleneck表示瓶颈块,Heatmap表示热图,Upsampling i (i=2, 4, 8) times表示i倍上采样。
    Figure  3.  Overall structure of CBAM-CPN (convolutional block attention module-cascaded pyramid network)
    Note: Conv1, Conv2_x, Conv3_x, Conv4_x, Conv5_x represent the layer names of ResNet, Bottleneck represents the bottleneck block, Heatmap represents the heat map, Upsampling i (i=2, 4, 8) times represents upsampling by a factor of i.
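    GlobalNet中FPN自顶向下的特征融合可用如下简化代码示意(假设性实现:用最近邻上采样并直接逐元素相加,省略了横向1×1卷积等细节):

```python
import numpy as np

def fpn_fuse(features):
    """FPN特征融合示意:features按分辨率从高到低排列,每层形状为(C, H, W),
    相邻层分辨率相差2倍;自顶向下逐层2倍上采样并与下层特征相加。"""
    fused = features[-1]          # 从分辨率最低的顶层开始
    outs = [fused]
    for f in reversed(features[:-1]):
        # 2倍最近邻上采样后与下层特征相加
        fused = np.repeat(np.repeat(fused, 2, axis=1), 2, axis=2) + f
        outs.append(fused)
    return outs[::-1]             # 仍按分辨率从高到低返回
```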

    注意力机制可以使网络模型突出某些重要特征,提高特征提取的能力,从而提高模型的检测精度。

    本文在ResNet的残差模块中加入即插即用的SENet(squeeze and excitation networks)模块[30]和CBAM模块,形成SE-ResNet和CBAM-ResNet特征提取网络。SENet是一种通道类型的注意力机制,即在通道维度上增加注意力机制。经过训练学习,该模块为每个特征通道的注意力分配一个权重,使得每个特征通道的重要性不一样,从而让卷积神经网络重点关注权重值较大的通道,抑制作用不大的特征通道。
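    SENet模块的前向计算可示意如下(NumPy版假设性实现,w1、w2为示例的全连接权重,r为通道压缩比,非原文代码):

```python
import numpy as np

def se_block(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """SE通道注意力:x为(C, H, W)特征图;w1为(C//r, C)降维权重,w2为(C, C//r)升维权重。"""
    z = x.mean(axis=(1, 2))                 # Squeeze:全局平均池化得到通道描述子
    s = np.maximum(w1 @ z, 0.0)             # Excitation:降维全连接 + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))     # 升维全连接 + Sigmoid,得到通道权重
    return x * s[:, None, None]             # 按通道加权原特征图
```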

    CBAM由通道注意力(channel attention module,CAM)和空间注意力(spatial attention module,SAM)两部分组成。CBAM在残差模块中的位置如图4所示。在通道注意力模块,首先将特征图进行最大池化和平均池化,然后经过多层感知机进行学习,将学习后的特征进行叠加,最后经过激活函数得到通道注意力权重。在空间注意力模块,首先将输入特征进行最大池化和平均池化,将得到的特征图进行拼接,再进行卷积操作,最后经过激活函数得到空间注意力权重。与SENet模块相比,该模块不仅为每个特征通道分配权重,还为每个特征位置分配权重,提高特征提取的效果。

    图  4  加入CBAM的残差模块
    注:Residual表示残差模块,MaxPool表示最大池化操作,AvgPool表示平均池化操作,MLP表示多层感知机,Concat表示特征融合操作,Conv表示卷积操作。
    Figure  4.  Residual module with CBAM (convolutional block attention module)
    Note: Residual represents residual module, MaxPool represents max pooling operation, AvgPool represents average pooling operation, MLP represents multi-layer perceptron, Concat represents the feature map fusion operation, Conv represents convolution.
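    CBAM先通道注意力(CAM)后空间注意力(SAM)的前向计算可示意如下(NumPy版假设性实现:空间注意力用逐通道加权求和代替原文中的卷积,仅为演示计算顺序,非原文代码):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cbam(x, w_mlp1, w_mlp2, w_conv):
    """x为(C, H, W)特征图;w_mlp1/w_mlp2为共享MLP权重;w_conv为(2,)的简化融合权重
    (实际CBAM此处为7×7卷积)。"""
    # --- 通道注意力:最大池化与平均池化分别过共享MLP后相加,再Sigmoid ---
    mx = x.max(axis=(1, 2))
    av = x.mean(axis=(1, 2))
    mlp = lambda v: w_mlp2 @ np.maximum(w_mlp1 @ v, 0.0)
    ca = sigmoid(mlp(mx) + mlp(av))                          # (C,) 通道权重
    x = x * ca[:, None, None]
    # --- 空间注意力:沿通道维做最大/平均池化并拼接,再加权融合、Sigmoid ---
    sx = np.stack([x.max(axis=0), x.mean(axis=0)], axis=0)   # (2, H, W)
    sa = sigmoid((w_conv[:, None, None] * sx).sum(axis=0))   # (H, W) 空间权重
    return x * sa[None, :, :]
```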

    本文将根据真实关键点坐标、模型预测的关键点坐标、参照物像素点和真实空间的比例(单位:像素/cm),计算穗长度、茎秆长度、叶片长度、叶片-茎秆夹角4种表型参数,表型参数示意图如图5所示。并计算4种表型参数的均方根误差,对关键点检测模型的预测结果进行评估。

    图  5  表型参数示意图
    注:a为穗长度,cm;b为茎秆长度,cm;c为叶片长度,cm;d为叶片-茎秆夹角,(°)。
    Figure  5.  Schematic diagram of phenotypic parameters
    Note: a is the spike length, cm; b is the stem length, cm; c is the leaf length, cm; d is the leaf-stem angle, (°).

    穗长度、茎秆长度、叶片长度的计算为式(1)。

    $L=\sum\limits_{i=1}^{n-1}\sqrt{(x_i-x_{i+1})^2+(y_i-y_{i+1})^2}$ (1)

    式中L表示穗、茎秆或叶片的长度(按比例系数换算为cm);n表示每个穗、茎秆或叶片上的关键点个数,n=5;$x_i$、$y_i$分别表示第i个关键点的横坐标和纵坐标。
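    式(1)的计算可示意如下(假设性代码,points为5个关键点的像素坐标,scale为比例系数,像素/cm):

```python
import numpy as np

def part_length(points, scale: float) -> float:
    """按式(1)累加相邻关键点的欧氏距离,再按比例系数换算为厘米。"""
    pts = np.asarray(points, dtype=float)
    seg = np.sqrt(((pts[1:] - pts[:-1]) ** 2).sum(axis=1))  # 相邻关键点距离
    return float(seg.sum() / scale)
```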

    叶片-茎秆夹角定义为叶片在茎节点处的切线与茎秆的夹角,规定叶片第1、2个关键点的连线所在直线为叶片在茎节点处的切线,如果叶片第1、2个关键点的连线所在直线与茎秆上连续2个关键点连线的交点坐标在这两个关键点之间,则把这2个关键点的连线视为茎秆所在直线,计算方法见式(2)。

    $\theta=\arctan\left|\dfrac{m_2-m_1}{1+m_1 m_2}\right|$ (2)

    式中$\theta$表示叶片-茎秆夹角,(°);$m_1$、$m_2$分别表示叶片在茎节点处的切线斜率和茎秆的斜率。
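    式(2)的计算可示意如下(假设性代码;两直线垂直即$1+m_1 m_2=0$时需单独处理,此处从简):

```python
import math

def leaf_stem_angle(m1: float, m2: float) -> float:
    """按式(2)由叶片切线斜率m1和茎秆斜率m2计算夹角(°),取绝对值保证为锐角。"""
    return math.degrees(math.atan(abs((m2 - m1) / (1.0 + m1 * m2))))
```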

    本试验在配置有NVIDIA Geforce RTX 3090的服务器上进行。4种模型均基于Pytorch框架进行训练。

    Faster R-CNN和YOLOv3模型的输入尺寸分别设定为600×600和640×640,训练过程均分为冻结阶段和解冻阶段:冻结阶段数据批次大小为16,迭代次数为50;解冻阶段数据批次大小为8,迭代次数为350,均采用SGD优化器,初始学习率为0.01。YOLOv5s和YOLOv5m模型的输入尺寸设定为640×640,数据批次大小为32,迭代次数为400,采用SGD优化器,初始学习率为0.01。

    采用平均精度(average precision,AP)和平均精度均值(mean average precision,mAP)评估模型性能。设置交并比(intersection over union,IOU)为0.75,模型预测结果如表1所示。可以看出,Faster R-CNN和YOLOv3对穗和茎秆的检测能力不足,平均精度较低,YOLOv5s和YOLOv5m对穗、茎秆和叶片的平均检测精度均在80%以上。与Faster R-CNN、YOLOv3、YOLOv5s相比,YOLOv5m的检测精度分别提升49.55、36.38和2.69个百分点,达到 91.17%。

    表  1  各种模型目标检测结果对比
    Table  1.  Comparison of object detection results of various models
    模型Models 类别Classes AP/% mAP/%
    Faster R-CNN 穗 37.92 41.62
                 茎秆 14.29
                 叶片 72.64
    YOLOv3 穗 52.74 54.79
           茎秆 28.77
           叶片 82.87
    YOLOv5s 穗 84.19 88.48
            茎秆 86.23
            叶片 95.02
    YOLOv5m 穗 87.84 91.17
            茎秆 90.14
            叶片 95.54
    注:AP代表IOU阈值为0.75时的平均精度。mAP代表3种类别的AP均值。
    Note: AP is the average precision for IOU (intersection over union) threshold of 0.75. mAP is the average AP of the three categories.
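    评估中交并比IOU的计算方式可示意如下(假设性代码,边界框以(x0, y0, x1, y1)表示):

```python
def iou(box_a, box_b) -> float:
    """计算两个边界框的交并比:交集面积除以并集面积。"""
    x0 = max(box_a[0], box_b[0]); y0 = max(box_a[1], box_b[1])
    x1 = min(box_a[2], box_b[2]); y1 = min(box_a[3], box_b[3])
    inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)      # 交集面积,不相交时为0
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```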

    将预测出的边界框绘制在原始图像上,观察模型的预测效果。从图6可视化结果可以看出,Faster R-CNN模型容易出现漏检、误检以及检测框偏差较大的情况;YOLOv3模型对于茎秆的检测不准确,出现了预测范围过大的现象;YOLOv5s和YOLOv5m表现出了良好的检测效果,基本能够准确地识别出穗、茎秆、叶片的位置和类别。

    图  6  不同模型的检测结果对比
    Figure  6.  Comparison of detection results with various models

    综上,选择YOLOv5m作为目标检测阶段的模型,为关键点检测模型提供穗、茎秆、叶片的相关数据。

    本试验在配置有NVIDIA Geforce RTX 3090的服务器上进行。模型基于Pytorch框架进行训练。分别选用ResNet50和添加两种注意力机制的ResNet50作为主干网络进行试验,模型的输入尺寸设定为384×288,特征图的输出尺寸设定为96×72。采用Adam优化器,初始学习率为0.0005。损失函数采用均方误差(mean square error,MSE)。数据批次大小为16,迭代次数为70。

    采用关键点正确估计比例(percentage of correct keypoints,PCK)评价模型的准确率。将预测关键点与真实关键点的欧氏距离归一化,归一化值小于设定阈值的比例即为准确率。模型输出还包括满足二维高斯分布的关键点概率分布图(heatmap)。分别训练穗、茎秆、叶片的数据集,保存训练好的模型参数进行测试。关键点预测准确率如表2所示。可以看出,添加注意力机制的CPN模型比原CPN模型对水稻植株关键点的预测准确率均有不同程度的提升,其中,CBAM-CPN模型最优,对比HN、SHN和CPN模型,CBAM-CPN模型的预测准确率分别提高9.68、8.83和1.06个百分点,进一步验证了穗、茎秆和叶片分别提取的有效性。
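    PCK指标的计算可示意如下(假设性代码,阈值threshold与归一化尺度norm的具体取值以原文设置为准):

```python
import numpy as np

def pck(pred, gt, norm: float, threshold: float = 0.1) -> float:
    """关键点正确估计比例:预测点与真实点的欧氏距离除以归一化尺度norm,
    小于阈值threshold的比例即为准确率。"""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=1) / norm
    return float((dist < threshold).mean())
```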

    表  2  关键点预测准确率
    Table  2.  Key point prediction accuracy
    模型Models 穗Spike/% 茎秆Stem/% 叶片Leaf/% 准确率Accuracy/%
    HN — — — 85.07
    SHN — — — 85.92
    CPN 93.97 95.04 92.05 93.69
    SE-CPN 94.39 95.50 93.27 94.39
    CBAM-CPN 95.24 95.74 93.27 94.75
    注:“—”表示该模型未按部位分别统计。

    CBAM-CPN模型预测结果的热图如图7所示。预测可能为关键点的较高概率点基本都在水稻植株上。

    图  7  单分蘖水稻植株关键点热图
    Figure  7.  Heatmap of key points for single-tiller rice

    整合CBAM-CPN模型的预测结果,把关键点绘制在水稻图像上。图8为水稻植株关键点的真实值与预测值的位置对比。从可视化结果可以看出,中间位置的部分预测关键点相比真实值存在偏移,但仍在水稻植株上。按照语义顺序依次连接每个穗、茎秆、叶片的5个关键点,形成植株骨架。根据形成的骨架图9来看,模型对单分蘖水稻骨架的提取效果较好。

    图  8  真实关键点和预测关键点位置对比
    注:绿色表示真实关键点,红色表示预测关键点。
    Figure  8.  Comparison of actual key points and predicted key points
    Note: Green indicates actual key points and red indicates predicted key points.
    图  9  单分蘖水稻植株的骨架提取结果
    Figure  9.  Skeleton extraction results of single-tiller rice

    根据1.3节方法计算4种表型参数,表3是预测表型参数与实际表型参数的均方根误差。可以看出,相比于原CPN模型,SE-CPN和CBAM-CPN模型预测的表型参数均方根误差都有所降低,其中,CBAM-CPN模型在单分蘖水稻的表型参数预测上效果最好。

    表  3  基于CBAM-CPN的单分蘖水稻表型参数均方根误差
    Table  3.  RMSE (root mean square errors) of phenotypic parameters of single-tiller rice based on CBAM-CPN
    表型参数Phenotypic parameters CPN SE-CPN CBAM-CPN
    穗长Spike length/cm 1.38 1.14 1.06
    茎秆长Stem length/cm 1.04 1.08 0.81
    叶片长Leaf length/cm 1.51 1.50 1.25
    叶片-茎秆夹角Leaf-Stem angle/(°) 3.31 3.18 2.94

    以YOLOv5m模型识别的穗、茎秆和叶片数据作为CBAM-CPN模型的输入。表4为基于关键点坐标计算得到的4种表型参数的均方根误差。YOLOv5m+CBAM-CPN的预测效果较好,各参数的均方根误差与HN相比分别减小1.77 cm、3.19 cm、5.22 cm和5.27°;与SHN相比分别减小1.65 cm、3.43 cm、2.65 cm和4.75°。图10为不同模型的水稻植株骨架提取结果。可以看出,HN和SHN的预测效果相近,由于叶片数目不一致,不可见点较多,部分叶片关键点特征提取不够充分,而YOLOv5m和CBAM-CPN结合可以解决这一问题,提取的植株骨架基本能够拟合水稻真实形态结构。

    表  4  不同模型单分蘖水稻表型参数均方根误差
    Table  4.  RMSE (root mean square errors) of phenotypic parameters of single-tiller rice based on various models
    表型参数Phenotypic parameters HN SHN YOLOv5m+CBAM-CPN
    穗长Spike length/cm 3.25 3.13 1.48
    茎秆长Stem length/cm 4.24 4.48 1.05
    叶片长Leaf length/cm 6.96 4.39 1.74
    叶片-茎秆夹角Leaf-Stem angle/(°) 7.66 7.14 2.39
    图  10  不同模型的单分蘖水稻植株骨架提取结果
    Figure  10.  Skeleton extraction results of single-tiller rice based on various models

    为快速准确获取单分蘖水稻植株的骨架和表型参数,本文提出了一种基于YOLOv5m和CBAM-CPN的骨架获取和表型参数计算方法,主要结论如下:

    1)分别训练Faster R-CNN、YOLOv3、YOLOv5s和YOLOv5m目标检测模型,识别单分蘖水稻的穗、茎秆、叶片,YOLOv5m的平均精度均值(mean average precision,mAP)最高,达到91.17%,与Faster R-CNN、YOLOv3、YOLOv5s相比,YOLOv5m的检测精度分别提升49.55、36.38和2.69个百分点。可视化结果表明该模型基本能够准确识别出穗、茎秆、叶片的位置和类别。

    2)在级联金字塔网络模型(cascaded pyramid network,CPN)的基础上,引入注意力机制CBAM(convolutional block attention module)。对比沙漏网络模型(hourglass networks,HN)、堆叠沙漏网络模型(stacked hourglass networks,SHN)和CPN模型,CBAM-CPN模型的预测准确率分别提高了9.68、8.83和1.06个百分点,达到94.75%,在穗、茎秆、叶片上的准确率分别达到了95.24%、95.74%、93.27%,穗长、茎秆长、叶片长和叶片-茎秆夹角4种表型参数的均方根误差分别为1.06 cm,0.81 cm,1.25 cm和2.94°。

    3)结合上述YOLOv5m和CBAM-CPN模型,获取单分蘖水稻植株骨架和表型参数,穗长、茎秆长、叶片长和叶片-茎秆夹角4种表型参数的均方根误差分别为1.48 cm,1.05 cm,1.74 cm和2.39°,与SHN相比,误差分别减小1.65 cm,3.43 cm,2.65 cm和4.75°,预测效果较好,按照语义信息连接关键点形成的骨架基本能够较好地拟合单分蘖水稻的形态结构,能够准确提取水稻骨架和部分表型参数,有利于加快水稻的育种和改良。

  • [1] 李建. 促进水稻科技创新和产业发展的专业性期刊[J]. 农业图书情报学刊,2005,17(12):157-160.

    LI Jian. Professional periodicals of promoting rice sci-tech innovation and industrial development[J]. Journal of Library and Information Science in Agriculture, 2005, 17(12): 157-160. (in Chinese with English abstract)

    [2] 徐春春,纪龙,陈中督,等. 2021年我国水稻产业形势分析及2022年展望[J]. 中国稻米,2022,28(2):16-19.

    XU Chunchun, JI Long, CHEN Zhongdu, et al. Analysis of China’s rice industry in 2021 and the outlook for 2022[J]. China Rice, 2022, 28(2): 16-19. (in Chinese with English abstract)

    [3] 方志力,温维亮,郭新宇,等. 基于Kinect的三维玉米植株骨架提取[J]. 系统仿真学报,2017,29(3):524-530.

    FANG Zhili, WEN Weiliang, GUO Xinyu, et al. Skeleton extraction from three-dimensional maize based on Kinect[J]. Journal of System Simulation, 2017, 29(3): 524-530. (in Chinese with English abstract)

    [4] DAS CHOUDHURY S, BASHYAM S, QIU Y, et al. Holistic and component plant phenotyping using temporal image sequence[J]. Plant Methods, 2018, 14(1): 1-21. doi: 10.1186/s13007-017-0271-6

    [5] 潘映红. 论植物表型组和植物表型组学的概念与范畴[J]. 作物学报,2015,41(2):175-186. doi: 10.3724/SP.J.1006.2015.00175

    PAN Yinghong. Analysis of concepts and categories of plant phenome and phenomics[J]. Acta Agronomica Sinica, 2015, 41(2): 175-186. (in Chinese with English abstract) doi: 10.3724/SP.J.1006.2015.00175

    [6] DEVAKI P, ARUNACHALAM P, SANKAR K S A, et al. A deep learning approach for yield estimation and phenotype analysis in rice crops[C]//2021 International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA). IEEE, Coimbatore, 2021: 1-6.

    [7] 岑海燕,朱月明,孙大伟,等. 深度学习在植物表型研究中的应用现状与展望[J]. 农业工程学报,2020,36(9):1-16.

    CEN Haiyan, ZHU Yueming, SUN Dawei, et al. Current status and future perspective of the application of deep learning in plant phenotype research[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(9): 1-16. (in Chinese with English abstract)

    [8] 刁智华,吴贝贝,毋媛媛,等. 基于最大正方形的玉米作物行骨架提取算法[J]. 农业工程学报,2015,31(23):168-172. doi: 10.11975/j.issn.1002-6819.2015.23.022

    DIAO Zhihua, WU Beibei, WU Yuanyuan, et al. Skeleton extraction algorithm of corn crop rows based on maximum square[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(23): 168-172. (in Chinese with English abstract) doi: 10.11975/j.issn.1002-6819.2015.23.022

    [9] 宋晨旭,于翀宇,邢永超,等. 基于OpenCV的大豆籽粒多表型参数获取算法[J]. 农业工程学报,2022,38(20):156-163.

    SONG Chenxu, YU Chongyu, XING Yongchao, et al. Algorithm for acquiring multi-phenotype parameters of soybean seed based on OpenCV[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(20): 156-163. (in Chinese with English abstract)

    [10] 史小强,黄钢,苏可怡. 二维人体关键点检测算法综述[J]. 软件工程,2023,26(6):6-10. doi: 10.19644/j.cnki.issn2096-1472.2023.006.002

    SHI Xiaoqiang, HUANG Gang, SU Keyi. Overview of key point detection algorithms for 2D human body[J]. Software Engineering, 2023, 26(6): 6-10. (in Chinese with English abstract) doi: 10.19644/j.cnki.issn2096-1472.2023.006.002

    [11] 于乃功,马春燕,林佳. 基于双目视觉的关键点的检测方法及定位研究[J]. 计算机测量与控制,2011,19(7):1565-1568.

    YU Naigong, MA Chunyan, LIN Jia. Study on key point detection and localization based on binocular vision[J]. Computer Measurement & Control, 2011, 19(7): 1565-1568. (in Chinese with English abstract)

    [12] TOSHEV A, SZEGEDY C. DeepPose: Human pose estimation via deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, 2014: 1653-1660.

    [13] NEWELL A, YANG K, DENG J. Stacked hourglass networks for human pose estimation[C]//European Conference on Computer Vision. Cham: Springer, 2016: 483-499.

    [14] CHEN Y, WANG Z, PENG Y, et al. Cascaded pyramid network for multi-person pose estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 7103-7112.

    [15] 王敏娟,刘小丫,马啸霄,等. 基于堆叠沙漏网络的单分蘖水稻植株骨架提取[J]. 农业工程学报,2021,37(24):149-157. doi: 10.11975/j.issn.1002-6819.2021.24.017

    WANG Minjuan, LIU Xiaoya, MA Xiaoxiao, et al. Skeleton extraction method of single tillers rice based on stacked hourglass network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(24): 149-157. (in Chinese with English abstract) doi: 10.11975/j.issn.1002-6819.2021.24.017

    [16] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, 2014: 580-587.

    [17] GIRSHICK R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, Santiago, 2015: 1440-1448.

    [18] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.

    [19] HE K, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, Venice, 2017: 2961-2969.

    [20] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, 2016: 779-788.

    [21] REDMON J, FARHADI A. YOLO9000: Better, faster, stronger[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, 2017: 7263-7271.

    [22] REDMON J, FARHADI A. YOLOv3: An incremental improvement[EB/OL]. https://arxiv.org/abs/1804.02767, 2018-04-08.

    [23] LI L, HASSAN M A, YANG S, et al. Development of image-based wheat spike counter through a Faster R-CNN algorithm and application for genetic studies[J]. The Crop Journal, 2022, 10(5): 1303-1311. doi: 10.1016/j.cj.2022.07.007

    [24] 汪斌斌,杨贵军,杨浩,等. 基于YOLO_X和迁移学习的无人机影像玉米雄穗检测[J]. 农业工程学报,2022,38(15):53-62. doi: 10.11975/j.issn.1002-6819.2022.15.006

    WANG Binbin, YANG Guijun, YANG Hao, et al. UAV images for detecting maize tassel based on YOLO_X and transfer learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(15): 53-62. (in Chinese with English abstract) doi: 10.11975/j.issn.1002-6819.2022.15.006

    [25] KUMAR D, KUKREJA V. Image-based wheat mosaic virus detection with Mask-RCNN model[C]//2022 International Conference on Decision Aid Sciences and Applications (DASA). IEEE, Chiangrai, 2022: 178-182.

    [26] 姚青,吴叔珍,蒯乃阳,等. 基于改进CornerNet的水稻灯诱飞虱自动检测方法构建与验证[J]. 农业工程学报,2021,37(7):183-189. doi: 10.11975/j.issn.1002-6819.2021.07.022

    YAO Qing, WU Shuzhen, KUAI Naiyang, et al. Automatic detection of rice planthoppers through light-trap insect images using improved CornerNet[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(7): 183-189. (in Chinese with English abstract) doi: 10.11975/j.issn.1002-6819.2021.07.022

    [27] 朱德利,文瑞,熊俊逸. 融合坐标注意力机制的轻量级玉米花丝检测[J]. 农业工程学报,2023,39(3):145-153.

    ZHU Deli, WEN Rui, XIONG Junyi. Lightweight corn silk detection network incorporating with coordinate attention mechanism[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2023, 39(3): 145-153. (in Chinese with English abstract)

    [28] 胡广锐,周建国,陈超,等. 融合轻量化网络与注意力机制的果园环境下苹果检测方法[J]. 农业工程学报,2022,38(19):131-142.

    HU Guangrui, ZHOU Jianguo, CHEN Chao, et al. Fusion of the lightweight network and visual attention mechanism to detect apples in orchard environment[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(19): 131-142. (in Chinese with English abstract)

    [29] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision (ECCV), Munich, 2018: 3-19.

    [30] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 7132-7141.

出版历程
  • 收稿日期:  2023-04-16
  • 修回日期:  2023-10-25
  • 网络出版日期:  2024-03-18
  • 刊出日期:  2024-01-30
