Citation: Yi Shi, Li Junjie, Zhang Peng, Wang Dandan. Detecting and counting of spring-see citrus using YOLOv4 network model and recursive fusion of features[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(18): 161-169. DOI: 10.11975/j.issn.1002-6819.2021.18.019

    Detecting and counting of spring-see citrus using YOLOv4 network model and recursive fusion of features

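    • Abstract (translated from the Chinese): Spring-see citrus fruits are small, grow densely on a single tree, closely resemble one another in shape and color, and are often heavily occluded by leaves, all of which make detection and counting difficult. This study took spring-see citrus in a real orchard environment as the detection and counting target and proposed a YOLOv4 network model based on recursive fusion of features (FR-YOLOv4). To address the small size of the fruit, the backbone feature extraction network of FR-YOLOv4 adopts the CSPResNest50 network, which has a smaller receptive field, reducing the chance that the feature maps of small targets fail to reach the object detector. To address occlusion and dense distribution, a Recursive Feature Pyramid (RFP) network is used for recursive feature fusion, which improves detection accuracy for spring-see citrus in the orchard environment. Experimental results show that FR-YOLOv4 achieved an average detection accuracy of 94.6% for spring-see citrus in the orchard environment, with a video detection frame rate of 51 frames/s. Compared with YOLOv4, the Single Shot Multi-Box Detector (SSD), CenterNet, and Faster Region-Convolutional Neural Networks (Faster R-CNN), the average detection accuracy of FR-YOLOv4 was higher by 8.9, 29.3, 14.1, and 16.2 percentage points, respectively, and its video detection frame rate was 17 and 33 frames/s higher than that of SSD and Faster R-CNN, respectively. FR-YOLOv4 offers high detection accuracy and real-time performance for spring-see citrus in real orchard environments, and is suitable for detecting and counting spring-see citrus in spring-see orchards.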
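
    The comparison figures are reported as differences from FR-YOLOv4, so the baseline values they imply can be recovered by subtraction. The following is a minimal sketch of that arithmetic, assuming the reported differences are exact to the stated precision; the variable and function names are illustrative and do not come from the paper.

```python
# Reported FR-YOLOv4 results from the abstract.
FR_YOLOV4_ACCURACY = 94.6  # average detection accuracy, %
FR_YOLOV4_FPS = 51         # video detection frame rate, frames/s

# Reported accuracy gains of FR-YOLOv4 over each baseline (percentage points).
accuracy_gain = {
    "YOLOv4": 8.9,
    "SSD": 29.3,
    "CenterNet": 14.1,
    "Faster R-CNN": 16.2,
}

# Reported frame-rate gains of FR-YOLOv4 over each baseline (frames/s).
fps_gain = {"SSD": 17, "Faster R-CNN": 33}


def implied_baselines(reference, gains, ndigits=1):
    """Subtract each reported gain from the reference value.

    Rounding removes floating-point noise from the subtraction.
    """
    return {name: round(reference - gain, ndigits) for name, gain in gains.items()}


baseline_accuracy = implied_baselines(FR_YOLOV4_ACCURACY, accuracy_gain)
baseline_fps = implied_baselines(FR_YOLOV4_FPS, fps_gain, ndigits=0)

for name, value in baseline_accuracy.items():
    print(f"{name}: implied average detection accuracy {value}%")
for name, value in baseline_fps.items():
    print(f"{name}: implied frame rate {value} frames/s")
```

    These are derived values only; the abstract itself states the differences, not the baseline figures.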

       

      Abstract: Automatic fruit picking and intelligent estimation of orchard yield are active topics in smart agriculture, and deep-learning object detection has broad application prospects for detecting and counting spring-see citrus. The task remains challenging, however, because spring-see citrus fruits are small, grow densely on a single tree, closely resemble one another in shape and color, and tend to be heavily occluded by foliage. In this study, a YOLOv4 network model using a recursive fusion of features (FR-YOLOv4) was proposed for spring-see citrus detection. To handle the small size of the fruit, the backbone feature extraction network of the original YOLOv4 model was replaced with the CSPResNext50 network, whose smaller receptive field is better suited to small targets. This replacement made it less likely that the feature maps of small objects fail to reach the object detector, and so improved detection accuracy for small spring-see citrus. In addition, because spring-see citrus are often occluded and densely distributed, the Path Aggregation Network (PANet) of the original YOLOv4 was replaced with a Recursive Feature Pyramid (RFP) network. The RFP network enhanced the feature extraction and representation capabilities of the whole network at a small computational cost, which improved detection accuracy for spring-see citrus in a real orchard environment. A dataset was built from images and videos of spring-see citrus captured in various environments in spring-see orchards, and three data augmentation operations (adding Gaussian noise, luminance variation, and rotation) were applied using OpenCV tools. The resulting dataset contained 3 600 images, of which 2 520 formed the training set and 1 080 the test set. On this test set, FR-YOLOv4 achieved an average detection accuracy of 94.6% for spring-see citrus in complex orchard environments, with a video detection frame rate of 51 frames/s. Compared with the original YOLOv4, the average detection accuracy was 8.9 percentage points higher, while the frame rate was only 6 frames/s lower. The average detection accuracy of FR-YOLOv4 was also 29.3, 14.1, and 16.2 percentage points higher than that of the Single Shot Multi-Box Detector (SSD), CenterNet, and Faster Region-Convolutional Neural Networks (Faster R-CNN), respectively, and its frame rate was 17 and 33 frames/s higher than that of SSD and Faster R-CNN, respectively. FR-YOLOv4 therefore combines high detection accuracy with real-time performance and is suitable for detecting and counting spring-see citrus in complex orchard environments.
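
      The reported dataset sizes correspond to a 70/30 split of the 3 600 images. The abstract does not say how images were assigned to each subset, so the sketch below assumes a seeded random shuffle purely for illustration; the file names, the seed, and the `split_dataset` helper are hypothetical.

```python
import random

# Dataset sizes reported in the abstract.
TOTAL_IMAGES = 3600
TRAIN_IMAGES = 2520
TEST_IMAGES = 1080


def split_dataset(items, n_train, seed=0):
    """Shuffle a copy of `items` and return (train, test).

    Assumption: a random split. The abstract does not state how the
    authors actually partitioned the images.
    """
    if not 0 <= n_train <= len(items):
        raise ValueError("n_train must be between 0 and len(items)")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the real images.
image_ids = [f"img_{i:04d}.jpg" for i in range(TOTAL_IMAGES)]
train, test = split_dataset(image_ids, TRAIN_IMAGES)

train_fraction = len(train) / TOTAL_IMAGES
print(f"train: {len(train)} images, test: {len(test)} images")
print(f"training fraction: {train_fraction:.0%}")
```

      Fixing the seed keeps the partition reproducible, which matters when the same test set is reused to compare several detectors.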

       
