Hao Jianjun, Bing Zhenkai, Yang Shuhua, Yang Jie, Sun Lei. Detection of green walnut by improved YOLOv3[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(14): 183-190. DOI: 10.11975/j.issn.1002-6819.2022.14.021

    Detection of green walnut by improved YOLOv3

    • Abstract: This study aimed to detect green walnuts in the natural environment using machine vision, so that yield can be estimated and orchards managed intelligently on the basis of fruiting detection. Detecting green-skinned walnuts remains challenging because the growing environment is complex, the color difference between the fruit and the leaves is small, and the fruit itself is small. An optimization strategy was therefore proposed to address these challenges, so that walnut yield can be determined accurately from images for harvest planning. Deep learning was used for object detection of green walnuts, and the YOLOv3 network was selected to meet the practical requirement of real-time detection. Ablation experiments were performed on an improved YOLOv3 object detection algorithm, and quantitative and qualitative analyses were used to verify the model under each optimization strategy. Pre-trained models from the ImageNet DET and COCO datasets were compared for green walnut detection. The mean Average Precision (mAP) was 83.55% when the YOLOv3 network was trained from the ImageNet DET pre-trained model, and 86.11% when it was trained from the COCO pre-trained model, so COCO pre-training gave better detection for this single-category task. The ablation experiments showed that data augmentation based on the dataset features improved green walnut detection, raising the mAP of the model by 3.81 percentage points. Mixup data augmentation was then added to increase the complexity of the image data, raising the mAP by 1.24 percentage points.
K-means clustering was applied to the annotation boxes to determine the anchor sizes for the YOLOv3 model, and the ablation experiments showed that using the clustered anchors improved the mAP of the model. The lightweight MobileNet-v3 network was selected as the backbone of the detector so that green walnut detection can be extended to mobile terminals, and the improved model was quantized to reduce its complexity and increase detection speed. YOLOv3-MobileNet-v3 was compared with the YOLOv3-DarkNet-53, YOLOv3-ResNet-50, and Faster RCNN-ResNet-50 object detection networks. The MobileNet-v3 model had the smallest size (88.6M), the highest mAP (94.52%), and the fastest detection speed (31 frames/s), giving the best performance on green-skinned walnut detection. The model thus maintained high accuracy and detection speed at a small model size, and remained robust on small walnut targets. The findings can provide technical support for yield estimation and intelligent management in walnut orchards. The following guidance was also drawn for detecting small fruit whose color is close to the background. Pre-training on the COCO dataset is recommended for single-category object detection tasks. Detection accuracy depended mainly on data augmentation based on the dataset features and on Mixup data augmentation. When the data features are concentrated, the multi-scale anchors should be adjusted and their number reduced. The MobileNet-v3 backbone performed excellently as the detection backbone. Finally, labelling the prediction boxes on the data allows the prediction error of detection to be analyzed quantitatively.
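The anchor clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the distance metric, the number of clusters, or the implementation, so everything here is an assumption: the sketch uses the 1 - IoU distance that is conventional for YOLO anchor clustering, k = 9 (three anchors at each of three scales, as in the original YOLOv3), and hypothetical function names `iou_wh` and `kmeans_anchors`. Input boxes are (width, height) pairs taken from the annotation boxes.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors.

    Assumption: distance is 1 - IoU, so each box joins the anchor with
    which it has the highest IoU. `boxes` must be a list of (w, h) tuples.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialise from k distinct annotation boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda j: iou_wh(box, anchors[j]))
            clusters[best].append(box)
        new_anchors = []
        for j, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchors[j])  # keep the anchor of an empty cluster
        if new_anchors == anchors:  # assignments stopped changing
            break
        anchors = new_anchors
    # Sort by area so anchors can be assigned to scales from small to large.
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

Calling `kmeans_anchors(boxes, k=9)` on the labelled box sizes returns nine (width, height) anchors ordered by area. The mean of the best per-box IoU is a common way to judge how well a set of anchors fits a dataset.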