Tomato detection method using domain adaptive learning for dense planting environments
Graphical Abstract
Abstract
This study aimed to address the challenge of accurately and reliably detecting tomatoes in dense planting environments, a critical prerequisite for automated robotic harvesting. However, the heavy reliance of deep learning models on large manually annotated training datasets still limits their application in real-world agricultural production. To overcome these limitations, we combined a domain adaptive learning approach with the YOLOv5 model to develop a novel tomato detection model called TDA-YOLO (tomato detection domain adaptation). We designated normal illumination scenes in dense planting environments as the source domain and various other illumination scenes as the target domain. To construct a bridge between the source and target domains, Neural Preset for color style transfer was introduced to generate a pseudo-dataset that mitigates the domain discrepancy. Furthermore, semi-supervised learning was incorporated so that the model could extract domain-invariant features more fully, and knowledge distillation was used to improve the model's ability to adapt to the target domain. Additionally, to increase inference speed and reduce computational demand, the lightweight FasterNet network was integrated into YOLOv5's C3 module, creating a modified C3_Faster module. The experimental results demonstrated that the proposed TDA-YOLO model significantly outperformed the original YOLOv5s model, achieving a mAP (mean average precision) of 96.80% for tomato detection across diverse scenarios in dense planting environments, an increase of 7.19 percentage points; it also exceeded the latest YOLOv8 and YOLOv9 by 2.17 and 1.19 percentage points, respectively. The model's average detection time was 15 ms per image, with a computational cost of 13.8 GFLOPs (floating point operations). After acceleration, the TDA-YOLO model achieved a detection accuracy of 90.95% and a mAP of 91.35% on a Jetson Xavier NX development board, with a detection time of 21 ms per image, which still meets the requirements for real-time tomato detection in dense planting environments. These results show that the proposed TDA-YOLO model can detect tomatoes in dense planting environments accurately and quickly while avoiding the need for large amounts of annotated data, providing technical support for the development of automatic harvesting systems for tomatoes and other fruits.
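To illustrate the lightweight C3_Faster module mentioned in the abstract, the PyTorch sketch below replaces the bottlenecks of a C3-style block with FasterNet-style blocks built on partial convolution. It is an illustrative reconstruction under stated assumptions, not the authors' released code; the class names (PConv, FasterBlock, C3_Faster) and hyperparameters such as the 0.25 partial-convolution ratio are assumptions made for the example.

import torch
import torch.nn as nn


class PConv(nn.Module):
    # FasterNet partial convolution: convolve only a fraction of the
    # channels with a 3x3 kernel and pass the remaining channels through.
    def __init__(self, channels: int, ratio: float = 0.25):
        super().__init__()
        self.conv_ch = int(channels * ratio)      # channels that are convolved
        self.pass_ch = channels - self.conv_ch    # channels passed through untouched
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, 3, 1, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.conv_ch, self.pass_ch], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)


class FasterBlock(nn.Module):
    # Partial convolution followed by a 1x1 expand/contract MLP, with a residual path.
    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.pconv = PConv(channels)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.mlp(self.pconv(x))


class C3_Faster(nn.Module):
    # C3-style layout: two parallel 1x1 branches, FasterNet blocks on one branch,
    # then concatenation and a fusing 1x1 convolution.
    def __init__(self, c_in: int, c_out: int, n: int = 1):
        super().__init__()
        c_hidden = c_out // 2
        self.cv1 = nn.Conv2d(c_in, c_hidden, 1, bias=False)
        self.cv2 = nn.Conv2d(c_in, c_hidden, 1, bias=False)
        self.blocks = nn.Sequential(*(FasterBlock(c_hidden) for _ in range(n)))
        self.cv3 = nn.Conv2d(2 * c_hidden, c_out, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.cv3(torch.cat((self.blocks(self.cv1(x)), self.cv2(x)), dim=1))


if __name__ == "__main__":
    # Shape check on a dummy feature map.
    y = C3_Faster(128, 128, n=2)(torch.randn(1, 128, 40, 40))
    print(y.shape)  # torch.Size([1, 128, 40, 40])

Because the partial convolution processes only a subset of channels and leaves the rest untouched, such a block reduces floating point operations and memory access relative to a standard 3x3 convolution, which is consistent with the reduced FLOPs and faster inference reported for TDA-YOLO.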