Abstract:
Cucumbers grow in unstructured environments, where ripe fruits must be picked selectively. Manual harvesting, however, is costly and labor-intensive, and cucumber-picking robots are expected to reduce the manpower required in modern agriculture. The vision system dominates the accurate and rapid recognition of the fruit, and thus the picking efficiency of the robot. This study aims to achieve efficient selective picking of cucumber fruits under complex environments, such as changing illumination. Taking the cucumber as the research object, an RT-Detr-EV model was proposed with RT-Detr as the baseline network. Firstly, a Re-parameterization VGG (RepVGG) module was added to the backbone network. A multi-branch structure was adopted during training to strengthen the feature extraction of the network for high recognition accuracy, while the branches were merged into a single path during inference, reducing the complexity and computation of the network for better inference performance. Secondly, a lightweight cascaded group self-attention module was added to the neck network to reduce the computational overhead, so that a high detection speed was retained while the depth of the network was increased. Finally, the Minimum Point Distance based Intersection over Union (MPDIoU) replaced the loss function of the original model. The minimum point distance between the predicted and ground-truth boxes was considered in the regression loss of the target frame, accelerating the convergence of the model and improving the detection accuracy. The results show that the mean average precision and detection speed of the improved RT-Detr-EV reached 95.8% and 61.3 frames/s, respectively, exceeding the original model by 3.2 percentage points and 17.4 frames/s, respectively.
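The training-time multi-branch structure and its inference-time merging can be illustrated with a minimal NumPy sketch of the RepVGG re-parameterization idea (an assumption-laden simplification: the real module also folds batch normalization into the kernels, which is omitted here):

```python
import numpy as np

def merge_repvgg_branches(k3, b3, k1, b1, channels):
    """Fold RepVGG-style training branches (3x3 conv + 1x1 conv + identity)
    into one equivalent 3x3 kernel and bias for inference.

    Simplified sketch: assumes stride 1, in_channels == out_channels,
    and no batch normalization (the published method folds BN as well).
    Kernels are laid out as (out_ch, in_ch, kh, kw).
    """
    # Pad the 1x1 kernel to 3x3 by placing its weight at the center tap.
    k1_padded = np.zeros_like(k3)
    k1_padded[:, :, 1, 1] = k1[:, :, 0, 0]
    # The identity branch is equivalent to a 3x3 kernel with a 1 at the
    # center of each channel's own feature map.
    k_id = np.zeros_like(k3)
    for c in range(channels):
        k_id[c, c, 1, 1] = 1.0
    # Convolution is linear, so the three branches sum into one kernel.
    return k3 + k1_padded + k_id, b3 + b1
```

Because convolution is linear, the merged single-branch kernel produces exactly the same output as the sum of the three branches, which is how the multi-branch accuracy benefit is kept while the inference cost drops to that of one 3x3 convolution.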
The accuracy of identifying cucumbers unsuitable for picking increased by 4.6 and 6.5 percentage points over YOLOv7-X and YOLOv8-l, respectively, while the detection speed improved by 40.6 and 25 frames/s, and the number of parameters was reduced by 55.5% and 27.3%, respectively. Meanwhile, multi-scene application tests show that RT-Detr-EV maintained high detection accuracy under different lighting angles. Under front lighting, the improved model outperformed the YOLOv8 and RT-Detr models by 1.2 and 2.8 percentage points, respectively, and under backlighting by 4.5 and 5.5 percentage points, respectively. When the exposure level of the picking scene varied within 40%-160%, the mAP50 of the improved model changed by no more than 0.2 and 0.5 percentage points under the front-lighting and backlighting conditions, respectively, a variation much smaller than that of YOLOv8 and RT-Detr across the various picking scenes with different lighting angles. Therefore, the improved model shows better robustness and generalization under complex picking scenarios with multiple changes in lighting conditions. In conclusion, the RT-Detr-EV network model was verified to deliver better performance indices in the target detection task during fruit picking under complex growth environments. The findings can also provide a valuable reference for target localization in selective picking robots.
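The MPDIoU regression term used to replace the original loss penalizes the squared distances between corresponding corners of the predicted and ground-truth boxes, normalized by the squared image diagonal. A minimal sketch (box coordinates as (x1, y1, x2, y2); the coordinate values below are illustrative, not from the paper) might be:

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """Minimum Point Distance IoU between two axis-aligned boxes.

    MPDIoU = IoU - d_tl^2 / (w^2 + h^2) - d_br^2 / (w^2 + h^2),
    where d_tl and d_br are the distances between the top-left and
    bottom-right corners, and (w, h) is the input image size.
    The box regression loss is then 1 - MPDIoU.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Standard IoU of the two boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared corner distances, normalized by the squared image diagonal.
    d_tl = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d_br = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou - d_tl / norm - d_br / norm
```

An exact match scores 1.0 and misaligned boxes are penalized even when they do not overlap, which is what gives the gradient signal credited here with faster convergence.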