Citrus recognition in complex environments based on the improved YOLOv11n


• Abstract: To address the problems that citrus fruits in complex environments are underexposed and occluded by branches and leaves, and that existing detection methods struggle to identify them accurately, a citrus detection model for complex environments based on an improved YOLOv11n is proposed. An inverted residual-attention shiftwise convolution (IRSC) module was designed and used to improve the C3k2 (cross-stage partial with kernel-size 2) module of the original backbone network, improving the model's detection performance. To ease the difficulty of extracting features from low-light images, the low-light enhancement network Retinexformer (one-stage Retinex-based transformer) was used for image enhancement, strengthening the model's feature extraction capability. Finally, the ADown downsampling module was used to replace some of the standard convolutions, reducing model complexity. Experimental results show that the improved YOLOv11n model achieves an mAP@0.5 of 87.1% and a recall of 79.1% for citrus detection in complex environments, improvements of 1.9 and 3.0 percentage points over the original model, while the number of parameters and the model size are reduced by 8.5% and 4.0%, respectively. The proposed method effectively improves citrus detection accuracy in complex environments and provides a technical reference for the automated management and picking of citrus.

       

Abstract: To address the problems that citrus fruits in complex environments are often underexposed and occluded by branches and leaves, and that existing detection methods struggle to identify them accurately, a citrus detection model for complex environments based on an improved YOLOv11n is proposed. The objective of this study is to enhance detection robustness and accuracy under challenging orchard conditions, including low-light scenes, heavy occlusion, and cluttered backgrounds. In this study, an IRSC (inverted residual-attention shiftwise convolution) module was designed and used to improve the C3k2 (cross-stage partial with kernel-size 2) module of the original backbone network. The IRSC module adopts a spatial weighting mechanism with a Gaussian bias as a morphological prior to strengthen the feature response of key regions, guiding attention toward the target center and exploiting the roughly circular shape of citrus targets. By emphasizing these important regions, the model can better focus on the critical parts of citrus fruits even when they are occluded by branches and leaves. Meanwhile, the IRSC module incorporates an inverted residual design that expands the receptive field of weak features. This enables the model to capture more contextual information, which is particularly beneficial for detecting small targets and weak features in low-light conditions. Moreover, it breaks through the local receptive field limitation of small convolution kernels and, without modifying the network architecture, effectively approximates the global context modeling of large-kernel convolution. This not only improves the model's detection performance but also preserves the original network structure, making the improvement more efficient and targeted. To address the difficulty of extracting features from low-light images, the low-light enhancement network Retinexformer (one-stage Retinex-based transformer) is used for image enhancement. Retinexformer has a strong multi-scale illumination decomposition capability that enables end-to-end enhancement of dark areas. By decomposing illumination at multiple scales, it accurately brightens the dark regions where citrus fruits may be located. This significantly improves the visibility of citrus in low-light areas, makes the fruits more distinguishable from the background, and greatly strengthens the model's feature extraction. Under underexposure, the low contrast, indistinct contours, and uneven illumination of orchard images often prevent object detection algorithms from extracting reliable citrus features. Introducing this enhancement into the YOLOv11n pipeline effectively improves image contrast and illuminance while reducing noise, which leads to better feature extraction of citrus fruits in orchards and improves the accuracy of the detector. Furthermore, the ADown downsampling module was employed to replace some of the standard convolution layers. The ADown module is built around multi-branch parallel processing and feature fusion, allowing features from different branches to be processed simultaneously and then fused effectively. As a result, model complexity and the number of parameters are reduced while accuracy is maintained.
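The abstract describes the IRSC block only at a conceptual level (an inverted residual path plus spatial attention biased by a Gaussian morphological prior), so the following PyTorch sketch is a minimal illustration of those two ideas rather than the authors' implementation. The class name, expansion ratio, kernel sizes, and the use of a plain depthwise convolution in place of the shiftwise large-kernel emulation are all assumptions.

```python
# Minimal sketch of the ideas named for the IRSC block: an inverted residual
# path whose spatial attention logits are biased by a fixed, centre-peaked
# Gaussian map (a crude morphological prior for round citrus fruits).
# Everything below is an assumption, not the paper's actual implementation.
import torch
import torch.nn as nn


def gaussian_prior(h: int, w: int, sigma: float = 0.3) -> torch.Tensor:
    """Centre-peaked Gaussian bias map of shape (1, 1, h, w)."""
    ys = torch.linspace(-1.0, 1.0, h).view(h, 1).expand(h, w)
    xs = torch.linspace(-1.0, 1.0, w).view(1, w).expand(h, w)
    return torch.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2)).reshape(1, 1, h, w)


class IRSCSketch(nn.Module):
    """Illustrative stand-in for the described IRSC block (hypothetical)."""

    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.expand = nn.Sequential(                       # pointwise expansion
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.SiLU())
        self.depthwise = nn.Sequential(                    # stands in for the shiftwise conv
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.SiLU())
        self.project = nn.Sequential(                      # pointwise projection
            nn.Conv2d(hidden, channels, 1, bias=False),
            nn.BatchNorm2d(channels))
        self.spatial_logits = nn.Conv2d(hidden, 1, 7, padding=3)  # 1-channel attention logits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.depthwise(self.expand(x))
        # Morphological prior: bias attention toward the centre of the window.
        prior = gaussian_prior(h.shape[-2], h.shape[-1]).to(h.device, h.dtype)
        attn = torch.sigmoid(self.spatial_logits(h) + prior)
        return x + self.project(h * attn)                  # inverted residual skip
```

Because `IRSCSketch(c)` maps a tensor of shape (B, c, H, W) to the same shape, a block of this kind could replace a bottleneck inside C3k2 without changing the surrounding tensor dimensions, which is consistent with the claim that the network architecture itself is left unmodified.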
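Retinexformer is an existing published enhancement network, and the abstract does not specify whether enhancement is applied offline to the dataset or online in front of the detector. The sketch below only shows where such a stage sits in an inference pipeline: `enhance_low_light` is a hypothetical stand-in (simple gamma brightening) rather than actual Retinexformer inference code, and `best.pt` is a placeholder path for the improved-YOLOv11n weights.

```python
# Where a low-light enhancement stage sits relative to the detector.
# The enhancement function here is a gamma-correction placeholder, NOT
# Retinexformer itself; weight paths are placeholders as well.
import cv2
import numpy as np
from ultralytics import YOLO  # assumes the Ultralytics package that ships YOLO11


def enhance_low_light(image_bgr: np.ndarray, gamma: float = 0.6) -> np.ndarray:
    """Hypothetical stand-in for Retinexformer: brighten dark regions."""
    norm = image_bgr.astype(np.float32) / 255.0
    return np.clip((norm ** gamma) * 255.0, 0, 255).astype(np.uint8)


def detect_citrus(image_path: str, weights: str = "best.pt"):
    """Enhance an underexposed orchard image, then run the detector on it."""
    image = cv2.imread(image_path)             # BGR image from disk
    enhanced = enhance_low_light(image)        # enhancement stage (placeholder)
    model = YOLO(weights)                      # improved-YOLOv11n weights (placeholder)
    return model.predict(enhanced, conf=0.25)  # standard Ultralytics inference
```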
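ADown is the lightweight downsampling block popularized by YOLOv9, and the layout below follows that publicly documented design (average pooling, a channel split, a strided 3×3 convolution branch and a max-pool plus 1×1 convolution branch, concatenated). The abstract does not give the exact configuration used in the improved network, so treat this as an indicative sketch of the multi-branch downsampling idea rather than the paper's configuration.

```python
# Sketch of an ADown-style downsampling block: two parallel branches over a
# channel split, fused by concatenation, halving the spatial resolution with
# fewer parameters than a full strided 3x3 convolution over all channels.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_act(c_in: int, c_out: int, k: int, s: int, p: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, s, p, bias=False),
        nn.BatchNorm2d(c_out), nn.SiLU())


class ADownSketch(nn.Module):
    def __init__(self, c1: int, c2: int):
        super().__init__()
        half = c2 // 2
        self.branch_conv = conv_bn_act(c1 // 2, half, 3, 2, 1)  # strided 3x3 branch
        self.branch_pool = conv_bn_act(c1 // 2, half, 1, 1, 0)  # 1x1 conv after max-pool

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.avg_pool2d(x, kernel_size=2, stride=1, padding=0)  # light pre-smoothing
        x1, x2 = x.chunk(2, dim=1)                               # channel split
        x1 = self.branch_conv(x1)                                # conv branch
        x2 = self.branch_pool(F.max_pool2d(x2, 3, 2, 1))         # pool branch
        return torch.cat((x1, x2), dim=1)                        # feature fusion
```

For example, `ADownSketch(128, 128)` maps an input of shape (1, 128, 80, 80) to (1, 128, 40, 40); because each branch only sees half of the channels, its parameter count stays well below that of a single 3×3 stride-2 convolution over all 128 channels, which matches the reported reduction in model complexity.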
Experimental results on a complex orchard citrus dataset show that the improved YOLOv11n model reaches an mAP@0.5 of 87.1% and a recall of 79.1%. Compared with the original model, mAP@0.5 and recall increase by 1.9 and 3.0 percentage points, respectively. Additionally, the number of parameters and the model size are reduced by 8.5% and 4.0%, respectively. Ablation studies demonstrate that each module contributes to the overall improvement: the Retinexformer module alone increases mAP@0.5 by 1.1 percentage points, the C3k2-IRSC module contributes an additional 0.6 percentage points, and the combination of Retinexformer and C3k2-IRSC achieves an improvement of 1.5 percentage points. The ADown module reduces parameters by 17.4% when used alone, while the final model with all three modules achieves an 8.5% parameter reduction. The three modules form an effective collaborative mechanism: Retinexformer provides standardized input through image enhancement, IRSC extracts global contextual features from the enhanced input, and ADown balances feature preservation and compression while controlling parameter growth. The combined model achieves the best balance between accuracy and efficiency, with the collaborative mechanism providing synergistic gains beyond the sum of the individual module contributions. This collaborative design enables the detector to cope with underexposure, heavy occlusion, and complex backgrounds in orchards within a unified framework. Compared with mainstream object detection models, including Faster R-CNN, SSD, EfficientDet, YOLOv5n, YOLOv8n, YOLOv10n, YOLOv12n, and RT-DETR-l, the proposed model attains an mAP@0.5 that is 1.8 to 25.6 percentage points higher, with parameter reductions ranging from 5.6% to 92.6%, indicating clear advantages in both detection accuracy and model compactness. The method successfully addresses the key challenges of low-light conditions, occlusion, and complex backgrounds that commonly occur in real-world citrus orchards. It provides a valuable technical reference for the automated management and picking of citrus, paving the way for more efficient and intelligent citrus production systems.

       
