Abstract:
Ginseng is regarded as one of the most valued herbs in traditional Chinese medicine. Its root has a long history of use as an herbal medicine with multiple medicinal values; thus, monitoring and managing ginseng quality is crucial. In this study, a lightweight CGC-YOLOv8 model was introduced into the detection system to classify and assess the appearance quality of ginseng. The ginseng was harvested in Fusong County, Jilin Province, China, and samples from the current year were selected for the experiments. Data augmentation techniques were used to simulate a variety of typical environments. Starting from the YOLOv8 architecture, the regular convolutional layers in the backbone were replaced with two conditional convolution (CondConv) layers, which highlight the appearance features of the input ginseng and improve the accuracy and efficiency of feature extraction. In addition, a slim-neck combination (GSConv + VoVGSCSP) was applied in the neck structure to lighten the model, and a Coordinate Attention (CA) mechanism was incorporated into the detection head. These changes improved performance without increasing the computational burden, which makes the model particularly suitable for resource-limited devices. The CGC-YOLOv8 configuration was then refined through ablation experiments. The resulting model achieved a precision of 85.70%, a recall of 91.23%, a mean average precision at an IoU threshold of 0.50 (mAP50) of 94.69%, and a mean average precision over IoU thresholds from 0.50 to 0.95 (mAP50-95) of 72.43%. Compared with YOLOv8n, CGC-YOLOv8 improved precision by 2.74, recall by 4.64, mAP50 by 3.71, and mAP50-95 by 4.09 percentage points, respectively. The improved model also outperformed the original when detecting images that were not used in training. In addition, the number of parameters in CGC-YOLOv8 was 6.39% lower than in the original model. The weight file was only 6.0 MB, which meets the lightweight requirement and makes the model easy to deploy on mobile devices.
A series of experiments was conducted to compare the effects of different attention mechanisms (CBAM, EMA, SE, SimAM, and CA) on the improved YOLOv8n model. The placement of the CA mechanism was also explored across different feature layers (before SPPF, after SPPF, the upsampling P4 layer, and the small-object detection layers P3, P4, and P5). The results demonstrated that the CA mechanism performed best when applied to the P5 layer, where the recall and mean average precision of the improved model were significantly enhanced. The CGC-YOLOv8 model also performed better than SSD, EfficientDet, and YOLO-series models. Compared with these models, the precision of CGC-YOLOv8 was higher by 0.43% to 48.6%, the recall by 5.70% to 35.86%, the mAP50 by 3.29% to 57.21%, and the mAP50-95 by 13.06% to 60.43%. Therefore, the CGC-YOLOv8 model significantly outperformed both the original model and the other compared models in terms of precision, recall, and mean average precision (mAP). The CGC-YOLOv8 model can be expected to serve as a highly effective solution for the intelligent detection of ginseng quality, and the findings can provide solid technical support for further advancements in the field.
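The detection metrics reported in this abstract can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2), a simple greedy one-to-one matching of predictions to ground-truth boxes, and the commonly used convention that mAP50-95 averages results over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The function names `iou`, `precision_recall`, and `map_thresholds` are hypothetical and chosen for illustration; this is not the evaluation code used in the study.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: Sequence[Box], truths: Sequence[Box], thr: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall at one IoU threshold.

    Assumes `preds` is already sorted by descending confidence. Each
    prediction is matched greedily to the unmatched ground-truth box
    with the highest IoU; it counts as a true positive if that IoU
    reaches `thr`.
    """
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


def map_thresholds() -> List[float]:
    """IoU thresholds 0.50, 0.55, ..., 0.95 (assumed mAP50-95 convention)."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]
```

A stricter threshold turns loosely localized detections into false positives, which is why mAP50-95 is typically lower than mAP50 for the same model.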