Abstract:
As a significant economic crop, wine grapes require precise variety identification for effective vineyard management and for ensuring quality in the wine industry. To address the challenges of high demand for labeled data, substantial annotation costs, and poor adaptability to new varieties in the variety identification process, this study proposes a two-phase variety identification method based on few-shot learning, comprising a foreground extraction phase and a meta-learning phase. First, to mitigate the impact of complex backgrounds on the few-shot learning model, a Deeplabv3+ semantic segmentation model was developed, and the segmented images were post-processed (specifically, cropped to enlarge the pixel area occupied by the leaves) so that the foreground leaves could be extracted precisely, providing high-quality image inputs for the subsequent model. Second, a metric-based meta-learning phase identifies a variety by measuring the similarity between samples in the support set and samples in the query set; this phase employs Mobile-CS as the backbone network. Mobile-CS is an enhancement of the MobileNetV2 architecture that lightens the original network by removing a bottleneck structure and integrates the CBAM attention mechanism, enabling precise identification of varieties under sample-limited conditions and rapid adaptation to new variety identification tasks. This study constructed a wine grape leaf image dataset containing 30 varieties, with a total of 5,908 field-collected raw images used in the experiments. The experimental findings show that the Deeplabv3+ model attains a mean intersection over union of 97.52% and a pixel accuracy of 98.98% for leaf segmentation, demonstrating its capacity for precise leaf segmentation.
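The metric-based meta-learning step described above can be illustrated with a minimal sketch: class prototypes are averaged from support-set embeddings, and a query is assigned to the class whose prototype it is most similar to. This is only an illustration of the general metric-based idea, not the paper's implementation; the embedding vectors are assumed to come from a backbone such as Mobile-CS (not implemented here), and cosine similarity is one common choice of metric.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def prototypes(support):
    # support: {class_label: [embedding, ...]}; prototype = mean embedding.
    protos = {}
    for label, embs in support.items():
        dim = len(embs[0])
        protos[label] = [sum(e[i] for e in embs) / len(embs) for i in range(dim)]
    return protos

def classify(query, support):
    # Assign the query to the class with the most similar prototype.
    protos = prototypes(support)
    return max(protos, key=lambda label: cosine(query, protos[label]))

# Toy 3-way 1-shot episode with hand-made 2-D "embeddings"
# (variety names here are placeholders, not the dataset's labels).
support = {"merlot": [[1.0, 0.0]], "syrah": [[0.0, 1.0]], "gamay": [[-1.0, 0.0]]}
print(classify([0.9, 0.1], support))  # prints: merlot
```

Because only the similarity computation depends on the episode, such a model can be applied to a new set of varieties without retraining, which is the adaptability property the abstract highlights.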
In experiments with limited data samples, the two-stage method proposed in this study attains an average accuracy of 62.27% on the 5-way 1-shot task and 80.06% on the 5-way 5-shot task, outperforming other few-shot learning methods. Furthermore, the backbone network constructed in this study achieves better performance with fewer parameters than classical convolutional neural networks. The study also verified the interference of complex backgrounds with few-shot learning models: on the 5-way 1-shot task, accuracy on the dataset processed by the foreground extraction stage improved by 11.83 percentage points over the original dataset. Ablation experiments further demonstrated that both the lightweight treatment of the original MobileNetV2 structure and the fusion of the attention mechanism improve model performance. To assess generalizability, this study employed a publicly available dataset, Leafsnap, for external validation; the trained models achieved 1-shot and 5-shot accuracies of 74.21% and 87.60%, respectively, on Leafsnap, substantiating their strong generalizability. Finally, the model was qualitatively evaluated using t-SNE visualization, and the features extracted by Mobile-CS were well separated in the low-dimensional space. The two-stage variety identification method employed in this research shows high recognition accuracy and strong generalization ability, suggesting that it can provide a new solution for intelligent identification technology in agriculture. Future research will combine this method with practical equipment for model testing and optimization, so as to promote the practical application of the technology in agricultural production.