Abstract:
Wine grapes are among the most economically significant crops, and the precise identification of wine grape varieties contributes greatly to effective vineyard management and to the quality of the wine industry. Existing identification methods, however, still adapt poorly to new varieties, largely because labeled data are scarce and costly to obtain. In this study, a two-phase variety identification method based on few-shot learning was proposed, consisting of a foreground extraction phase and a meta-learning phase. First, a Deeplabv3+ semantic segmentation model was developed to extract the foreground leaves precisely and thereby mitigate the impact of complex field backgrounds. The segmented images were then post-processed: image cropping enlarged the pixel area occupied by the leaves, providing high-quality inputs for the subsequent model. Second, metric-based meta-learning was used to recognize varieties by evaluating the similarity between samples in the support set and samples in the query set. A Mobile-CS architecture was employed as the backbone network: the original MobileNetV2 structure was lightened by removing bottleneck blocks, and the CBAM attention mechanism was integrated. The resulting model identified varieties precisely under sample-limited conditions and adapted rapidly to new variety-identification tasks. An image dataset of wine grape leaves covering 30 varieties, with a total of 5,908 raw field images, was constructed. The experimental results demonstrate that the Deeplabv3+ model achieved a mean intersection over union of 97.52% and a pixel accuracy of 98.98% for precise leaf segmentation. With limited data, the two-stage model achieved an average accuracy of 62.27% on the 5-way 1-shot task and 80.06% on the 5-way 5-shot task, outperforming the other few-shot learning methods compared.
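The metric meta-learning step described above can be illustrated with a minimal, prototypical-network-style sketch. The embedding values, variety names, and the `prototype`/`classify` helpers below are illustrative stand-ins for exposition only; the paper's actual Mobile-CS features and episode construction are not reproduced here.

```python
import math

def prototype(support_embeddings):
    """Class prototype: the mean of that class's support embeddings."""
    dim = len(support_embeddings[0])
    n = len(support_embeddings)
    return [sum(e[d] for e in support_embeddings) / n for d in range(dim)]

def euclidean(a, b):
    """Euclidean distance, one common choice of metric in metric meta-learning."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(query, prototypes):
    """Assign the query sample to the class whose prototype is nearest."""
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))

# Toy 5-way 1-shot episode with hand-picked 2-D embeddings (illustrative values only).
support = {
    "cabernet":   [[0.9, 0.1]],
    "merlot":     [[0.1, 0.9]],
    "syrah":      [[0.5, 0.5]],
    "riesling":   [[0.0, 0.0]],
    "chardonnay": [[1.0, 1.0]],
}
protos = {label: prototype(embs) for label, embs in support.items()}
print(classify([0.85, 0.15], protos))  # -> cabernet (nearest prototype)
```

In a 5-shot episode, each class would contribute five support embeddings, so the prototype averages over more samples and the decision becomes more robust to outliers.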
Furthermore, the backbone network performed best while using fewer parameters than classical convolutional neural networks. Removing the interference of complex backgrounds also benefited few-shot learning: on the 5-way 1-shot task, accuracy on the dataset after the foreground extraction stage improved by 11.83 percentage points over the original dataset. Ablation experiments verified the improved model, showing that performance was enhanced both by the lightweight treatment of the original MobileNetV2 structure and by fusing the attention mechanism. Generalizability was then verified on a publicly available dataset, Leafsnap, where the 1-shot and 5-shot accuracies reached 74.21% and 87.60%, respectively, indicating strong generalization. Finally, t-SNE visualization was used to evaluate the improved model qualitatively: features extracted by Mobile-CS were better separated in the low-dimensional space. The two-stage variety identification method thus combines high recognition accuracy with strong generalization and can provide a potential solution for intelligent identification in agriculture. Future research can combine the model with practical equipment for testing and optimization, promoting the practical application of this technology in agricultural production.
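The qualitative t-SNE check mentioned above can be sketched with scikit-learn. The random feature vectors below are stand-ins for Mobile-CS embeddings of leaf images; nothing here comes from the paper's data, and the cluster means and dimensions are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in features: 3 "varieties", 10 samples each, 64-dimensional embeddings,
# drawn around different means so they form separable clusters.
features = np.concatenate(
    [rng.normal(loc=c, scale=0.3, size=(10, 64)) for c in (0.0, 1.0, 2.0)]
)

# Project to 2-D for visual inspection; perplexity must be below the sample count.
points_2d = TSNE(n_components=2, perplexity=5, init="random",
                 random_state=0).fit_transform(features)
print(points_2d.shape)  # (30, 2)
```

Well-separated clusters in the 2-D scatter of `points_2d` would indicate, qualitatively, that the backbone's embeddings discriminate between varieties.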