Evaluation of tea similarity based on deep metric learning
Abstract
Blending is widely used in refined tea processing to stabilize quality, and it allows export Mee tea to be standardized so that it fully meets national standards. During blending, experts first prepare a small sample according to the historical blending plan, adjusting the original proportions because the quality of the raw tea materials varies from year to year. Sensory evaluation is then used to judge the similarity between the blended sample and the standard sample. However, this procedure is highly subjective and lacks quantification, which hinders the standardization of tea quality. In this study, a systematic method based on Deep Metric Learning (DML) was proposed to evaluate the similarity among tea samples objectively and quantitatively. Standard samples of seven grades of Mee tea were used as the training set. Semi-finished teas were added to the standard samples in different proportions to construct test sets with different degrees of similarity. The semi-finished teas included Fujian tea, Zhejiang Xikou tea, and Huangshan fourth-grade broken tea. Hyperspectral data were then collected from the tea samples. Regions of Interest (ROIs) were selected to obtain the spectral features of the samples, and the spectra were preprocessed with Multiplicative Scatter Correction (MSC). Principal Component Analysis (PCA) was used to select the principal component images with the highest contribution rate, and texture features were extracted from them with the Gray-Level Co-occurrence Matrix (GLCM) to serve as the image features of the hyperspectral images. Spectral data, image data, and fused spectral-image data were used as model inputs. A deep feature extraction network, with a Convolutional Neural Network (CNN) as the feature extractor, was trained with triplet loss to obtain a distance feature space, and a Center Anchor Triplet Loss function was proposed. The network learned a mapping from the original data space to the distance feature space.
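The MSC preprocessing step mentioned above can be illustrated with a minimal NumPy sketch. It shows the textbook form of the correction (regress each spectrum on a reference spectrum, then remove the fitted offset and slope) and is not taken from the study itself. The function name `msc`, the choice of the mean spectrum as the default reference, and the array layout (one spectrum per row) are assumptions made for illustration.

```python
import numpy as np


def msc(spectra, reference=None):
    """Scatter-correct a (n_samples, n_bands) array of spectra.

    Hypothetical helper: each spectrum s is modelled as
    s ~ slope * reference + intercept, and the corrected spectrum is
    (s - intercept) / slope.
    """
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: use the mean spectrum as reference unless one is supplied.
    if reference is None:
        ref = spectra.mean(axis=0)
    else:
        ref = np.asarray(reference, dtype=float)

    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        # Least-squares line fit; polyfit returns [slope, intercept] for deg=1.
        slope, intercept = np.polyfit(ref, s, deg=1)
        corrected[i] = (s - intercept) / slope
    return corrected
```

In practice the reference would usually be computed from the training spectra and reused for the test spectra, so that both sets are corrected against the same baseline.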
In the distance feature space, samples of the same kind of tea were drawn as close together as possible, whereas samples of different kinds were pushed as far apart as possible. After the trained model generated the standard-sample features in the distance feature space, the mean feature of each grade of standard sample in the training set was taken as the benchmark for that grade. The deep metric learning model was then applied to the test set, and the Euclidean distance between each test-set feature and each benchmark feature was calculated. The similarity between tea samples was characterized by the distance between them in the feature space, which extended the qualitative judgment of similarity to a quantitative measurement. In addition to the input data type and the loss function, the output feature dimension of the network also strongly affected the accuracy of the distance measurement. The network structure was therefore fine-tuned by varying the output dimension from 10 to 100 in steps of 10. The results showed that, with an output dimension of 40, the fusion data combined with the Center Anchor Triplet Loss gave the highest accuracy of all the combinations tested: 98.89% for similarity judgment and 100% for similarity measurement. The model also performed well on independent samples that were not used in training, indicating the good generalization ability of the improved algorithm. The findings can provide a theoretical basis and data support for the similarity evaluation of export tea.
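The benchmark-and-distance step described above can be sketched as follows. The sketch assumes that the embeddings have already been produced by the trained network and are available as NumPy arrays. The function names (`class_benchmarks`, `distances_to_benchmarks`, `nearest_grade`) are hypothetical, and assigning each test sample to its nearest benchmark is one plausible reading of the similarity judgment, not a confirmed detail of the study.

```python
import numpy as np


def class_benchmarks(features, labels):
    """Return (grades, centers): the mean embedding of each standard grade."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    grades = np.unique(labels)  # sorted unique grade labels
    centers = np.stack([features[labels == g].mean(axis=0) for g in grades])
    return grades, centers


def distances_to_benchmarks(test_features, centers):
    """Euclidean distance from every test embedding to every benchmark."""
    test_features = np.asarray(test_features, dtype=float)
    # Broadcast to shape (n_test, n_grades, dim), then take the norm over dim.
    diff = test_features[:, None, :] - centers[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def nearest_grade(test_features, grades, centers):
    """Assumed decision rule: pick the grade whose benchmark is closest."""
    dist = distances_to_benchmarks(test_features, centers)
    return grades[dist.argmin(axis=1)], dist
```

Here the returned distance matrix plays the role of the quantitative similarity measure (a smaller distance means a more similar sample), while the nearest-benchmark label corresponds to the qualitative judgment.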