Fresh tea leaf data augmentation method based on the Tea DCGAN network and the Fake Tea framework

俞焘杰; 陈建能; 彭伟杰; 李亚涛; 喻陈楠; 武传宇

doi:10.11975/j.issn.1002-6819.202405076

Fresh tea leaf data enhancement method based on Tea DCGAN network and Fake Tea pipeline

Abstract

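Abstract: When too few original tea images are available, deep learning models generalize poorly, and their ability to detect tender tea shoots drops sharply. To address this problem, this study proposes Tea DCGAN (tea deep convolution generative adversarial networks), a generative adversarial network, together with a data augmentation method built on it. First, a 64×64×64 network layer was added to both the generator and the discriminator of DCGAN (deep convolution generative adversarial networks) to improve the model's perception and learning of low-dimensional features. In addition, the LeakyReLU (leaky rectified linear unit) function in DCGAN was replaced with the ELU (exponential linear units) function, which is more linearly controllable, to improve training stability and training accuracy. Second, the Fake Tea data augmentation framework was built on the Tea DCGAN network: the distribution of real tender tea shoots in the existing dataset is analyzed to obtain its distribution pattern, the sample images generated by Tea DCGAN are placed into existing images of open-field tea plants according to that pattern, and a deep learning dataset is formed automatically. Finally, the proposed method was evaluated with a generative adversarial network ablation test, a rare tea variety control test, and a comparison of multiple data augmentation methods at different data scales. The ablation results show that Tea DCGAN performed best on the FID (Frechet inception distance) metric; in particular, at 100 000 training epochs the FID of the Zijuan variety fell from 322.10 to 265.63 and that of the Longjing 43 variety fell from 396.38 to 323.09, indicating higher quality generated images. In the experiments comparing multiple data augmentation methods across several detection models, the Fake Tea method outperformed the other methods on every detection model. In particular, the Faster R-CNN model reached an mAP (mean average precision) of 42.71% and 38.46% on the datasets formed from 25 Longjing 43 images and 25 Zijuan images, respectively. As the dataset grew, the performance of all methods improved, but Fake Tea kept the highest mAP at every dataset size; in particular, with 200 original images the mAP reached 89.41%, which is sufficient for intelligent tea picking. The results demonstrate the effectiveness and superiority of Tea DCGAN and the Fake Tea data augmentation method in tea image generation and object detection. The proposed methods can effectively alleviate problems such as difficult data acquisition and insufficient samples, and improve the detection accuracy of tender tea shoots when only small samples are available.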
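The activation swap described in the abstract can be illustrated with a minimal sketch of the two functions. The LeakyReLU negative slope of 0.2 and the ELU alpha of 1.0 are assumptions (commonly used defaults), not values stated in the abstract, and the function names are chosen only for this illustration.

```python
import math


def leaky_relu(x: float, negative_slope: float = 0.2) -> float:
    """LeakyReLU: identity for x >= 0, a fixed linear slope for x < 0.

    negative_slope=0.2 is an assumed value, not taken from the paper.
    """
    return x if x >= 0.0 else negative_slope * x


def elu(x: float, alpha: float = 1.0) -> float:
    """ELU: identity for x >= 0, alpha * (exp(x) - 1) for x < 0.

    alpha=1.0 is an assumed value, not taken from the paper.
    """
    return x if x >= 0.0 else alpha * math.expm1(x)
```

Both functions pass positive inputs through unchanged. For negative inputs, LeakyReLU keeps decreasing linearly without bound, whereas ELU saturates smoothly toward -alpha, so large negative pre-activations produce bounded outputs.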

Abstract: When the original data of tea leaf images are insufficient, deep learning models generalize poorly, leading to a substantial decline in the detection of tender tea shoots. In this study, a Tea DCGAN (tea deep convolution generative adversarial networks) was proposed together with a data augmentation method based on it. First, a 64×64×64 layer was added to both the generator and the discriminator of the DCGAN (deep convolution generative adversarial networks), in order to enhance the perception and learning of low-dimensional features. Additionally, the LeakyReLU (leaky rectified linear unit) function in the DCGAN was replaced with the more linearly controllable ELU (exponential linear units) function, thereby improving the stability and accuracy of model training. Subsequently, a Fake Tea data augmentation framework was constructed using the Tea DCGAN network. The distribution of real tender tea shoots in the existing dataset was analyzed to determine the underlying patterns. According to these patterns, the sample images generated by the Tea DCGAN network were placed into real images of open-field tea plants, and a deep learning dataset was formed automatically. Finally, the proposed data augmentation method was evaluated in a generative adversarial network ablation test, a rare tea variety control test, and a comparison of multiple data augmentation methods at different data scales. The ablation results indicated that Tea DCGAN performed best in terms of the FID (Frechet Inception Distance) metric. In particular, at 100 000 training epochs, the FID values for the Zijuan and Longjing 43 tea varieties dropped from 322.10 to 265.63 and from 396.38 to 323.09, respectively, indicating an improvement in the quality of the generated images. In the experiments with multiple data augmentation methods on several detection models, the Fake Tea method outperformed the other methods on every detection model. Specifically, the Faster R-CNN model achieved an mAP (mean average precision) of 42.71% and 38.46% on the datasets formed from 25 Longjing 43 images and 25 Zijuan images, respectively. The performance of all methods improved as the dataset size increased, but Fake Tea consistently maintained the highest mAP across all dataset sizes. Notably, when the original dataset consisted of 200 images, the mAP reached 89.41%, which is suitable for intelligent tea harvesting. These results demonstrate the effectiveness and superiority of Tea DCGAN and the Fake Tea data augmentation method in tea leaf image generation and object detection. The proposed methods can effectively alleviate difficult data acquisition and sample scarcity, and improve the detection accuracy of tender tea shoots when only limited samples are available.
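The placement step of a Fake Tea style pipeline (analyze where real shoots occur, then place generated shoot images accordingly and emit labels) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the per-axis Gaussian model of box centres, the YOLO-style label format, the nested-list image representation, and all function names are hypothetical choices that the abstract does not specify.

```python
import random
import statistics


def fit_center_distribution(boxes):
    """Fit an independent Gaussian per axis to normalized box centres.

    boxes: list of (cx, cy, w, h) tuples in [0, 1] from real annotations.
    The Gaussian model is an assumption made for this sketch.
    """
    xs = [b[0] for b in boxes]
    ys = [b[1] for b in boxes]
    return ((statistics.mean(xs), statistics.pstdev(xs)),
            (statistics.mean(ys), statistics.pstdev(ys)))


def sample_top_left(dist, bg_w, bg_h, patch_w, patch_h, rng):
    """Sample a patch centre, then clamp so the patch stays inside the image."""
    (mx, sx), (my, sy) = dist
    cx = rng.gauss(mx, sx) * bg_w
    cy = rng.gauss(my, sy) * bg_h
    left = min(max(int(round(cx - patch_w / 2)), 0), bg_w - patch_w)
    top = min(max(int(round(cy - patch_h / 2)), 0), bg_h - patch_h)
    return left, top


def paste_patch(background, patch, left, top):
    """Overwrite background pixels with the patch, in place."""
    for r, row in enumerate(patch):
        background[top + r][left:left + len(row)] = row


def yolo_label(left, top, patch_w, patch_h, bg_w, bg_h, class_id=0):
    """Format one box as 'class cx cy w h' with normalized coordinates."""
    cx = (left + patch_w / 2) / bg_w
    cy = (top + patch_h / 2) / bg_h
    return f"{class_id} {cx:.6f} {cy:.6f} {patch_w / bg_w:.6f} {patch_h / bg_h:.6f}"


def place_generated_shoots(background, patches, real_boxes, seed=0):
    """Paste each generated patch at a sampled position and return its labels.

    Overlap between pasted patches is not handled in this sketch.
    """
    rng = random.Random(seed)
    dist = fit_center_distribution(real_boxes)
    bg_h, bg_w = len(background), len(background[0])
    labels = []
    for patch in patches:
        patch_h, patch_w = len(patch), len(patch[0])
        left, top = sample_top_left(dist, bg_w, bg_h, patch_w, patch_h, rng)
        paste_patch(background, patch, left, top)
        labels.append(yolo_label(left, top, patch_w, patch_h, bg_w, bg_h))
    return labels
```

In practice the patches would be Tea DCGAN outputs and the background a real tea plant photograph. A real pipeline would likely also blend patch edges and avoid overlapping placements, which this sketch omits.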
