Feature extraction and discriminant analysis of spearmint three-dimensional fluorescence data based on KPCA, KLPP, and the Wilks' Λ statistic

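    • Abstract: To enable rapid identification of the geographical origin of spearmint, this study proposes a feature wavelength extraction strategy that sequentially fuses kernel principal component analysis (KPCA), kernel locality preserving projections (KLPP), and the Wilks' Λ statistic, and uses it to discriminate spearmint from five origins. First, three-dimensional fluorescence data were collected from 300 spearmint samples from five origins; Rayleigh and Raman scattering were removed from the raw spectra by triangular interpolation, and the data were smoothed with a Savitzky-Golay (SG) filter. Next, four methods (KPCA, KPCA + KLPP, KPCA + Wilks' Λ statistic, and KPCA + KLPP + Wilks' Λ statistic) were each used to extract feature excitation wavelengths and feature emission wavelengths from the preprocessed fluorescence spectra. Then, for each sample, the spectral values at the feature emission wavelengths were concatenated end to end, in ascending order of feature excitation wavelength, into a row vector; each of the four methods thus yielded one 300-row matrix of feature wavelength spectral values from the 300 samples. Fisher discriminant analysis (FDA) was then applied to each feature wavelength matrix for separability-oriented data fusion, generating Fisher discriminant (FD) variables. The first four FD variables, whose cumulative discriminant ability reached 99%, were selected as the input vector of the identification model. Finally, a support vector machine (SVM) was applied to the four FD variables, giving FDA + SVM identification results for the four feature wavelength extraction methods with accuracies of 92.00%, 96.00%, 94.67%, and 100%, respectively. The results show that the proposed sequential fusion of KPCA + KLPP + Wilks' Λ statistic effectively reduces redundancy in three-dimensional fluorescence spectral data while retaining the informative features of the original data, and that it correctly identified all five spearmint origins. This work may provide a basis for subsequent quantitative analysis of important spearmint components using three-dimensional fluorescence spectroscopy.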
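    The preprocessing step described above (scatter removal by interpolation, then SG smoothing) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "triangular interpolation" can be approximated by Delaunay-triangulation-based linear interpolation (`scipy.interpolate.griddata` with `method="linear"`), it handles only the first-order Rayleigh band, and the function names, the 10 nm band half-width, and the SG window and polynomial order are hypothetical choices that the abstract does not state.

```python
import numpy as np
from scipy.interpolate import griddata
from scipy.signal import savgol_filter


def remove_rayleigh_band(eem, ex, em, half_width=10.0):
    """Replace the first-order Rayleigh band of one excitation-emission matrix.

    eem has shape (len(ex), len(em)). Points with |em - ex| <= half_width are
    treated as scatter (half_width is an assumed value) and refilled from the
    surrounding points by triangulation-based linear interpolation.
    """
    ex_grid, em_grid = np.meshgrid(ex, em, indexing="ij")
    band = np.abs(em_grid - ex_grid) <= half_width

    known_xy = np.column_stack([ex_grid[~band], em_grid[~band]])
    known_z = eem[~band]

    cleaned = eem.astype(float).copy()
    band_xy = np.column_stack([ex_grid[band], em_grid[band]])
    cleaned[band] = griddata(known_xy, known_z, band_xy, method="linear")

    # Band points outside the convex hull of the known points come back as NaN;
    # fall back to nearest-neighbour values there.
    holes = np.isnan(cleaned)
    if holes.any():
        hole_xy = np.column_stack([ex_grid[holes], em_grid[holes]])
        cleaned[holes] = griddata(known_xy, known_z, hole_xy, method="nearest")
    return cleaned


def smooth_emission(eem, window_length=7, polyorder=2):
    """Savitzky-Golay smoothing along the emission axis (assumed settings)."""
    return savgol_filter(eem, window_length=window_length, polyorder=polyorder, axis=1)
```

    In use, each sample's matrix would be passed through `remove_rayleigh_band` and then `smooth_emission`; Raman scatter could be masked in the same way with a band that follows the Raman shift of the solvent.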

       

      Abstract: This study aims to identify the origin of spearmint rapidly and accurately. A feature wavelength extraction strategy is proposed that sequentially fuses kernel principal component analysis (KPCA), kernel locality preserving projections (KLPP), and the Wilks' Λ statistic, and it is used to discriminate spearmint from five origins. Firstly, three-dimensional fluorescence data were collected from 300 spearmint samples from five locations. Rayleigh and Raman scattering interference was removed from the original spectra by triangular interpolation, and Savitzky-Golay (SG) smoothing was applied to reduce noise. Secondly, the preprocessed three-dimensional fluorescence spectrum matrices were merged into a data matrix according to emission wavelength. Four methods, KPCA, KPCA + KLPP, KPCA + Wilks' Λ statistic, and KPCA + KLPP + Wilks' Λ statistic, were used to extract feature wavelengths from the preprocessed data. Among them, KPCA and KPCA + Wilks' Λ statistic were used to extract 40 feature emission wavelengths and eight feature excitation wavelengths, respectively, while KPCA + KLPP and KPCA + KLPP + Wilks' Λ statistic were used to extract 50 feature emission wavelengths and 10 feature excitation wavelengths, respectively. Thirdly, the spectral values at the feature emission wavelengths were concatenated into row vectors in ascending order of feature excitation wavelength, so that each method yielded one 300-row matrix of feature wavelength spectral values from the 300 samples. Fisher discriminant analysis (FDA) was then applied to each feature wavelength matrix for separability-oriented data fusion, generating Fisher discriminant (FD) variables. The first four FD variables, with a cumulative discriminant ability of 99%, were selected as the input vector of the discriminant model for spearmint origin. Finally, a support vector machine (SVM) was used to analyze the four FD variables, giving FDA + SVM identification results for each of the four feature wavelength extraction methods. The accuracies on the training set were 92.00%, 97.78%, 95.11%, and 100%, respectively, and the accuracies on the test set were 92.00%, 96.00%, 94.67%, and 100%, respectively. The results show that both KLPP and the Wilks' Λ statistic improved on KPCA alone, and that their sequential fusion gave the best feature extraction overall, removing more redundant spectral information and yielding the highest identification accuracy. The sequential fusion of KPCA + KLPP + Wilks' Λ statistic + FDA + SVM can therefore be expected to identify the origin of spearmint effectively. Feature wavelength extraction with the sequential fusion of KPCA + KLPP + Wilks' Λ statistic effectively reduces the redundancy of three-dimensional fluorescence spectral data while retaining the informative features of the original data, and all five spearmint origins were correctly identified. These findings can also lay a foundation for subsequent quantitative analysis of important components in spearmint by three-dimensional fluorescence spectroscopy.
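      The final FDA + SVM stage can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' code: it uses scikit-learn's `LinearDiscriminantAnalysis` as the Fisher discriminant step, an RBF-kernel `SVC` with default-like parameters, a stratified 225/75 train/test split, and synthetic data from the hypothetical helper `make_synthetic_features` in place of the real feature wavelength matrix. None of these settings is given in the abstract. With five classes, FDA yields at most four discriminant variables, which matches the four FD variables used as model input.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_features(n_per_class=60, n_classes=5, n_features=50, seed=0):
    """Synthetic stand-in for a 300-row feature wavelength matrix (5 origins x 60)."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=3.0, size=(n_classes, n_features))
    X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


def fit_fda_svm(X_train, y_train):
    """Project onto four FD variables, rescale them, then classify with an SVM."""
    model = make_pipeline(
        LinearDiscriminantAnalysis(n_components=4),  # 5 classes -> at most 4 FD variables
        StandardScaler(),
        SVC(kernel="rbf", C=1.0, gamma="scale"),     # assumed kernel and parameters
    )
    return model.fit(X_train, y_train)


X, y = make_synthetic_features()
# Assumed split: 225 training samples and 75 test samples, stratified by origin.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model = fit_fda_svm(X_train, y_train)

# Share of between-class variance carried by each FD variable, and its running total.
fd_ratio = model.named_steps["lineardiscriminantanalysis"].explained_variance_ratio_
print("cumulative discriminant ability:", np.cumsum(fd_ratio))
print("training accuracy:", model.score(X_train, y_train))
print("test accuracy:", model.score(X_test, y_test))
```

      On real data, `X` would be one of the four feature wavelength matrices and `y` the origin labels; the cumulative ratio printed above corresponds to the cumulative discriminant ability used to choose the number of FD variables.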

       
