Application of characteristic wavelength selection methods in a prediction model of nitrogen content in winter jujube leaves
Abstract: To improve the accuracy and robustness of near-infrared spectroscopy for the rapid determination of nitrogen content in jujube leaves, partial least squares (PLS) regression was used to build a near-infrared model of the nitrogen content of winter jujube leaves. The model had a correlation coefficient of 0.799 and a root mean square error of 0.055. The full spectral region contains many spectral variables that are unrelated to leaf nitrogen content, and this redundant information reduces the prediction performance of the model. Interval partial least squares (IPLS) combined with a genetic algorithm, and a simulated annealing algorithm, were therefore used to select characteristic wavelengths for the nitrogen content of winter jujube leaves. The nitrogen content of the leaf samples was determined by the Kjeldahl method. Fifteen jujube trees were selected for the experiment, with five leaves taken from each tree as test samples. Spectra were measured with an ASD spectrometer covering the 350 to 2500 nm wavelength range at a spectral resolution of 1 nm. A white reference panel was used for correction before data collection (the reflectance of the standard panel was set to 1), and each sample was measured five times, with the mean taken as the relative reflectance of the sample. The genetic algorithm combined with interval partial least squares selected four characteristic wavelengths: 685, 689, 781 and 783 nm. A near-infrared model of leaf nitrogen content was built from these four wavelengths; its prediction correlation coefficient was 0.9175 and its root mean square error of prediction was 0.063. A second near-infrared model of leaf nitrogen content was built from seven wavelengths selected by the simulated annealing algorithm; its correlation coefficient was 0.9301 and its root mean square error was 0.052. Combining near-infrared spectroscopy with characteristic wavelength selection can therefore effectively improve model accuracy and make the model more practical. However, the selection methods are not universally applicable. Models based on single-wavelength variable selection are more sensitive and are better suited to uniform samples, whereas models based on wavelength-interval selection are more resistant to interference and are better suited to heterogeneous samples. Feature selection is therefore best applied by taking both the state of the samples and the model into account.
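The simulated-annealing wavelength selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithm settings, so the function names, the starting temperature, the cooling rate and the iteration count below are all assumptions. An ordinary least-squares fit stands in for PLS (reasonable only because so few wavelengths are kept), and the data are synthetic, not the jujube spectra.

```python
import numpy as np


def holdout_rmse(X_cal, y_cal, X_val, y_val, idx):
    """Validation RMSE of a least-squares model that uses only columns idx.

    Least squares is a stand-in for PLS here (an assumption of this sketch).
    """
    A_cal = np.column_stack([np.ones(len(y_cal)), X_cal[:, idx]])
    A_val = np.column_stack([np.ones(len(y_val)), X_val[:, idx]])
    coef, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
    resid = A_val @ coef - y_val
    return float(np.sqrt(np.mean(resid ** 2)))


def anneal_wavelengths(X_cal, y_cal, X_val, y_val, k,
                       n_iter=2000, t0=0.05, cooling=0.995, seed=0):
    """Pick k column indices by simulated annealing on validation RMSE.

    t0, cooling and n_iter are hypothetical settings, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n_vars = X_cal.shape[1]
    current = rng.choice(n_vars, size=k, replace=False)
    cur_cost = holdout_rmse(X_cal, y_cal, X_val, y_val, current)
    best, best_cost = current.copy(), cur_cost
    temp = t0
    for _ in range(n_iter):
        # Neighbour move: swap one selected wavelength for an unselected one.
        cand = current.copy()
        unused = np.setdiff1d(np.arange(n_vars), current)
        cand[rng.integers(k)] = rng.choice(unused)
        cost = holdout_rmse(X_cal, y_cal, X_val, y_val, cand)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if cost < cur_cost or rng.random() < np.exp((cur_cost - cost) / temp):
            current, cur_cost = cand, cost
            if cost < best_cost:
                best, best_cost = cand.copy(), cost
        temp *= cooling  # geometric cooling schedule
    return np.sort(best), best_cost


def make_synthetic_spectra(n_samples=75, n_vars=200, seed=0):
    """Synthetic stand-in data: 75 'leaves', 200 'wavelengths' (hypothetical)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_vars))
    informative = [20, 60, 120, 160]  # arbitrary informative columns
    weights = np.array([0.05, -0.04, 0.03, 0.02])
    y = 2.0 + X[:, informative] @ weights + rng.normal(scale=0.01, size=n_samples)
    return X, y
```

A possible use is to split the samples into calibration and validation sets, for example `X, y = make_synthetic_spectra()` followed by `anneal_wavelengths(X[:50], y[:50], X[50:], y[50:], k=4)`, which returns the sorted selected column indices and their validation RMSE. Because the selection itself is tuned on the validation set, an independent prediction set would still be needed to report an unbiased error.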