Multiple regression analysis of citrus leaf nitrogen content using hyperspectral technology

黄双萍; 洪添胜; 岳学军; 吴伟斌; 蔡坤; 徐兴

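Summary: Rapid, accurate, and non-destructive detection of the nitrogen (N) content of citrus leaves is of great practical significance for the precise, dynamic management of N fertilization in citrus trees. Taking 117 orchard-grown Luogang orange trees as the experimental subjects, the hyperspectral reflectance of healthy leaves was collected at different growth stages with an ASD FieldSpec3 spectrometer, and the reflectance data or their transformed forms were used as the multivariate vector description of each citrus tree sample. The true N content of the leaves was measured over the same period by the Kjeldahl method. After reducing the dimensionality of the high-dimensional spectral vectors with PCA, support vector regression (SVR) was used to establish the mapping between the hyperspectral multivariate representation and the N content, so that the N content of any citrus tree can be predicted. The experimental results show that, on the test set, the squared correlation coefficient (R2) between predicted and true values was 0.9730, the mean relative error was 0.9033%, and the mean square error (MSE) was 0.090343. This demonstrates the effectiveness of the method and provides a reference for the non-destructive detection of citrus N content using hyperspectral technology.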

Abstract: To evaluate the nitrogen status of citrus trees non-destructively, accurately, and rapidly, this paper studies the modeling of nitrogen (N) content prediction from reflectance spectra. Field experiments were conducted on 117 planted Luogang citrus trees in the Crab Village of Guangzhou. The trees were divided into several groups and managed under a standardized regime for one year. Nitrogen fertilizer was applied only during four phenological periods of the year, and each group received a different level of N fertilization so that the samples would differ in nitrogen content. Fifteen days after each fertilization, fresh, healthy leaves were collected so that the training samples covered different growth stages. An ASD FieldSpec spectrometer was used to measure spectral reflectance, while the Kjeldahl method was used to measure the N content of leaves from the same batch. Each sample is thus described as an instance-label pair, in which a multivariate vector serves as the descriptor and the measured nitrogen level serves as the ground-truth label. The collected samples formed a large-scale dataset, 80% of which was used as the training set and the remaining 20% as the test set. PCA (Principal Component Analysis) was applied to the original vectors for dimension reduction and noise removal, and SVR (Support Vector Regression) was adopted to build the regression model for predicting the nitrogen level of the citrus trees. The model was fitted on the training set by using SVR to map the multivariate vectors to their ground-truth labels, and the test set was used to evaluate its performance. On the test set, the model reached a squared correlation coefficient (R2) of 0.9730, a mean relative error of 0.9033%, and a mean square error (MSE) of 0.090343.
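
The workflow described in this abstract (80/20 split, PCA for dimension reduction, SVR for regression, then R2, mean relative error, and MSE on the test set) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic stand-ins, the SVR hyperparameters (`C`, `epsilon`, `gamma`) are arbitrary assumptions, and R2 is assumed here to mean the squared Pearson correlation between predicted and measured values. The RBF kernel and the 99.9% cumulative contribution rate are the settings reported in this abstract.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for the measured data (NOT the paper's dataset):
# 200 "spectra" of 300 bands driven by 3 latent factors, plus small noise.
n_samples, n_bands = 200, 300
latent = rng.normal(size=(n_samples, 3))
loadings = rng.normal(size=(3, n_bands))
X = latent @ loadings + 0.01 * rng.normal(size=(n_samples, n_bands))
# Hypothetical leaf N content (%), a simple function of the latent factors.
y = 2.5 + 0.3 * latent[:, 0] - 0.2 * latent[:, 1]

# 80% of the samples for training, the remaining 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = make_pipeline(
    # Keep as many principal components as needed to reach 99.9% of the
    # cumulative explained variance (a float n_components needs the full SVD).
    PCA(n_components=0.999, svd_solver="full"),
    # RBF-kernel SVR; C, epsilon and gamma are illustrative assumptions.
    SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma="scale"),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluation metrics on the held-out test set.
mse = float(np.mean((y_test - y_pred) ** 2))
mre = float(np.mean(np.abs(y_test - y_pred) / np.abs(y_test)) * 100)  # percent
r2 = float(np.corrcoef(y_test, y_pred)[0, 1] ** 2)  # squared Pearson correlation
n_components = int(model.named_steps["pca"].n_components_)

print(f"components kept: {n_components}")
print(f"R2 = {r2:.4f}, MRE = {mre:.4f}%, MSE = {mse:.6f}")
```

With real measurements, `X` would hold one reflectance vector per leaf sample and `y` the Kjeldahl-measured N content; the numbers printed by this sketch come from synthetic data and say nothing about the paper's results.
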
Several conclusions can be drawn from the experimental results. First, compared with various transformations of the spectral data (e.g., first-derivative, second-derivative, reciprocal, logarithmic, and logarithm-of-reciprocal spectra), the original hyperspectral reflectance data, used as the vector descriptor of the samples, achieved the best result with the approach in this paper. Second, the model performs best and is most robust when the Radial Basis Function (RBF) is used as the SVR kernel and PCA retains principal components up to a cumulative contribution rate of 99.9%. Third, comparative experiments against other mainstream multivariate regression algorithms support the use of SVR with PCA for modeling: the proposed method clearly outperformed Partial Least Squares (PLS), Back Propagation (BP) neural networks, and Stepwise Multiple Linear Regression (SMLR). Finally, the SVR model built on PCA-processed data achieved the target performance indices, which indicates the effectiveness of the proposed method and provides a theoretical basis for applying hyperspectral reflectance to non-destructive nitrogen level detection.
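
The spectral transformations compared in the first conclusion can be illustrated with a short NumPy sketch. The function name `spectral_transforms`, the use of base-10 logarithms, and the finite-difference derivative from `np.gradient` are assumptions made for illustration; the abstract does not state how the derivatives or logarithms were computed.

```python
import numpy as np


def spectral_transforms(reflectance, wavelengths):
    """Return the original spectrum and five common transformed forms.

    reflectance: 1-D sequence of strictly positive reflectance values.
    wavelengths: 1-D sequence of the corresponding wavelengths (e.g. in nm).
    """
    r = np.asarray(reflectance, dtype=float)
    w = np.asarray(wavelengths, dtype=float)
    # Finite-difference derivatives with respect to wavelength.
    d1 = np.gradient(r, w)
    d2 = np.gradient(d1, w)
    return {
        "original": r,
        "first_derivative": d1,
        "second_derivative": d2,
        "reciprocal": 1.0 / r,
        "logarithm": np.log10(r),            # base 10 is an assumption
        "log_reciprocal": np.log10(1.0 / r),
    }


# Toy spectrum (hypothetical values, not measured data).
wavelengths = np.array([400.0, 410.0, 420.0, 430.0, 440.0])
reflectance = np.array([0.10, 0.12, 0.15, 0.19, 0.24])
forms = spectral_transforms(reflectance, wavelengths)
for name, values in forms.items():
    print(name, np.round(values, 4))
```

Each transformed vector has the same length as the input, so any of them could be substituted for the raw reflectance as the sample descriptor before dimension reduction and regression.
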

Multiple regression analysis of citrus leaf nitrogen content using hyperspectral technology