吴彤, 李勇, 葛莹, 刘凌杰, 席顺忠, 任孟杰, 袁晓慧, 庄翠珍. 利用Stacking集成学习估算柑橘叶片氮含量[J]. 农业工程学报, 2021, 37(13): 163-171. DOI: 10.11975/j.issn.1002-6819.2021.13.019
    Wu Tong, Li Yong, Ge Ying, Liu Lingjie, Xi Shunzhong, Ren Mengjie, Yuan Xiaohui, Zhuang Cuizhen. Estimation of nitrogen contents in citrus leaves using Stacking ensemble learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(13): 163-171. DOI: 10.11975/j.issn.1002-6819.2021.13.019

    利用Stacking集成学习估算柑橘叶片氮含量

    Estimation of nitrogen contents in citrus leaves using Stacking ensemble learning

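    • Abstract (from the Chinese original): Accurate estimation of citrus leaf nitrogen content provides important guidance for scientific and rational fertilization. This study used Landsat8 OLI satellite remote sensing images and ground-sampled measurements, with K-Nearest Neighbors (KNN), Random Forest (RF), and Adaptive Boosting (Adaboost) as base models, to build a Stacking ensemble learning framework for estimating citrus Leaf Nitrogen Content (LNC). First, the spectral reflectance characteristics under different nitrogen contents were analyzed, Vegetation Indices (VIs) were constructed, and their correlation coefficients with citrus LNC were calculated. Next, the models were trained with grid search and cross-validation. Finally, the Stacking model was compared with several classical machine learning models, including Bagging (Bootstrap Aggregating) and Artificial Neural Network (ANN), and a nitrogen content distribution map of the citrus orchard was generated. The results showed that: 1) the constructed spectral indices correlated well with LNC, with most indices having correlation coefficients above 0.55; 2) compared with single models such as KNN, RF, and Adaboost, the Stacking model gave the best estimates, with a coefficient of determination of 0.761, a root mean squared error of 1.366 g/kg, and a mean absolute percentage error of 3.494%; the Stacking model also had the lowest Akaike Information Criterion (AIC) value, making it the best model for LNC estimation in the observation period; 3) LNC values in the study area were overall around 30.5 to 31.5 g/kg, close to the ideal range for citrus cultivation, and the model estimates tended to agree with the measured values. Overall, the spectral features used in this study can effectively characterize citrus canopy leaf nitrogen content, and the study shows that Stacking ensemble learning can combine the strengths of multiple base models and improve model accuracy, offering a new approach to estimating crop parameters with satellite remote sensing.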
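The index-screening step described above (build a vegetation index, then compute its correlation coefficient with LNC) can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not list which indices were used, so NDVI is chosen here only as a familiar example, and the reflectance and LNC values are synthetic placeholders. On Landsat 8 OLI, band 4 is red and band 5 is near-infrared.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic stand-in data (ASSUMPTION: not the study's measurements).
rng = np.random.default_rng(0)
n_samples = 60
lnc = rng.uniform(26.0, 34.0, n_samples)  # hypothetical LNC values, g/kg

# Hypothetical reflectances that decrease slightly as LNC rises, plus noise.
red = 0.12 - 0.002 * (lnc - 26.0) + rng.normal(0.0, 0.003, n_samples)
nir = 0.45 - 0.004 * (lnc - 26.0) + rng.normal(0.0, 0.010, n_samples)

vi = ndvi(nir, red)
r = pearson_r(vi, lnc)
print(f"Pearson r between NDVI and LNC (synthetic data): {r:.3f}")
```

In a screening of this kind, each candidate index would be scored the same way and the indices with the largest absolute correlation retained as model features.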

       

      Abstract: Leaf Nitrogen Content (LNC) is an important indicator for evaluating fruit quality and yield, as nitrogen is an essential nutrient for citrus growth. In recent years, satellite remote sensing has been widely used to capture nitrogen content data rapidly and nondestructively for citrus cultivation and production. In this study, a two-layer Stacking ensemble learning framework was constructed from Landsat8 OLI satellite remote sensing images and ground sample data to estimate the nitrogen content of citrus leaves in critical growth periods. K-Nearest Neighbor (KNN), Random Forest (RF), and Adaptive boosting (Adaboost) served as base models, and Linear Regression (LR) served as the meta-model. The LNC values were sorted from high to low and divided into 6 groups at equal intervals, and the spectral characteristics of the groups were compared. Spectral reflectance differed significantly among the groups in the visible range (400-760 nm) and the near-infrared band (760-1250 nm), mainly because of chlorophyll absorption and multiple reflections within the canopy. Trees with higher LNC generally showed lower spectral reflectance. Correlation coefficients between vegetation indices (VIs) and LNC were calculated to select the spectral features. Grid search and 5-fold cross-validation were used to train the models, and an LNC distribution map was generated for the study area. The results showed that Stacking performed best on the testing dataset, with a coefficient of determination (R²) of 0.761, Mean Absolute Error (MAE) of 1.046 g/kg, Root Mean Squared Error (RMSE) of 1.366 g/kg, and Mean Absolute Percent Error (MAPE) of 3.494%. Compared with Adaboost, the best-performing individual model, Stacking increased R² by 0.025 and decreased RMSE, MAE, and MAPE by 0.07 g/kg, 0.109 g/kg, and 0.325 percentage points, respectively. This indicates that Stacking effectively integrated the base models to achieve higher estimation accuracy. However, every model clearly underestimated the measured values, particularly where LNC exceeded 32 g/kg. In addition, the Akaike Information Criterion (AIC) value of Stacking was significantly lower than that of every individual model, indicating that Stacking was the best LNC estimation model in the observation period of this study. Soil background and model performance were also discussed. Spectral information is subject to interference from the soil background, so several VIs that earlier researchers proposed on the basis of the soil-line concept to reduce this influence were also adopted. Only spectral features were used to build the model in this study, which limited its capability. Future work could add observation periods and texture features to build a more comprehensive estimation model. In summary, Stacking could accurately and effectively estimate citrus LNC, showing the potential of satellite remote sensing for estimating the nitrogen content of citrus leaves.
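The two-layer framework described above (KNN, RF, and Adaboost as base models, LR as the meta-model, tuned with grid search and 5-fold cross-validation) might look roughly like the following with scikit-learn. This is a hedged sketch under stated assumptions, not the authors' implementation: the data are synthetic, the hyperparameter grids are illustrative, and the abstract does not say which AIC form or parameter count was used, so the least-squares form n·ln(RSS/n) + 2k with a caller-supplied k is an assumption.

```python
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for vegetation-index features and LNC in g/kg
# (ASSUMPTION: not the study's data).
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 5))
y = 30.0 + X @ np.array([1.0, -0.8, 0.5, 0.0, 0.3]) + rng.normal(0.0, 0.5, 120)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)


def tuned(estimator, grid):
    """Grid search with 5-fold cross-validation; returns the refit best model."""
    search = GridSearchCV(estimator, grid, cv=5,
                          scoring="neg_root_mean_squared_error")
    search.fit(X_train, y_train)
    return search.best_estimator_


# Layer 1: base models. The grids below are illustrative assumptions.
base_models = [
    ("knn", tuned(KNeighborsRegressor(), {"n_neighbors": [3, 5, 7]})),
    ("rf", tuned(RandomForestRegressor(random_state=0),
                 {"n_estimators": [50, 100]})),
    ("ada", tuned(AdaBoostRegressor(random_state=0),
                  {"n_estimators": [50, 100]})),
]

# Layer 2: linear regression fitted on out-of-fold base-model predictions.
stack = StackingRegressor(estimators=base_models,
                          final_estimator=LinearRegression(), cv=5)
stack.fit(X_train, y_train)
pred = stack.predict(X_test)

# The four accuracy metrics reported in the abstract.
r2 = r2_score(y_test, pred)
mae = mean_absolute_error(y_test, pred)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
mape = float(np.mean(np.abs((y_test - pred) / y_test)) * 100.0)


def aic_least_squares(y_true, y_pred, k):
    """AIC in the least-squares form n*ln(RSS/n) + 2k (assumed form)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    rss = float(np.sum((y_true - y_pred) ** 2))
    return n * np.log(rss / n) + 2 * k


print(f"R2={r2:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}  MAPE={mape:.3f}%")
```

`StackingRegressor` trains the meta-model on cross-validated predictions of the base models, which keeps the second layer from simply memorizing base-model fits to the training data. Comparing models by AIC requires a consistent choice of k across models; how the study counted parameters is not stated in the abstract.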

       
