Remote sensing yield estimation of winter wheat combining frequency histograms and vegetation indices

    Estimating winter wheat yield under frequency histogram and vegetation index using remote sensing

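    • Abstract: Crop yield estimation models built with machine learning often generalize poorly because of problems such as overfitting, which lowers estimation accuracy. Taking Henan Province as the study area, this study constructed sample feature sets from the surface reflectance data of different bands using a mean method and a frequency histogram method, used them as input variables, and built a remote sensing yield estimation model for winter wheat with the Random Forest (RF) algorithm. The results showed that the frequency histogram method predicted better than the mean method, with a mean absolute error (MAE) of 660 kg/hm², a root mean square error (RMSE) of 860 kg/hm², and a coefficient of determination of up to 0.83, which was highly significant (P<0.01). Among the seven surface reflectance bands, near-infrared band 1 performed best. Among the single composite indices, the Normalized Difference Water Index (NDWI) outperformed the Normalized Difference Vegetation Index (NDVI). Among the band combinations, the combination of NDVI and NDWI gave the best validation results, with an MAE of 444 kg/hm², an RMSE of 527 kg/hm², and a coefficient of determination of 0.89, which was highly significant (P<0.01). This combination predicted best in the period from April 15 to 22, the period with the greatest influence on winter wheat yield. Building sample features with the frequency histogram method and combining them with the random forest algorithm can provide an effective solution for county-level remote sensing estimation of winter wheat yield.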
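    The difference between the two feature construction methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the bin count, the value range [0, 1], and the pixel values are all assumptions chosen for illustration. The mean method collapses all pixels of a region into one number per band, while the frequency histogram method keeps the share of pixels falling into each value bin.

```python
from statistics import fmean


def mean_feature(pixels):
    """Mean method: collapse a region's pixel values into a single number."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    return fmean(pixels)


def histogram_feature(pixels, lo=0.0, hi=1.0, n_bins=32):
    """Frequency histogram method: share of pixels in each equal-width bin.

    lo, hi and n_bins are illustrative assumptions, not values from the study.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in pixels:
        # Clamp the index so out-of-range values land in the first or last bin.
        index = min(max(int((value - lo) / width), 0), n_bins - 1)
        counts[index] += 1
    return [count / len(pixels) for count in counts]


# Hypothetical reflectance values for two regions with the same mean
# but different pixel distributions.
county_a = [0.25, 0.25, 0.65, 0.65]
county_b = [0.45, 0.45, 0.45, 0.45]

print(mean_feature(county_a), mean_feature(county_b))
print(histogram_feature(county_a, n_bins=5))
print(histogram_feature(county_b, n_bins=5))
```

    In this toy case the two regions are nearly indistinguishable under the mean method but have clearly different histograms, which is one plausible reason a histogram-based feature set could carry more information into the regression model.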

       

      Abstract: Crop yield estimation provides data support for decision-making throughout the growing season. Existing estimation models of winter wheat yield often show low accuracy and poor generalization because of overfitting during machine learning. Carefully constructed feature variables can reduce the high collinearity between the spectral bands of surface reflectance, and spectral indices can lessen the influence of individual wavelengths on the iterative calculation. In this study, an improved estimation model of winter wheat yield was established for Henan Province, China, using Random Forest (RF) with surface reflectance data and spectral indices from the growth period. The procedure was as follows. 1) Regional mean values and regional frequency histograms of the surface reflectance bands were compared as sample features. 2) The different surface reflectance bands were compared. 3) Sample data generated from the selected feature variables were used to train a yield estimation model. 4) The model was applied to estimate winter wheat yields for 2013-2015 using the surface reflectance data and spectral indices, and the estimates were verified against survey statistics. The results showed that the frequency histogram performed better than the regional mean values, with a Mean Absolute Error (MAE) of 660 kg/hm², a Root Mean Square Error (RMSE) of 860 kg/hm², and a coefficient of determination of 0.83. The dimension of the input feature variables and the computational complexity were greatly reduced, which improved the estimation ability of the model. The frequency histogram was therefore suitable for extracting feature variables and optimizing the sample structure. 
Among the surface reflectance bands, near-infrared band 1 performed best, with an MAE of 636 kg/hm², an RMSE of 784 kg/hm², and a coefficient of determination of 0.76, followed by near-infrared band 2. The blue band gave the lowest estimation accuracy, with a coefficient of determination of only 0.58 and an MAE and RMSE of 885 and 1040 kg/hm², respectively. The coefficient of determination was 0.82 for the Normalized Difference Water Index (NDWI) and 0.73 for the Normalized Difference Vegetation Index (NDVI). The combination of NDVI and NDWI performed better still, with a coefficient of determination of 0.89 and an MAE and RMSE of 444 and 527 kg/hm², respectively. NDVI and NDWI at the heading stage had a crucial influence on winter wheat yield. Specifically, the accuracy of the combined NDVI and NDWI was highest for April 15th, indicating that the period from April 15th to 22nd had the greatest impact on yield. In addition, the estimated and observed yields from 2013 to 2015 showed consistent interannual variation and spatial distribution, where the relative errors were within ±20% and gradually decreased from the northwest to the southeast. This pattern can be attributed to the combined effect of topography, temperature, and precipitation on the surface microclimate and material redistribution. Consequently, combining the random forest model with regional frequency histogram samples can improve yield estimation of winter wheat over large areas in the main winter wheat planting regions of Henan Province. As a next step, increasing the number of samples and introducing high-resolution satellite images is recommended to further improve the accuracy of the winter wheat yield estimation model.
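
The spectral indices and accuracy measures named in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code. NDVI uses the standard near-infrared/red form. The abstract does not say which NDWI formulation was used, so the near-infrared/shortwave-infrared form is assumed here. The coefficient of determination is computed as 1 − SS_res/SS_tot, which is also an assumption, since some studies report the squared correlation instead. All reflectance and yield values are hypothetical.

```python
from math import sqrt


def normalized_difference(a, b):
    """(a - b) / (a + b), the form shared by NDVI and NDWI."""
    if a + b == 0:
        raise ValueError("a + b must be non-zero")
    return (a - b) / (a + b)


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndwi(nir, swir):
    """Normalized Difference Water Index.

    Assumption: the NIR/SWIR formulation; the abstract does not specify it.
    """
    return normalized_difference(nir, swir)


def mae(observed, estimated):
    """Mean absolute error."""
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)


def rmse(observed, estimated):
    """Root mean square error."""
    squared = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    return sqrt(squared / len(observed))


def r_squared(observed, estimated):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical reflectances and yields (kg/hm²), for illustration only.
print(ndvi(nir=0.5, red=0.1), ndwi(nir=0.5, swir=0.3))
observed = [6000, 7000, 8000]
estimated = [6300, 6800, 8100]
print(mae(observed, estimated), rmse(observed, estimated), r_squared(observed, estimated))
```

MAE and RMSE are in the same unit as yield (kg/hm²), and RMSE weights large errors more heavily than MAE, which is consistent with every RMSE in the abstract being larger than the corresponding MAE.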

       
