Citation: Wang Yaomin, Chen Haorui, Chen Junying, Wang Huiyun, Xing Zheng, Zhang Zhitao. Comparation of rice yield estimation model combining spectral index screening method and statistical regression algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(21): 208-216. DOI: 10.11975/j.issn.1002-6819.2021.21.024

    Comparation of rice yield estimation model combining spectral index screening method and statistical regression algorithm

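    • Abstract: To find an efficient method for estimating rice yield, this study used rice yield data and MOD09A1 remote sensing data collected in 2019 in the Bielahong River basin of the Sanjiang Plain, Heilongjiang Province, and compared the estimation performance of models that combine different index screening methods with different statistical regression algorithms, in order to identify the best yield estimation model among them. Correlation coefficient (r) analysis, Variable Importance in Projection (VIP) analysis, and Out-Of-Bag data importance (OOB) analysis were used to assess how sensitive the different bands and spectral indices at four rice growth stages (tillering, heading, booting, and milk ripening) were to rice yield, and to screen out the feature bands and indices. These were then combined with three statistical regression methods, Random Forest (RF), Support Vector Machine (SVM), and Partial Least Squares (PLS), to construct nine rice yield estimation models: r-RF, r-SVM, r-PLS, VIP-RF, VIP-SVM, VIP-PLS, OOB-RF, OOB-SVM, and OOB-PLS. The results show that a given index screening method fits different models to different degrees: OOB fits RF better, VIP and r fit PLS better, and r fits SVM better. Among the three modeling methods, the PLS and SVM models performed well and the RF model performed best. The OOB-RF model was the best overall, with a validation coefficient of determination of 0.742 and a root mean square error of 206 kg/hm². The results can serve as a reference for research on rice yield estimation models and have some theoretical significance.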
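    The screening-then-regression design described above can be illustrated with a minimal sketch. It enumerates the nine screening/regression combinations and implements only the simplest of the three screening methods, correlation coefficient (r) analysis. The variable names (`var_a`, `var_b`, `var_c`), the toy values, and the 0.5 threshold are hypothetical and are not taken from the paper, which does not state its selection threshold in the abstract.

```python
import math
from itertools import product

# The three screening methods and three regression methods named in the abstract.
SCREENERS = ["r", "VIP", "OOB"]
REGRESSORS = ["RF", "SVM", "PLS"]

# Every screening method paired with every regression method: 3 x 3 = 9 models.
MODEL_NAMES = [f"{s}-{m}" for s, m in product(SCREENERS, REGRESSORS)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def screen_by_r(features, target, threshold=0.5):
    """Keep variables whose |r| with the target meets the threshold.

    The threshold value is an assumption made for this illustration.
    Returns the kept names (strongest first) and all scores.
    """
    scores = {name: pearson_r(values, target) for name, values in features.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    kept.sort(key=lambda name: -abs(scores[name]))
    return kept, scores


# Toy data: hypothetical remote sensing variables and yields, not real values.
toy_yield = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_features = {
    "var_a": [2.0, 4.0, 6.0, 8.0, 10.0],   # perfectly correlated with yield
    "var_b": [5.0, 3.0, 4.0, 1.0, 2.0],    # strongly negatively correlated
    "var_c": [1.0, -1.0, 1.0, -1.0, 1.0],  # uncorrelated with yield
}
selected, r_scores = screen_by_r(toy_features, toy_yield)

print(MODEL_NAMES)
print(selected)
```

    In this sketch the screened variable list (`selected`) would be the input passed to whichever regression method is being tested; VIP and OOB screening would replace `screen_by_r` with scores derived from a fitted PLS or random forest model, respectively.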

       

      Abstract: Crop yield is one of the most important parameters in agricultural production, and accurate estimation of regional crop yield supports agricultural production management and national food policy. Although many crop yield estimation models exist, few studies have examined how different index screening methods and statistical regression algorithms perform in combination. This study compared three index screening methods and three regression models to explore their combined effect on rice yield estimation and the mechanism behind it, with the aim of obtaining an optimal yield estimation model suited to local production conditions. The study area was the Sanjiang Plain in Heilongjiang Province, China, an important rice-producing region. Rice unit yield data and MOD09A1 remote sensing data were collected in the Bielahong River basin of the study area in 2019. After preprocessing, a total of 36 remote sensing variables were obtained: four original bands and five vegetation indices at each of four rice growth stages (tillering, booting, heading, and milk ripening). The variables most sensitive to rice yield were then screened using correlation coefficient (r) analysis, Variable Importance in Projection (VIP) analysis, and Out-Of-Bag (OOB) data importance analysis. The screened variables were combined with Random Forest (RF), Support Vector Machine (SVM), and Partial Least Squares (PLS) regression to construct nine rice yield estimation models: r-RF, r-SVM, r-PLS, VIP-RF, VIP-SVM, VIP-PLS, OOB-RF, OOB-SVM, and OOB-PLS. Several experiments were carried out for each model to find the input data that gave its best performance. The coefficient of determination, Root Mean Square Error (RMSE), and normalized Root Mean Square Error (nRMSE) were used to evaluate the models. The results showed that a given index screening method fit different regression models to different degrees: OOB was better suited to RF, VIP and r were better suited to PLS, and r was better suited to SVM. Among the three regression methods, the PLS and SVM models performed well and the RF model performed best. The OOB-RF model was the best overall, with a validation coefficient of determination of 0.742, an RMSE of 206 kg/hm², and an nRMSE of 3.10%. The effect of index screening therefore varied considerably with the regression method, and the OOB-RF model gave the best yield estimation in the study area. These findings can provide a theoretical reference for combining index screening and regression in rice yield estimation models.
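      The three evaluation metrics named above can be computed as in the following minimal sketch. The abstract does not give the formulas, so two assumptions are made here: the coefficient of determination is taken as 1 - SS_res / SS_tot, and nRMSE is taken as RMSE divided by the mean observed yield, expressed as a percentage. The observed and predicted yields are made-up numbers for illustration only.

```python
import math


def evaluate(observed, predicted):
    """Return (R^2, RMSE, nRMSE in percent) for paired observations and predictions.

    Assumed definitions (not confirmed by the abstract):
      R^2   = 1 - SS_res / SS_tot
      RMSE  = sqrt(SS_res / n)
      nRMSE = 100 * RMSE / mean(observed)
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    nrmse = 100.0 * rmse / mean_obs
    return r2, rmse, nrmse


# Hypothetical yields in kg/hm^2; these are not data from the study.
toy_observed = [6000.0, 6500.0, 7000.0, 7500.0]
toy_predicted = [6100.0, 6400.0, 7100.0, 7400.0]

r2, rmse, nrmse = evaluate(toy_observed, toy_predicted)
print(f"R2={r2:.3f}  RMSE={rmse:.1f} kg/hm2  nRMSE={nrmse:.2f}%")
```

      Because nRMSE is scaled by the mean yield, it allows errors to be compared across regions or years with different yield levels, which RMSE alone does not.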

       
