Inverting soil salinity of farmland in Xinjiang by integrating Sentinel-1/2 and environmental variables

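    • Abstract: Soil salinization is a major threat to agricultural production and the ecological environment, and rapid, accurate acquisition of farmland soil salinity information can guide sustainable agricultural development and land resource management. To improve the accuracy of satellite remote sensing in predicting soil salt content under vegetation cover, this study took the Eighth Agricultural Division of the Xinjiang Production and Construction Corps as the study area, collected surface soil samples (0-20 cm) in July and August 2023, and acquired synchronous satellite imagery. Different combinations of Sentinel-1 radar information, Sentinel-2 multispectral information, and environmental variables were used to build dataset A (polarization indices, spectral indices), dataset B (polarization indices, environmental variables), dataset C (spectral indices, environmental variables), and dataset D (polarization indices, spectral indices, environmental variables). Three ensemble learning algorithms, adaptive boosting (AdaBoost), gradient boost regression tree (GBRT), and extreme gradient boosting tree (XGBoost), were used to build soil salt content inversion models on each dataset. The results show that: 1) models built on dataset D generally achieved higher prediction accuracy than those built on datasets A, B, and C, and the combined use of environmental variables with radar and multispectral data effectively improves model accuracy; 2) topographic factors and land surface temperature can serve as effective feature variables for predicting soil salinity in the study area, with elevation showing the highest correlation with surface soil salt content; 3) across all datasets, XGBoost performed best, followed by GBRT, while AdaBoost had larger validation errors; the D-XGBoost model was the most accurate, with a validation-set coefficient of determination of 0.72, a root mean square error of 2.40 g/kg, and a mean absolute error of 1.29 g/kg; 4) ensemble learning algorithms based on combinations of multi-source variables have strong nonlinear fitting ability, and XGBoost better captures the complex nonlinear relationships between soil salt content and remote sensing information and environmental factors, yielding satisfactory fitting results. These results can provide an effective technical means for real-time dynamic monitoring of soil salt content in Xinjiang and for the sustainable use of local land resources.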
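The experimental design described above (four datasets crossed with three algorithms) can be sketched as a small lookup structure. This is only an illustrative sketch: the group membership of datasets A to D follows the abstract, but the individual variable names in `FEATURE_GROUPS` and the helper names `columns_for` and `experiment_grid` are hypothetical placeholders, not the study's actual feature list or code.

```python
from itertools import product

# Feature groups named in the abstract. The variable names inside each
# group are hypothetical placeholders, not the study's real features.
FEATURE_GROUPS = {
    "polarization": ["vv", "vh", "vv_minus_vh"],
    "spectral": ["ndvi", "salinity_index"],
    "environment": ["elevation", "slope", "land_surface_temperature"],
}

# Group membership of datasets A-D as described in the abstract.
DATASETS = {
    "A": ("polarization", "spectral"),
    "B": ("polarization", "environment"),
    "C": ("spectral", "environment"),
    "D": ("polarization", "spectral", "environment"),
}

# The three ensemble learners compared in the study.
MODELS = ("AdaBoost", "GBRT", "XGBoost")


def columns_for(dataset):
    """Return the flat list of feature columns used by one dataset."""
    return [name for group in DATASETS[dataset] for name in FEATURE_GROUPS[group]]


def experiment_grid():
    """Enumerate every dataset-model pair, labeled like 'D-XGBoost'."""
    return [f"{d}-{m}" for d, m in product(DATASETS, MODELS)]


if __name__ == "__main__":
    for name in DATASETS:
        print(name, columns_for(name))
    print(len(experiment_grid()), "dataset-model combinations")
```

Keeping the group definitions separate from the dataset definitions means a new variable only has to be added in one place to appear in every dataset that uses its group.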

       

      Abstract: Soil salinization is a major factor that jeopardizes agricultural production and the ecological environment. Rapid and accurate acquisition of farmland soil salinity information can guide sustainable agricultural development and land resource management. To improve the accuracy of soil salinity prediction by satellite remote sensing under vegetation cover, this study took the Eighth Agricultural Division of the Xinjiang Production and Construction Corps as the study area. Surface soil samples (0-20 cm) were collected under high fractional vegetation cover in July and August 2023, and synchronous satellite images were acquired. Sentinel-1, Sentinel-2, and environmental variables provided three types of explanatory variables. Dataset A (polarization indices, spectral indices), dataset B (polarization indices, environmental variables), dataset C (spectral indices, environmental variables), and dataset D (polarization indices, spectral indices, environmental variables) were constructed from different combinations of Sentinel-1 radar information, Sentinel-2 multispectral information, and environmental variables. Three ensemble learning algorithms, namely adaptive boosting (AdaBoost), gradient boost regression tree (GBRT), and extreme gradient boosting tree (XGBoost), were then applied to construct soil salinity inversion models on each dataset. The results showed that models constructed from datasets B and C achieved higher prediction accuracy than those constructed from dataset A, indicating that models involving environmental variables outperform models built from polarization and spectral indices alone. When environmental variables were combined with polarization and spectral indices in dataset D, the prediction accuracy of the resulting models was generally higher than that of the models built on datasets A, B, and C, showing that the synergy of environmental variables with radar and multispectral data can effectively improve model accuracy. Radar information, spectral information, and environmental variables are complementary in soil salinity prediction. The correlation analysis showed that radar information, spectral information, and environmental variables can all serve as effective feature variables for soil salinity prediction in the study area. Notably, topographic factors and land surface temperature were relatively strongly correlated with soil salinity, with the highest correlation found between elevation and surface soil salinity (r = 0.52). Considering the spatial characteristics of the soil salinity distribution in the study area can therefore provide effective feature variables for soil salinity prediction under vegetation cover. Across all datasets, XGBoost performed best, followed by GBRT, while AdaBoost had a large validation error. The D-XGBoost model had the highest accuracy, with a validation-set R² of 0.72, an RMSE of 2.40 g/kg, and an MAE of 1.29 g/kg. Ensemble learning algorithms based on combinations of multi-source variables have a strong nonlinear fitting ability. XGBoost can better model the complex nonlinear relationship between soil salinity and remote sensing information and environmental factors, and it obtains satisfactory fitting results. The joint application of multi-source remote sensing data and ensemble learning algorithms can achieve satisfactory soil salinity inversion accuracy under vegetation cover. This study provides an effective technical means for real-time dynamic monitoring of farmland soil salinity by satellite remote sensing, supporting optimized irrigation strategies and comprehensive management of saline soils in Xinjiang.
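The accuracy and correlation measures reported in the abstract (R², RMSE, MAE, and the correlation coefficient r) can be computed as in the minimal sketch below. It assumes the common definitions: R² as one minus the ratio of residual to total sum of squares, and r as the Pearson coefficient. The abstract does not state which definitions the study used, and the sample values are invented purely for illustration.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the input (e.g. g/kg)."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mae(observed, predicted):
    """Mean absolute error, in the units of the input (e.g. g/kg)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented salt contents in g/kg, not data from the study.
    observed = [2.0, 4.0, 6.0, 8.0]
    predicted = [3.0, 4.0, 5.0, 8.0]
    print("R2  :", round(r_squared(observed, predicted), 3))
    print("RMSE:", round(rmse(observed, predicted), 3))
    print("MAE :", round(mae(observed, predicted), 3))
    print("r   :", round(pearson_r(observed, predicted), 3))
```

RMSE penalizes large errors more heavily than MAE, so a gap between the two, as in the reported 2.40 g/kg versus 1.29 g/kg, is consistent with a few samples having comparatively large prediction errors.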

       
