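Prediction of the spatial distribution of soil organic carbon based on environmental similarity under sparse sampling conditions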

郭澎涛; 肖秀绒; 赵菊; 李茂芬; 李波; 傅奠基

doi:10.11975/j.issn.1002-6819.202401133

Predicting the spatial distribution of soil organic carbon using environmental similarity with limited samples

Abstract

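Abstract: Existing models for predicting the spatial distribution of soil organic carbon (SOC) are difficult to apply when soil samples are sparse. To address this problem, this study proposes an environmental similarity model (ESM) for predicting the spatial distribution of SOC, based on the assumption that the more similar the soil-forming environments are, the more similar the soil properties are. First, the key environmental variables that influence the spatial distribution of SOC are used to characterize the soil-forming environment of the study area; next, the environmental similarity between the sampling points and the points to be estimated is compared; finally, the SOC content at the points to be estimated is predicted from the environmental similarity. To verify the effectiveness of ESM, Yunnan Province was taken as the case study area and three scenarios were set up: 1) 10 points were randomly drawn from the 64 sampling points as the training set, with the remaining points used as the validation set, and the random draw was repeated 20 times; 2) 20 points were randomly drawn from the 64 sampling points as the training set, with the remaining points used as the validation set, and the random draw was repeated 20 times; 3) 30 points were randomly drawn from the 64 sampling points as the training set, with the remaining points used as the validation set, and the random draw was repeated 20 times. Mean absolute error (MAE) and root mean square error (RMSE) were used to evaluate prediction accuracy. Analysis of variance showed that, under the three scenarios with 10, 20, and 30 sampling points, the MAE of ESM (12.7, 11.7, and 11.1 g/kg) was significantly (P < 0.05) lower than that of multiple linear regression (72.6, 23.0, and 16.7 g/kg) and artificial neural networks (15.8, 14.9, and 15.8 g/kg). This indicates that ESM has high prediction accuracy and strong robustness, and that it can provide a reference and guidance for predicting the spatial distribution of SOC in regions with complex soil-forming factors.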
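The three ESM steps described above (characterize the environment, compare sampled and unsampled locations, predict from similarity) can be illustrated with a minimal Python sketch. The abstract does not state how similarity is computed or how it is turned into a prediction, so everything specific here is an assumption made for illustration: the Gaussian per-variable similarity, the minimum rule for combining variables, the similarity-weighted average, and the names `esm_predict` and `bandwidth`. It should not be read as the authors' implementation.

```python
import numpy as np

def esm_predict(env_samples, soc_samples, env_targets, bandwidth=1.0):
    """Similarity-weighted SOC prediction (hypothetical sketch, not the paper's code).

    env_samples: (n_samples, n_vars) environmental variables at sampled sites
    soc_samples: (n_samples,) measured SOC at sampled sites, e.g. in g/kg
    env_targets: (n_targets, n_vars) environmental variables at unvisited sites
    bandwidth:   assumed kernel width, in standardized units
    """
    env_samples = np.asarray(env_samples, dtype=float)
    env_targets = np.asarray(env_targets, dtype=float)
    soc_samples = np.asarray(soc_samples, dtype=float)

    # Step 1: put every environmental variable on a common scale,
    # using statistics of the sampled sites only.
    mean = env_samples.mean(axis=0)
    std = env_samples.std(axis=0)
    std[std == 0] = 1.0  # guard against constant variables
    zs = (env_samples - mean) / std
    zt = (env_targets - mean) / std

    # Step 2: per-variable similarity in (0, 1] between each target and each
    # sample (assumed Gaussian kernel); shape (n_targets, n_samples, n_vars).
    diff = zt[:, None, :] - zs[None, :, :]
    per_var = np.exp(-0.5 * (diff / bandwidth) ** 2)
    # Assumed limiting-factor rule: overall similarity is the minimum
    # similarity across variables.
    sim = per_var.min(axis=2)

    # Step 3: predict SOC as the similarity-weighted average of sampled SOC.
    total = np.maximum(sim.sum(axis=1, keepdims=True), 1e-12)
    return (sim / total) @ soc_samples
```

Because the prediction is a weighted average of observed values, it always stays within the range of the sampled SOC, which is one plausible reason a similarity-based approach might behave more stably than a regression fitted to only 10 points.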

Abstract: Soil organic carbon (SOC) is one of the most important indicators of soil quality and agricultural sustainability. Commonly used models for predicting the spatial distribution of SOC generally require a large number of soil samples as a training dataset. However, soil sampling is time-consuming, laborious, and costly. When only a limited number of soil samples is available, existing models such as multiple linear regression (MLR) and artificial neural networks (ANN) cannot reliably capture the relationship between environmental variables and SOC, which leads to unsatisfactory predictions. In this study, an environmental similarity model (ESM) was proposed, based on the assumption that the more similar the soil-forming environments are, the more similar the soil properties are. ESM consists of three major steps: (1) characterize the soil-forming environment at the sampling sites and the unvisited sites using the key environmental variables that affect the spatial distribution of SOC; (2) assess the environmental similarity between the sampling sites and the unvisited sites; and (3) estimate the SOC content at the unvisited sites according to the environmental similarity. A case study was carried out to verify the feasibility of the proposed model. Taking Yunnan Province as the study area, a total of 64 soil samples were obtained from the World Soil Information Service (WoSIS) database. Three scenarios were then set up using these samples. (1) In the first scenario, 10 soil samples were randomly selected from the 64 samples as the training set, and the remaining 54 samples were used as the test set. This selection was repeated 20 times, giving 20 groups of 10 training and 54 test samples. (2) In the second scenario, 20 samples were randomly chosen from the 64 samples as the training set, and the remaining 44 samples were used as the test set, giving 20 groups of 20 training and 44 test samples. (3) In the third scenario, 30 samples were randomly chosen from the 64 samples as the training set, and the remaining 34 samples were used as the test set, giving 20 groups of 30 training and 34 test samples. Two indices, mean absolute error (MAE) and root mean square error (RMSE), were used to measure the prediction accuracy of the three models (ESM, MLR, and ANN), and analysis of variance (ANOVA) was applied to compare accuracy among them. The results showed that the MAE of ESM was 12.7, 11.7, and 11.1 g/kg for the first, second, and third scenarios, respectively; all were significantly lower (P < 0.05) than those of MLR (72.6 g/kg, sample size n = 10; 23.0 g/kg, n = 20; 16.7 g/kg, n = 30) and ANN (15.8 g/kg, n = 10; 14.9 g/kg, n = 20; 15.8 g/kg, n = 30). ESM therefore achieved high prediction accuracy and strong robustness. The findings can provide a new way to predict the spatial distribution of SOC.
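The evaluation protocol described above (repeated random training/test splits of the 64 samples, scored with MAE and RMSE) can be sketched roughly as follows. The helper names `repeated_random_splits` and `evaluate`, the `seed` parameter, and the `predict(X_train, y_train, X_test)` callback signature are hypothetical choices for illustration; the abstract does not specify how the random draws were generated.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def repeated_random_splits(n_total, n_train, n_repeats=20, seed=0):
    """Yield (train_idx, test_idx) pairs; each draw is an independent shuffle."""
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        perm = rng.permutation(n_total)
        yield perm[:n_train], perm[n_train:]

def evaluate(predict, X, y, n_train, n_repeats=20, seed=0):
    """Return an (n_repeats, 2) array of [MAE, RMSE] per split.

    predict: any callable (X_train, y_train, X_test) -> predictions,
             standing in for a model such as ESM, MLR, or ANN.
    """
    X, y = np.asarray(X, float), np.asarray(y, float)
    scores = []
    for train_idx, test_idx in repeated_random_splits(len(y), n_train, n_repeats, seed):
        y_hat = predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append((mae(y[test_idx], y_hat), rmse(y[test_idx], y_hat)))
    return np.array(scores)
```

With 64 samples, calling `evaluate(predict, X, y, n_train=10)` would correspond to the first scenario (10 training and 54 test samples, 20 repetitions); `n_train=20` and `n_train=30` give the second and third. The per-split scores of different models could then be compared with a one-way ANOVA, as the study did.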
