Random forest prediction model for the soil organic matter with optimized spectral inputs
Abstract
Soil organic matter (SOM) is one of the most important components of the soil carbon pool. The carbon-containing organic matter in soil mainly comprises animal and plant residues, microorganisms, and various organic substances that are decomposed or synthesized in agriculture. SOM content is one of the key indicators of soil fertility, so its accurate measurement is of great significance for soil fertility evaluation, environmental protection, and agricultural and forestry development. Accurate prediction of SOM content is therefore very important. Previous studies that predicted SOM with random forest (RF) usually used only one spectral input and did not consider the complementarity between different spectral inputs. In addition, reflectance spectra contain noise, so an appropriate noise-reduction method is needed to limit its influence. The discrete wavelet transform can reduce hyperspectral noise while preserving the information that is useful for SOM prediction. In this study, different spectral inputs were combined with the discrete wavelet transform, and an RF model was used to predict SOM from the optimized spectral inputs. Firstly, the original spectral reflectance of 204 soil samples from Baoqing County was decomposed using the discrete wavelet transform. Secondly, spectral characteristic parameters and principal components were extracted from the decomposed characteristic spectral curves, and spectral indices were constructed. Finally, the three spectral inputs were fed into the RF model to identify the optimal combination of spectral inputs for SOM prediction. The variation trend of each spectral input across wavelet decomposition scales was also examined, in order to offer a new approach to selecting spectral inputs for hyperspectral SOM prediction.
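The workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the synthetic spectra, the number of principal components, the train/validation split, and the helper names `haar_approximation` and `fit_rf_on_scale` are all assumptions made for illustration. Only the principal-component input is shown; the spectral characteristic parameters and spectral indices are omitted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def haar_approximation(spectra, level):
    """Return the Haar approximation coefficients of each row at `level`."""
    approx = np.asarray(spectra, dtype=float)
    for _ in range(level):
        if approx.shape[1] % 2:  # pad odd lengths by repeating the last band
            approx = np.hstack([approx, approx[:, -1:]])
        # One Haar low-pass step: scaled sum of neighbouring pairs.
        approx = (approx[:, 0::2] + approx[:, 1::2]) / np.sqrt(2.0)
    return approx


def fit_rf_on_scale(spectra, som, level, n_components=5, seed=0):
    """Smooth with the DWT, extract principal components, fit an RF,
    and return the validation R^2 (hypothetical helper)."""
    smooth = haar_approximation(spectra, level)
    X_train, X_val, y_train, y_val = train_test_split(
        smooth, som, test_size=0.3, random_state=seed
    )
    pca = PCA(n_components=n_components).fit(X_train)
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(pca.transform(X_train), y_train)
    return model.score(pca.transform(X_val), y_val)


# Synthetic stand-in data (assumption): 204 samples, 512 bands, with
# reflectance decreasing as SOM increases, plus Gaussian noise.
rng = np.random.default_rng(0)
n_samples, n_bands = 204, 512
position = np.linspace(0.0, 1.0, n_bands)
som = rng.uniform(1.0, 8.0, size=n_samples)
spectra = (
    0.5
    - 0.03 * som[:, None] * position[None, :]
    + rng.normal(0.0, 0.01, size=(n_samples, n_bands))
)

# Validation R^2 at decomposition scales 1 to 6.
scores = {level: fit_rf_on_scale(spectra, som, level) for level in range(1, 7)}
```

Comparing the entries of `scores` across levels mirrors, in a simplified way, the search for the decomposition scale with the highest validation accuracy. A library such as PyWavelets would be the more usual choice for the decomposition; the hand-written Haar step is used here only to keep the sketch short.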
The RF model predicted SOM well in this case. The optimal combination of spectral inputs and the optimal decomposition scale of the discrete wavelet transform were identified, that is, the combination with the highest accuracy among all inputs at all decomposition scales. The results show that: 1) SOM prediction with the different spectral inputs was more accurate than modeling directly on spectral reflectance. Among the single spectral inputs, the principal components gave the highest validation accuracy, and the combination of spectral characteristic parameters and principal components was more accurate than the principal components alone, indicating that combining different spectral inputs improved the prediction accuracy. However, simply stacking spectral inputs was not sufficient to improve the prediction accuracy. 2) The prediction accuracy of different spectral inputs followed different trends as the decomposition scale increased. The trend for a combination of inputs depended on which inputs it contained, reflecting the variation characteristics of the individual spectral inputs. 3) The highest validation accuracy was obtained with the combination of spectral characteristic parameters and principal components at a decomposition scale of 6, with R² reaching 0.78 and RMSE reaching 1.32%, indicating an excellent prediction ability. Overall, it is feasible to predict SOM using spectral inputs combined with discrete wavelet transform modeling. These findings provide a reliable approach and theoretical support for the dynamic monitoring of SOM under temporal and spatial changes.
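For reference, the two accuracy measures reported above can be computed as in the following sketch. It assumes the usual definitions, with R² as the coefficient of determination and RMSE as the root mean squared error in the same units as SOM content; the abstract does not state the exact formulas used, and the function name `r2_rmse` and the sample values are hypothetical.

```python
import numpy as np


def r2_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for measured and predicted SOM values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    ss_res = np.sum(residual ** 2)                   # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot, np.sqrt(np.mean(residual ** 2))


# Made-up SOM contents in percent, for illustration only.
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2, rmse = r2_rmse(measured, predicted)  # r2 = 0.95, rmse = 0.5
```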