Abstract:
With the continuing development and wide application of multivariate statistical analysis, a growing number of spectral pre-processing and modeling methods are being used to analyze spectral data and build high-precision hyperspectral prediction models. This study used soil samples from the National Long-term (more than 20 years) Monitoring Station of Fluvo-aquic Soil Fertility and Fertilizer Effects. The soil is representative of the Huang-Huai-Hai fluvo-aquic soil type and its fertilization patterns. A total of 83 soil samples were collected from the 0-20 cm depth in treatments with different fertilization ratios. After the soils were air-dried and sieved (0.18 mm), reflectance from 350 nm to 2 500 nm was measured in the laboratory with a FieldSpec 3 Hi spectroradiometer (Analytical Spectral Devices Inc.). Twenty-five pre-processing methods, comprising 15 single methods (standard normal variate transformation, normalization, multiplicative scatter correction, and derivative methods with different smoothing points and operational parameters) and 10 methods that add a further operation on the spectral data, were compared across three multivariate techniques (stepwise multiple linear regression, SMLR; partial least-squares regression, PLSR; support vector machine regression, SVMR) to identify the best combination for predicting the organic matter content of fluvo-aquic soil. The coefficient of determination, the root mean square error (RMSEv), and the relative prediction deviation (RPD) of the validation set were used to evaluate the models. The results showed that PLSR was the best multivariate technique; combined with a variety of pre-processing methods, it produced models of high accuracy and reliability. Averaged over its 25 prediction models, the coefficient of determination, RMSEv, and RPD were 0.913, 1.264 g/kg, and 3.299, respectively. The optimal pre-processing method varied with the multivariate technique used.
Compared with the single pre-processing methods, the methods with an added operation prepared the data better for all three multivariate techniques: the average coefficient of determination was higher by 0.049, 0.033, and 0.071, and the average RPD was higher by 0.530, 0.307, and 1.144, respectively, while the average RMSEv was lower by 0.318, 0.204, and 0.528 g/kg, respectively. The best pre-processing method overall was multiplicative scatter correction followed by a Savitzky-Golay first derivative with a search window of 5 measurements (MSC-SGF5-2), which performed best across the three multivariate techniques, with an average coefficient of determination of 0.934, RMSEv of 1.17 g/kg, and RPD of 3.59. This method can probably serve as a general spectral data preparation step for models that predict the organic matter content of fluvo-aquic soil. Among the tested models, the best prediction model for fluvo-aquic soil organic matter was PLSR combined with normalization by the maximum value (coefficient of determination = 0.948, RMSEv = 0.972 g/kg, RPD = 4.276); it was accurate, reliable, and easy to operate.
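As an illustration of three of the single pre-processing steps and the validation statistics named above, the following is a minimal Python sketch, not the implementation used in the study. All function names are hypothetical. It assumes that the coefficient of determination is computed as 1 - SSE/SST, that RMSEv divides the squared error by the number of validation samples, and that RPD is the sample standard deviation of the observed validation values divided by RMSEv, which is a common definition; the study may have used different conventions. The Savitzky-Golay derivative and the regression models themselves are not shown.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def normalize_by_max(spectrum):
    """Normalization by the maximum value of one spectrum."""
    peak = max(spectrum)
    return [x / peak for x in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum (assumed reference).

    Each spectrum is regressed on the reference (x = intercept + slope * ref)
    and then corrected as (x - intercept) / slope.
    """
    n_bands = len(spectra[0])
    ref = [mean(s[j] for s in spectra) for j in range(n_bands)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


def validation_metrics(observed, predicted):
    """Coefficient of determination, RMSEv and RPD for a validation set."""
    n = len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    obs_mean = mean(observed)
    sst = sum((o - obs_mean) ** 2 for o in observed)
    rmse = math.sqrt(sse / n)
    return {
        "r2": 1 - sse / sst,              # assumed definition: 1 - SSE/SST
        "rmse": rmse,                     # same unit as the observations (g/kg here)
        "rpd": stdev(observed) / rmse,    # assumed definition: SD / RMSEv
    }
```

Under these assumed definitions, the reported best model (RMSEv = 0.972 g/kg, RPD = 4.276) would imply a validation-set standard deviation of roughly 4.16 g/kg, since RPD multiplied by RMSEv gives the SD.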