A method for estimating potential evapotranspiration using a self-optimizing nearest neighbor algorithm
Abstract
The FAO-56 Penman-Monteith method is widely used to estimate potential evapotranspiration, but it requires many meteorological variables. This study proposes CCA-k-NN, a self-optimizing nearest neighbor method that combines canonical correlation analysis (CCA) with the k-nearest neighbor (k-NN) algorithm to estimate potential evapotranspiration from fewer meteorological data. Northwest China was chosen as the case study. In this region, arid, semi-arid and semi-humid climates coexist; mountains, Gobi, oases and deserts are interwoven; and the environment is ecologically fragile and highly sensitive to climate change. The meteorological data comprised daily mean wind speed, daily maximum temperature, daily minimum temperature, daily mean temperature, sunshine hours and daily mean relative humidity from 148 meteorological stations. The data were divided into training, verification and test datasets. On the spatial scale, 60% of the 148 stations (89 stations) were used as the training dataset, 30% (44 stations) as the verification dataset and the remaining 10% (15 stations) as the test dataset. On the time scale, the 1960-2018 record was split so that the first 60% of the period (1960-1994) formed the training dataset, the middle 30% (1995-2012) the verification dataset and the final 10% (2013-2018) the test dataset. For the training dataset, canonical correlation analysis identified maximum temperature and relative humidity as the meteorological elements most strongly correlated with potential evapotranspiration in Northwest China. These two variables were therefore used as the model inputs. The optimal k value was selected by iteration, and the results showed that k values between 15 and 32 were suitable for the weather stations in Northwest China.
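The iterative choice of k described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which dataset or error measure drives the selection, so the sketch assumes each candidate k is scored by root mean square error on held-out data. The function names (`knn_predict`, `select_k`), the input rescaling, the candidate range 1-40 and the synthetic data are all hypothetical and serve only to make the example runnable.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return sum(train_y[i] for i in order[:k]) / k


def rmse(observed, estimated):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def select_k(train_x, train_y, val_x, val_y, k_values):
    """Try each candidate k and keep the one with the lowest held-out RMSE."""
    best_k, best_err = None, math.inf
    for k in k_values:
        estimates = [knn_predict(train_x, train_y, q, k) for q in val_x]
        err = rmse(val_y, estimates)
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err


# Synthetic stand-in data (NOT the study's records): PET is made up as a noisy
# function of maximum temperature and relative humidity.
random.seed(0)


def synthetic_day():
    tmax = random.uniform(0.0, 35.0)  # deg C
    rh = random.uniform(20.0, 90.0)   # percent
    pet = max(0.0, 3.0 + 0.2 * tmax - 0.03 * rh + random.gauss(0.0, 0.3))
    # Assumption: rescale both inputs to roughly [0, 1] so that neither one
    # dominates the Euclidean distance.
    return (tmax / 35.0, rh / 100.0), pet


days = [synthetic_day() for _ in range(400)]
train, held_out = days[:280], days[280:]
train_x, train_y = [d[0] for d in train], [d[1] for d in train]
val_x, val_y = [d[0] for d in held_out], [d[1] for d in held_out]

best_k, best_err = select_k(train_x, train_y, val_x, val_y, range(1, 41))
print(f"selected k = {best_k}, held-out RMSE = {best_err:.3f} mm/d")
```

With real station data, the tuples in `train_x` would hold the observed maximum temperature and relative humidity, and `train_y` the FAO-56 Penman-Monteith values used as the reference.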
Maximum temperature and relative humidity from the verification and test datasets were then input to the k-nearest neighbor algorithm to estimate potential evapotranspiration. The models were evaluated using relative deviation, root mean square error, mean absolute error, the correlation coefficient and the Nash-Sutcliffe efficiency coefficient. The results showed that the CCA-k-NN estimates were highly correlated with those of the FAO-56 Penman-Monteith method (correlation coefficient greater than 0.9) and had good accuracy, with both the root mean square error and the mean absolute error below 1 mm/d. The Nash-Sutcliffe efficiency coefficient was greater than 0.5 on the spatial scale and greater than 0.8 on the time scale, indicating that the method is applicable at both scales. The algorithm also had lower time complexity than alternative methods, which can effectively reduce the time cost of processing large amounts of data.
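The five evaluation measures named above can be computed as in the following sketch, which uses their common textbook definitions. The abstract does not define relative deviation, so the form used here (total bias divided by the observed total) is an assumption, as are the function name `evaluate` and the four sample values, which are made up for illustration.

```python
import math


def evaluate(observed, estimated):
    """Return common agreement measures between reference and estimated PET.

    Both sequences must have the same length and must not be constant,
    otherwise the correlation and efficiency terms divide by zero.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    errors = [e - o for o, e in zip(observed, estimated)]
    sse = sum(err ** 2 for err in errors)
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_e = sum((e - mean_e) ** 2 for e in estimated)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    return {
        # Assumed definition: total bias relative to the observed total.
        "relative_deviation": sum(errors) / sum(observed),
        "rmse": math.sqrt(sse / n),
        "mae": sum(abs(err) for err in errors) / n,
        # Pearson correlation coefficient.
        "r": cov / math.sqrt(ss_o * ss_e),
        # Nash-Sutcliffe efficiency: 1 is a perfect match, 0 is no better
        # than always predicting the observed mean.
        "nse": 1.0 - sse / ss_o,
    }


# Made-up values in mm/d, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
estimate = [1.1, 1.9, 3.2, 3.8]
scores = evaluate(reference, estimate)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Here `reference` stands for the FAO-56 Penman-Monteith values and `estimate` for the k-NN output at the same station and dates.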