Abstract:
Field sample points are fed directly into remote sensing crop classification models, so their quantity and quality largely determine both classification accuracy and mapping results. In this study, a data-driven approach to sampling strategy was established using spectral-band and vegetation-index features from image classification. Several stratified random sampling schemes for field sample points were combined and then assessed with multiple evaluation metrics to determine how crop remote sensing classification depends on the sampling design. K-means unsupervised clustering was used to generate a clustering map with the optimal K from 78 classification features extracted from 6-phase Sentinel-2 images. The comparison experiments covered two intra-stratum allocation strategies (equal and area-ratio sample allocation), five total sample sizes (25, 49, 100, 169 and 225), one theoretical total sample size (139) and one traditional method with a total sample size of 400. Mapping accuracy was evaluated with a Support Vector Machine (SVM) classification model. The experimental results showed the following. (1) Sampling on the data-driven basemap generated by unsupervised clustering (area-ratio and equal stratified sampling) yielded a higher-quality sample dataset and significantly higher classification accuracy than sampling without the basemap (simple random and systematic sampling). (2) When the total sample size was smaller than the theoretical total sample size, equal stratified sampling performed better than area-ratio stratified sampling.
For example, with a theoretical sample size of 139, the mean classification accuracies of equal stratified sampling at total sample sizes of 25, 49 and 100 (75.5%, 80.5% and 86.0%) were significantly higher than those of area-ratio stratified sampling (44.0%, 69.0% and 83.0%), whereas the mean accuracies of both stratified methods at total sample sizes of 169 and 225 were all around 90.0%. (3) The actual total sample size that stratified sampling needed to meet the overall accuracy requirement was smaller than the theoretical sample size, indicating a great improvement in sampling efficiency. For example, equal stratified sampling required only about one-seventh of the theoretical sample size to satisfy an overall accuracy requirement of 85.0%. The classification accuracy equaled that of manual selection (overall accuracy = 97.5%), while the actual sample size of equal stratified sampling was about one-ninth of the traditional one. Overall, classification accuracy and stability increased with the total sample size and then saturated, even as the sample size continued to increase. The approach can be expected to yield a sample set that is well balanced between classes and diverse within classes, supporting an optimal field sample distribution for crop remote sensing classification.
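
The two allocation strategies compared above can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names `allocate` and `stratified_sample`, the use of pixel counts as a proxy for stratum area, and the largest-remainder rounding rule are all assumptions made for illustration. The K-means clustering step that would produce the stratum labels is not shown.

```python
import random
from collections import defaultdict


def allocate(total, stratum_sizes, strategy="equal"):
    """Split `total` samples across strata (hypothetical helper).

    stratum_sizes maps a stratum (cluster) id to its pixel count,
    used here as an assumed proxy for stratum area.
    """
    strata = sorted(stratum_sizes)
    if strategy == "equal":
        weights = {s: 1 for s in strata}
    elif strategy == "area_ratio":
        weights = {s: stratum_sizes[s] for s in strata}
    else:
        raise ValueError("strategy must be 'equal' or 'area_ratio'")

    weight_sum = sum(weights.values())
    # Integer arithmetic avoids floating-point rounding surprises.
    counts = {s: total * weights[s] // weight_sum for s in strata}
    remainders = {s: total * weights[s] % weight_sum for s in strata}

    # Largest-remainder rounding (an assumption) so counts sum to `total`.
    leftover = total - sum(counts.values())
    ranked = sorted(strata, key=lambda s: remainders[s], reverse=True)
    for s in ranked[:leftover]:
        counts[s] += 1
    return counts


def stratified_sample(labels, counts, seed=0):
    """Draw random pixel indices per stratum without replacement."""
    rng = random.Random(seed)
    members = defaultdict(list)
    for index, label in enumerate(labels):
        members[label].append(index)
    picked = {}
    for s, n in counts.items():
        pool = members.get(s, [])
        # A stratum smaller than its quota contributes all its pixels.
        picked[s] = rng.sample(pool, min(n, len(pool)))
    return picked


# Hypothetical basemap with three strata of very different areas.
sizes = {0: 700, 1: 200, 2: 100}
equal_counts = allocate(25, sizes, "equal")
area_counts = allocate(25, sizes, "area_ratio")
```

With these made-up stratum sizes and a total of 25 samples, area-ratio allocation leaves the smallest stratum with only two samples, while equal allocation gives every stratum eight or nine. This only illustrates the mechanics of the two strategies; it does not reproduce the study's strata or results.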