Abstract:
In industrial and mining land reclamation areas, strong topographic relief and the diverse, fragmented, intermixed and scattered distribution of surface features make remote-sensing image classification and mapping difficult. To improve land use classification accuracy in such areas and to provide data support for land reclamation monitoring and supervision, this article explores a classification method for reclamation areas based on grid search and the random forest algorithm. The satellite and auxiliary datasets, comprising GF-1 images, a DEM (digital elevation model) and field investigation data, were acquired in October 2016. The study area was Gulin County, Luzhou City, Sichuan Province. To obtain true surface reflectance and reduce atmospheric and environmental effects in the satellite images, FLAASH atmospheric correction and geometric correction were applied during image pre-processing in ENVI 5.3. The random forest algorithm, a machine learning method, was chosen because it facilitates the use of ancillary data in classification. Feature selection is an important pre-processing step in many machine learning applications: it selects the smallest subset of relevant features from which a robust learning model can be built. In this paper, spectral, topographic, texture and spatial variables were included in feature selection. Among the spectral features, BCI (biophysical composition index) was calculated to differentiate built-up areas from farmland. Texture feature processing included principal component analysis. Local Moran's I, which reflects spatial autocorrelation, and Local Getis-Ord Gi, which reflects hotspots, were selected to further improve the classification result. A grid-search method based on OOB (out-of-bag) error was used to optimize the model parameters.
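The grid search on OOB error described above can be illustrated with a minimal sketch. It assumes scikit-learn's `RandomForestClassifier`, a synthetic dataset standing in for the 33 feature variables, and an arbitrary grid over the number of trees and the number of features tried per split; the abstract does not state which parameters or value ranges were actually searched, and the function name `grid_search_oob` is hypothetical.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def grid_search_oob(X, y, n_estimators_grid, max_features_grid, seed=0):
    """Fit one forest per grid point and return the setting with the lowest OOB error."""
    results = []
    for n_trees, m_try in product(n_estimators_grid, max_features_grid):
        rf = RandomForestClassifier(
            n_estimators=n_trees,
            max_features=m_try,
            bootstrap=True,   # OOB estimates require bootstrap sampling
            oob_score=True,   # score each sample using trees that did not see it
            random_state=seed,
        )
        rf.fit(X, y)
        oob_error = 1.0 - rf.oob_score_
        results.append(((n_trees, m_try), oob_error))
    best_params, best_error = min(results, key=lambda r: r[1])
    return best_params, best_error, results


# Synthetic stand-in data (assumption): 33 features, 4 classes.
X, y = make_classification(
    n_samples=600, n_features=33, n_informative=10, n_classes=4, random_state=0
)

# Illustrative grid (assumption), not the values used in the study.
best_params, best_error, results = grid_search_oob(X, y, [100, 200], [3, 6, 12])
print("best (n_estimators, max_features):", best_params)
print("OOB error:", round(best_error, 4))
```

Because the OOB error is computed from samples left out of each tree's bootstrap draw, this kind of search needs no separate validation split, which is useful when field samples are scarce.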
Based on image spectral, topographic, texture, spatial and other information, 33 feature variables were derived in the feature selection step, and 4 combined models were constructed for the random forest classification experiment; their classification accuracies were 82.79%, 84.91%, 86.75% and 88.16%, respectively. To eliminate redundant information among the 33 feature variables and reduce the dimensionality of the image bands, the study used variable importance estimation and the ReliefF algorithm to select the principal feature variables, which were then classified with the random forest algorithm. Comparison of the classification results of Model 2, Model 4, SVM (support vector machine) and MLC (maximum likelihood classification) indicates that the random forest algorithm with grid-search parameter optimization achieves a classification accuracy of 88.16% within the multi-feature variable framework. After the variables are reduced by the different methods, the classification accuracy remains above 85% and is higher than that of SVM and MLC with the same number of feature variables. The random forest classifier is superior to SVM and better able to handle multidimensional feature variables. The random forest method based on grid search can achieve high accuracy in land use classification of reclamation areas. Remote sensing image interpretation based on this method can provide technical support and a sound reference for land reclamation monitoring and supervision.
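As a rough illustration of reducing the feature set by variable importance, the sketch below ranks features with scikit-learn's impurity-based `feature_importances_` and refits a forest on the top-ranked subset. The importance measure, the subset size `k=10`, the synthetic data and the helper name `select_top_k_features` are all assumptions; the abstract does not specify them, and the ReliefF step is not shown here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def select_top_k_features(X, y, k, seed=0):
    """Rank features by random forest importance and return the indices of the top k."""
    rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=seed)
    rf.fit(X, y)
    order = np.argsort(rf.feature_importances_)[::-1]  # most important first
    return np.sort(order[:k]), rf.oob_score_


# Synthetic stand-in data (assumption): 33 features, 4 classes.
X, y = make_classification(
    n_samples=600, n_features=33, n_informative=10, n_classes=4, random_state=0
)

top_idx, full_oob = select_top_k_features(X, y, k=10)

# Refit on the reduced feature set and compare OOB accuracy with the full model.
reduced_rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
reduced_rf.fit(X[:, top_idx], y)
reduced_oob = reduced_rf.oob_score_

print("selected feature indices:", top_idx.tolist())
print("OOB accuracy, all 33 features:", round(full_oob, 4))
print("OOB accuracy, top 10 features:", round(reduced_oob, 4))
```

Comparing the two OOB accuracies gives a quick check of how much information is lost when the redundant variables are dropped.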