Abstract
Landscape monitoring with high-resolution aerial images has become essential for the sustainable development of land use in modern agriculture. However, neither hyperspectral nor multispectral images alone can meet the demanding requirements of high-precision land use classification, owing to the technical limitations of remote sensing sensors and the complex features of landscape images. Combining hyperspectral and multispectral remote sensing images can be expected to overcome the problems that arise when land use is classified from a single type of image. In this paper, a dual-branch convolutional neural network (DBCNN) was designed to integrate hyperspectral and multispectral remote sensing images for land use classification. For hyperspectral images, a 3D-1D convolutional neural network branch was used to automatically extract spatial-spectral features, whereas a 3D convolutional neural network branch was used for multispectral images. A fusion layer was designed to fuse the features extracted from the hyperspectral and multispectral images. The land use category was finally output via the fully connected layer of the network. A dropout layer was added to the fully connected layer to avoid overfitting. Image data sets were taken from two research areas. The first was the rural-urban border area at Chikusei, Ibaraki, Japan, which consists mainly of farmland. The hyperspectral and multispectral images were both generated from the Chikusei data set. The multispectral data consisted of three wavelength bands, chosen by center wavelength (the 60th, 40th, and 20th bands of the Chikusei data set), with a spatial resolution of 2.5 m. The second research area was Pavia University, Italy, where the 53rd, 31st, and 7th wavelength bands were selected, and the spatial resolution of the multispectral data was 1.3 m.
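The band selection described above can be sketched as follows. This is a minimal illustration, assuming the hyperspectral cube is stored as nested lists indexed [row][column][band] and that the band numbers quoted in the text are 1-based. The function name `select_bands` and the toy cube are hypothetical and are not taken from the original work.

```python
from typing import List, Sequence

Cube = List[List[List[float]]]  # cube[row][col][band]


def select_bands(cube: Cube, bands_1based: Sequence[int]) -> Cube:
    """Return a new cube that keeps only the requested bands, in the given order.

    Assumption: band numbers are 1-based, as in "the 60th, 40th, and 20th bands".
    """
    idx = [b - 1 for b in bands_1based]  # convert to 0-based list indices
    return [[[pixel[i] for i in idx] for pixel in row] for row in cube]


# Toy 2 x 2 cube with 64 bands; each band value equals its 1-based band number.
toy_cube = [[[float(b) for b in range(1, 65)] for _ in range(2)] for _ in range(2)]

# Band triplet quoted for the Chikusei multispectral image.
toy_msi = select_bands(toy_cube, [60, 40, 20])
```

The same call with `[53, 31, 7]` would correspond to the triplet quoted for the Pavia University data.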
A Gaussian downsampling method was used to obtain the hyperspectral data. In the experiment, each data set was divided randomly into training, validation, and test samples in proportions of 25.5%, 4.5%, and 70%, respectively. Five comparative experiments were conducted, using decision tree (DT), support vector machine (SVM), 1D CNN, 2D CNN, and 3D CNN classifiers. A confusion matrix was obtained for each classification method, together with the overall accuracy, average accuracy, and kappa coefficient, by comparing the predicted classes with the land composition observed in the field. On the Chikusei data set, the overall accuracy of the 3D-1D CNN was 7.06, 5.72, and 2.90 percentage points higher than that of DT, SVM, and 1D CNN, respectively, the highest accuracy among these methods for hyperspectral image classification. Compared with DT, SVM, and 2D CNN, the overall accuracy of the 3D CNN was higher by 6.16, 0.38, and 7.40 percentage points, respectively, the highest among these methods for multispectral image classification. The experimental results show that the overall accuracy of fusion classification increased steadily from shallow to deep learning algorithms. Compared with DT, SVM, 1D CNN, 2D CNN, and 3D CNN, the overall accuracy of DBCNN was higher by 7.80, 5.05, 3.43, 2.60, and 0.90 percentage points, respectively. The deep learning CNN classifiers outperformed the shallow classifiers SVM and DT, especially in identifying road traffic land, grassland, and water bodies. The best results were achieved by the dual-branch network, which combines hyperspectral and multispectral imagery for land use classification. Using the spatial and spectral features of both hyperspectral and multispectral images improved the model's convergence and generalization ability, and thereby the classification accuracy.
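The random division into training, validation, and test samples can be illustrated with a short sketch. It assumes a simple random split over all labeled samples. The text does not say whether the split was stratified by class, so that detail, along with the function name `split_indices` and the fixed seed, is an assumption made only for illustration.

```python
import random
from typing import List, Tuple


def split_indices(
    n_samples: int,
    train_frac: float = 0.255,
    val_frac: float = 0.045,
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle sample indices and cut them into train/validation/test parts.

    The test set receives whatever remains (70% with the default fractions).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
```

With 1000 labeled samples this yields 255 training, 45 validation, and 700 test indices, with no index shared between the three sets.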
Water bodies, grassland, and habitation were classified very well, and the misclassification of transportation land decreased significantly; the overall accuracy was 99.39% and the kappa coefficient was 0.9920. On the Pavia University data set, 1D CNN, 2D CNN, and 3D CNN performed better than the shallow classification methods DT and SVM. With DBCNN, unused land and transportation land were clearly distinguished, and the misclassification of housing and construction land decreased significantly. DBCNN achieved the best classification among the methods compared. The classification accuracy for all categories exceeded 90% on the Pavia University data. Compared with DT, SVM, 1D CNN, 2D CNN, and 3D CNN, the overall accuracy of DBCNN increased by 14.84, 6.75, 4.55, 5.34, and 3.06 percentage points, respectively.
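The overall accuracy, average accuracy, and kappa coefficient reported above are all derived from a confusion matrix. The sketch below uses the standard textbook definitions (overall accuracy as the fraction of correctly classified samples, average accuracy as the mean of per-class recall, and Cohen's kappa). It assumes rows are true classes and columns are predicted classes. The function name and the toy matrix are hypothetical, and the exact computation used in the study is not given in the text.

```python
from typing import List, Tuple


def classification_metrics(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (overall accuracy, average accuracy, kappa) for a square matrix.

    Assumption: cm[i][j] counts samples of true class i predicted as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(row) for row in cm]                             # true counts
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts

    oa = sum(cm[i][i] for i in range(k)) / n

    # Per-class recall, skipping classes with no true samples.
    recalls = [cm[i][i] / row_tot[i] for i in range(k) if row_tot[i] > 0]
    aa = sum(recalls) / len(recalls)

    # Agreement expected by chance, then Cohen's kappa.
    pe = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)
    kappa = (oa - pe) / (1 - pe)  # undefined when pe == 1 (only one class present)
    return oa, aa, kappa


# Toy two-class example, not data from the study.
toy_cm = [[50, 0],
          [10, 40]]
oa, aa, kappa = classification_metrics(toy_cm)
```

For the toy matrix, 90 of 100 samples lie on the diagonal and chance agreement is 0.5, so the overall accuracy is 0.9, the average accuracy is 0.9, and kappa is 0.8.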