Classification of land cover in high-resolution remote sensing images based on Space-LDA model

李杨; 邵华; 江南; 施歌; 丁远

doi:10.11975/j.issn.1002-6819.2018.08.023

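Abstract: High-spatial-resolution remote sensing imagery allows land cover types to be analyzed more precisely, but the increase in spatial resolution also poses new challenges for existing classification methods. The latent Dirichlet allocation (LDA) model can link the low-level features of remote sensing images to high-level semantics. However, current applications of LDA in remote sensing image analysis focus mainly on scene classification and image retrieval, and the existing studies on land cover classification do not mine the spatial relationships in high-resolution remote sensing data. Building on the standard LDA model, this paper runs its experiments on image documents and word objects generated by bag-of-words model construction, uses multi-scale segmentation to mine the spatial relationships among objects in the image, and designs two forms of regional attributes for image documents, "topic popularity" and "topic content". These attributes enter the LDA model as covariates that act as prior knowledge, yielding a spatial LDA model (Space-LDA). A QuickBird image of Yixing, Wuxi is used to validate the model for high-resolution remote sensing image classification. The results show that Space-LDA not only clearly outperforms standard LDA but is also fairly robust to changes in region scale. Spatial region information supplies inference information at both the topic popularity and topic content levels, which gives the model a more flexible structure.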
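The document and word construction described in the abstract can be illustrated with a minimal sketch. It assumes, as an illustration only, that coarse-scale segmentation regions act as image documents and that each fine-scale segment object has already been assigned a visual word ID. The function name `build_document_word_counts` and the toy input are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def build_document_word_counts(segments):
    """Count visual words per image document.

    segments: iterable of (region_id, visual_word_id) pairs, one pair per
    fine-scale segment object (assumed layout, not the paper's data format).
    Returns {region_id: {visual_word_id: count}}.
    """
    docs = defaultdict(Counter)
    for region_id, word_id in segments:
        docs[region_id][word_id] += 1
    # Convert to plain dicts so the result is easy to inspect or serialize.
    return {region: dict(counts) for region, counts in docs.items()}


# Hypothetical toy input: two coarse regions, five fine-scale objects.
segments = [(0, 12), (0, 12), (0, 7), (1, 7), (1, 3)]
doc_word_counts = build_document_word_counts(segments)
print(doc_word_counts)
```

The resulting count table is the kind of input a topic model such as LDA consumes. Region-level attributes would be attached to each `region_id` separately as covariates.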

Abstract: Automatic classification of high-resolution remote sensing images has been a hot topic in remote sensing image analysis and related areas. Object-oriented image analysis uses segmentation objects as the basic unit of analysis, avoiding the "salt and pepper" effect of traditional classification methods. However, its major limitation is the uncertainty in segmenting semantically meaningful objects to serve as that basic unit. Traditional classifiers, which discriminate objects in the low-level feature space, cannot adapt to the complexity of ground features with different imaging mechanisms, leading to large within-class differences and imbalance between classes. Probabilistic topic models have been very successful in natural language processing and rest on a solid mathematical foundation, and their inherent characteristics match the demands of remote sensing information extraction well. In this paper, the classic probabilistic topic model, latent Dirichlet allocation (LDA), is used as the main model and introduced into high-resolution remote sensing image classification. The bag-of-words model can produce a valid representation of remote sensing images, but studies of this model are still limited to low-level features, that is, the word vector space. The LDA model proposed in this paper is intended to support further exploitation of remote sensing data. This paper establishes a spatial model for the classification of high-resolution remote sensing images, built on the traditional LDA model. The bag-of-words representation of remote sensing images is the basis of the probabilistic topic model.
Considering the requirements of pattern classification, we created the "Document-Words" mapping of high-resolution remote sensing images using a multi-scale segmentation algorithm and studied the key techniques for building the bag-of-words model. The visual dictionary was created with a fast clustering algorithm based on density peaks, which removes the dependence on initial cluster centers and identifies noise. After building the bag-of-words model, we introduced the popularity and content of spatial topics as prior information for the latent topics of the data and for the distribution of words within each topic. A variational expectation maximization (EM) inference algorithm was derived for this model, and experiments were run to verify the advantage of the improved model. QuickBird images of Wuxi, with a spatial resolution of 0.6 m and 4 bands, were used in the experiment. The LDA and Space-LDA models were compared in the classification of land use types. Space-LDA had higher classification accuracy than traditional LDA and reached its highest accuracy when the visual dictionary size was 480. Finally, with the number of topics fixed at 40, both the overall classification accuracy and the Kappa coefficient showed that Space-LDA achieved better results than LDA. The spatial region information provides inference information through both topic popularity and topic content at the same time, so the model has a more flexible structure.
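The two evaluation measures named above, overall classification accuracy and the Kappa coefficient, can both be computed from a confusion matrix. The sketch below uses the standard definitions. The function name and the 2x2 matrix are hypothetical illustrations and are not results from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i that were
    assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix (not data from the paper).
confusion = [[45, 5], [10, 40]]
oa, kappa = overall_accuracy_and_kappa(confusion)
print(f"overall accuracy = {oa:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that would occur by chance, so it is a stricter measure than overall accuracy when class sizes are unbalanced.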

Classification of land cover in high-resolution remote sensing images based on Space-LDA model