A cropland extraction method for complex scenes integrating progressive multi-feature fusion

    Multi-feature fusion for cropland extraction in complex scenes

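    • Summary: To address unclear cropland boundaries and inaccurate plot extraction in complex scenes, this study proposes a multi-feature progressive fusion network (multi feature progressive fusion unet, MPFUnet) for cropland segmentation in remote sensing imagery. The method uses Unet as the backbone for spatial feature extraction and builds a multi-feature attention module that combines spatial attention with edge feature enhancement. A hierarchical heterogeneous fusion strategy aggregates spatial attention features and multi-scale edge information across resolution scales, strengthening the capture of cropland edge detail. To adapt dynamically to different feature scales, a progressive feature enhancement structure is designed in the spatial decoder; it captures and fuses spatial features of adjacent scales step by step from the lower to the higher layers of the network, which keeps attention consistent across feature extraction layers and ensures that the multi-scale information of cropland in complex scenes is fully used. Cropland distribution in Dongpo District, Meishan City, was extracted from 2 m high-resolution imagery. MPFUnet reached an accuracy, recall, mean intersection over union, Kappa coefficient, and F1 score of 92.54%, 94.08%, 84.32%, 87.47%, and 89.11%, respectively, improvements of 5.05%, 4.78%, 6.15%, 7.37%, and 7.48% over the baseline Unet, and it also outperformed DeepLabv3+, YOLOV3, and Swin-Unet. By jointly using spatial texture and edge detail features, the model performs well on regular plots, fragmented plots, and plots in complex scenes, showing high accuracy and strong robustness, and it offers an effective and feasible solution for cropland extraction in complex scenes.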

       

      Abstract: In today's complex and dynamic agricultural landscape, cropland fragmentation and the conversion of cropland to non-agricultural and non-grain uses are widespread, which makes accurate cropland extraction difficult. To address unclear cropland boundaries and inaccurate plot extraction in complex scenes, this paper proposes a Multi-Feature Progressive Fusion Unet (MPFUnet) for cropland segmentation in remote sensing images. The method exploits both the spatial information and the geometric edge information of cropland. It uses Unet as the backbone network for spatial feature extraction and adds a multi-feature attention module, consisting of a spatial attention module and an edge feature enhancement module, to compensate for Unet's weak perception of subtle local features. The spatial attention module first obtains global features by concatenating the average-pooling and max-pooling maps along the channel dimension, and then passes them through an activation function to produce a spatial attention map that reflects the spatial importance of each pixel. The edge feature enhancement module improves the perception of multi-scale spatial information by fusing several sets of receptive-field features computed at different dilation rates. On this basis, a hierarchical heterogeneous fusion strategy multiplies the attention map with the multi-scale feature maps to learn a better multi-dimensional feature representation. The spatial attention features and multi-scale edge information are thus aggregated at different resolution scales to obtain edge-enhanced spatial feature maps.
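The attention and fusion steps described above can be illustrated with a minimal NumPy sketch. It is not the MPFUnet implementation: the function names, the scalar weights `w_avg` and `w_max` (standing in for whatever learned layer the network applies to the pooled maps), the fixed Laplacian kernel (standing in for learned dilated convolutions), and the dilation rates `(1, 2, 4)` are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(feat, w_avg=1.0, w_max=1.0):
    """Spatial attention map for a (C, H, W) feature tensor.

    Average- and max-pooling are taken over the channel axis and stacked.
    w_avg and w_max are hypothetical scalars replacing a learned layer.
    """
    avg_map = feat.mean(axis=0)
    max_map = feat.max(axis=0)
    pooled = np.stack([avg_map, max_map], axis=0)  # shape (2, H, W)
    mixed = w_avg * pooled[0] + w_max * pooled[1]
    return sigmoid(mixed)  # per-pixel importance in (0, 1)


def dilated_conv3x3(feat, kernel, rate):
    """Zero-padded 3x3 convolution with the given dilation rate.

    The same kernel is applied to every channel independently.
    """
    _, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (rate, rate), (rate, rate)))
    out = np.zeros(feat.shape, dtype=float)
    for i in range(3):
        for j in range(3):
            dy, dx = i * rate, j * rate
            out += kernel[i, j] * padded[:, dy:dy + h, dx:dx + w]
    return out


def edge_enhanced_features(feat, rates=(1, 2, 4)):
    """Fuse multi-dilation edge responses, then weight them by attention."""
    # Assumed fixed edge kernel; a trained network would learn its filters.
    laplacian = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    multi_scale = sum(dilated_conv3x3(feat, laplacian, r) for r in rates)
    attn = spatial_attention(feat)
    # Element-wise product, broadcasting the (H, W) map over all channels.
    return attn[None, :, :] * multi_scale


# Usage: a 3-channel 16x16 map with a vertical step edge at column 8.
feat = np.zeros((3, 16, 16))
feat[:, :, 8:] = 1.0
fused = edge_enhanced_features(feat)
```

In this toy input the fused response is non-zero near the step edge and zero in flat regions away from the border, which is the behaviour the multiplication of the attention map with the multi-scale maps is meant to emphasise.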
To adapt dynamically to different feature scales, a progressive feature enhancement (PFE) structure is designed in the spatial decoding part of the network. It embeds the spatial information of adjacent layers into each layer to further integrate the global semantic information of high-level features with the detailed edge information of low-level features. A layer-by-layer integration approach captures and fuses features of adjacent scales, proceeding from the low layers to the high layers of the network to maintain attention consistency across feature extraction layers. The experiments used multi-source satellite images from JL-1, GF-1, GF-2, and GF-7 as data sources. The 2-meter high-resolution domestic cropland dataset of Dongpo District, Meishan City, was randomly divided into training, validation, and test sets in a ratio of 3:1:1. The experimental results show that MPFUnet achieved an accuracy of 92.54%, a recall of 94.08%, a mean intersection over union (mIoU) of 84.32%, and a Kappa coefficient of 87.47%. Compared with the baseline Unet model, these metrics improved by 8.23%, 7.01%, 10.02%, and 11.33%, respectively. The results were also superior to those of other methods such as DeepLabv3+, YOLOv3, and Swin-Unet. The quantitative and visual results show that the proposed model performs well on regular plots, fragmented plots, and plots in complex scenes. The model effectively integrates spatial texture and edge detail features, so it can handle cropland segmentation across diverse scenarios. It accurately identifies plot boundaries and small areas and is robust to interference in certain complex scenarios. It therefore offers an effective and viable solution for cropland extraction in complex scenarios.
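For readers who want to reproduce the kind of metrics reported above, the following sketch computes them from a binary confusion matrix (cropland versus background). The exact definitions used in the paper are not stated in the abstract, so this assumes overall pixel accuracy, recall on the cropland class, mean IoU averaged over the two classes, Cohen's kappa, and the F1 score on the cropland class; the function name and the example counts are hypothetical.

```python
def binary_segmentation_metrics(tp, fp, fn, tn):
    """Metrics from pixel counts of a two-class (cropland/background) result.

    tp, fp, fn, tn are counts with cropland as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)

    # IoU per class, then the unweighted mean over both classes.
    iou_cropland = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    mean_iou = (iou_cropland + iou_background) / 2

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "recall": recall,
        "mean_iou": mean_iou,
        "kappa": kappa,
        "f1": f1,
    }


# Usage with made-up pixel counts (not data from the paper).
metrics = binary_segmentation_metrics(tp=40, fp=10, fn=10, tn=40)
```

Because kappa subtracts chance agreement and mean IoU penalises both false positives and false negatives, both typically sit below overall accuracy, which matches the ordering of the values reported in the abstract.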

       
