Cropland extraction method for complex scenes integrating progressive multi-feature fusion

常明会; 李世华; 彭帅峰; 刘志彬; 赵涛; 张友才; 林俊; 李军; 穆羽

doi:10.11975/j.issn.1002-6819.202409111

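Abstract (Chinese version): To address unclear cropland boundaries and inaccurate plot extraction in complex scenes, this study proposes a progressive multi-feature fusion network (multi feature progressive fusion unet, MPFUnet) for cropland segmentation in remote sensing images. The method uses Unet as the backbone network for spatial feature extraction and builds a multi-feature attention module that combines spatial attention with edge feature enhancement. A hierarchical heterogeneous fusion strategy aggregates spatial attention features and multi-scale edge information at different resolution scales, which strengthens the capture of cropland edge details. To adapt dynamically to different feature scales, a progressive feature enhancement structure is designed in the spatial decoding part. It progressively captures and fuses spatial features of adjacent scales from the lower to the higher layers of the network, which keeps attention consistent across the feature extraction layers and ensures that the multi-scale information of cropland in complex scenes is fully used. Cropland distribution in Dongpo District, Meishan City, was extracted from 2 m high-resolution imagery. The results show that the accuracy, recall, mean intersection over union, Kappa coefficient, and F1 score of MPFUnet reached 92.54%, 94.08%, 84.32%, 87.47%, and 89.11%, respectively, which are 5.05, 4.78, 6.15, 7.37, and 7.48 percentage points higher than those of the baseline Unet model. MPFUnet also outperformed other methods such as DeepLabv3+, YOLOv3, and Swin-Unet. The model makes combined use of spatial texture features and edge detail features, performs well on regular plots, fragmented plots, and plots in complex scenes, and shows high accuracy and strong robustness, providing an effective and feasible solution for cropland extraction in complex scenes.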
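The five reported metrics can all be derived from a pixel-level confusion matrix. The sketch below is an illustration only: it assumes a binary cropland/background labelling and the standard textbook definitions, since the abstract does not state the exact formulas used in the study. The function name `binary_segmentation_metrics` and the example counts are hypothetical.

```python
def binary_segmentation_metrics(tp, fp, fn, tn):
    """Standard binary-segmentation metrics from pixel counts.

    tp/fp/fn/tn are counts for the cropland class (assumed positive class).
    These are textbook definitions, not necessarily those used in the paper.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)

    # Mean IoU: average of the per-class IoU for cropland and background.
    iou_cropland = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    mean_iou = (iou_cropland + iou_background) / 2

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "recall": recall,
        "mean_iou": mean_iou,
        "kappa": kappa,
        "f1": f1,
    }


# Hypothetical pixel counts, for illustration only (not data from the study).
metrics = binary_segmentation_metrics(tp=40, fp=10, fn=5, tn=45)
```

A difference in "percentage points", as reported above, is the plain subtraction of two such values expressed in percent, not a relative change.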

Abstract: In recent years, cropland fragmentation has posed significant challenges related to non-agricultural and non-grain land use in complex and dynamic agricultural landscapes. Accurate extraction of cropland plots is therefore required, yet cropland boundaries in complex scenes are often unclear. In this study, a Multi-Feature Progressive Fusion Unet (MPFUnet) was proposed for cropland segmentation in remote sensing images. Unet was used as the backbone network for spatial feature extraction, and both the spatial information and the geometric edge information of cropland were exploited. To perceive subtle local features better than the plain Unet, a multi-feature attention module was proposed, consisting of a spatial attention module and an edge feature enhancement module. The spatial attention module was used to obtain global features: the average-pooling and max-pooling maps were first concatenated along the channel dimension, and an activation function then produced a spatial attention map indicating the spatial importance of each pixel. The edge feature enhancement module improved the perception of multi-scale spatial information by fusing multiple sets of receptive-field features obtained at different dilation rates. A hierarchical heterogeneous fusion strategy multiplied the attention map with the multi-scale feature maps so that the multi-dimensional features could be learned more effectively. In this way, spatial attention features and multi-scale edge information were aggregated at different resolution scales to obtain edge-enhanced spatial feature maps. A progressive feature enhancement (PFE) structure was also designed in the spatial decoding part of the network to adapt dynamically to different feature scales. The spatial information of adjacent layers was embedded in each layer to integrate the global semantic information of high-level features with the detailed edge information of low-level features. Adjacent-scale features were captured and fused layer by layer, from the lower to the higher layers of the network, which maintained attention consistency across the feature extraction layers. Multi-source satellite images from JL-1, GF-1, GF-2, and GF-7 were used in the experiments. A 2 m high-resolution cropland dataset was collected for Dongpo District, Meishan City, Sichuan Province, China, and randomly divided into training, validation, and test sets at a ratio of 3:1:1. The experimental results show that MPFUnet achieved an accuracy of 92.54%, a recall of 94.08%, a mean intersection over union (mIoU) of 84.32%, and a Kappa coefficient of 87.47%, which were improved by 8.23%, 7.01%, 10.02%, and 11.33%, respectively, compared with the baseline Unet model. MPFUnet also performed well on regular, fragmented, and complex-scene plots, and it outperformed other models such as DeepLabv3+, YOLOv3, and Swin-Unet. The model effectively integrated spatial texture and edge features to segment cropland in diverse scenes, accurately identified plot boundaries and small plots, and was highly robust to interfering factors in complex scenes. MPFUnet therefore offers an effective and viable solution for extracting fragmented cropland in complex scenes.
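The spatial attention step described in the abstract (channel-wise average and max pooling, combination of the two maps, an activation, then multiplication with the feature maps) can be illustrated with a minimal dependency-free sketch. This is not the authors' implementation: the learned convolution that would normally mix the two pooled maps is replaced here by hypothetical scalar weights `w_avg`, `w_max`, and `bias`, the activation is assumed to be a sigmoid, and the multi-scale dilated-convolution branch is omitted.

```python
import math


def spatial_attention(features, w_avg=1.0, w_max=1.0, bias=0.0):
    """Compute an H x W spatial attention map in (0, 1).

    features: list of C channels, each an H x W nested list of floats.
    w_avg, w_max, bias: hypothetical stand-ins for a learned convolution
    over the stacked [average, max] maps (an assumption of this sketch).
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    attention = []
    for i in range(height):
        row = []
        for j in range(width):
            values = [features[c][i][j] for c in range(channels)]
            avg_pooled = sum(values) / channels  # average over channels
            max_pooled = max(values)             # maximum over channels
            logit = w_avg * avg_pooled + w_max * max_pooled + bias
            row.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid (assumed)
        attention.append(row)
    return attention


def apply_attention(features, attention):
    """Multiply every channel element-wise by the spatial attention map."""
    return [
        [[v * a for v, a in zip(feature_row, attention_row)]
         for feature_row, attention_row in zip(channel, attention)]
        for channel in features
    ]


# Toy input with C=2 channels and a 1 x 2 spatial grid (illustrative values).
features = [[[0.0, 2.0]], [[0.0, -2.0]]]
attention = spatial_attention(features)
weighted = apply_attention(features, attention)
```

In the toy input, the second pixel has a larger channel-wise maximum than the first, so it receives a higher attention weight and its features are attenuated less.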

Multi-feature progressive fusion for cropland extraction in complex scenes