

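1. School of Computer Science, Wuhan University, Wuhan, Hubei 430072
2. Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong 999077
3. School of Software, Shandong University, Jinan, Shandong 250101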
Received: 06 June 2025
Accepted: 09 October 2025
Published: 25 October 2025
LÜ Liang, LAN Jie, LAN Meng, et al. Cross-View Context-Aware Semi-Supervised Semantic Segmentation for High-Resolution Remote Sensing Images[J]. Acta Electronica Sinica, 2025, 53(10): 3744-3758. DOI:10.12263/DZXB.20250483
Semi-supervised semantic segmentation of high-resolution remote sensing images aims to improve segmentation models by jointly training on a small number of labeled samples and a large amount of unlabeled data. This approach significantly reduces the cost of manual annotation while exploiting the potential value of unlabeled data. Existing methods typically crop high-resolution remote sensing images into multiple sub-views for training and focus primarily on enforcing prediction consistency under different perturbations of the same view. However, such strategies often overlook the semantic and spatial relationships between different views, limiting the model's ability to learn broader contextual information when labeled data are scarce. To address this issue, this paper proposes a cross-view context-aware semi-supervised semantic segmentation method for high-resolution remote sensing images. The proposed approach explicitly models the contextual interactions among views to improve the quality of pseudo labels, and introduces multiple cross-view consistency constraints to maintain prediction consistency within a broader contextual scope. Specifically, during training, multiple overlapping views, comprising a primary view and several contextual views, are sampled from the original high-resolution image and jointly fed into the model. A spatial-aware interaction fusion (SIF) module performs cross-view feature interaction and fusion via cross-attention and self-attention mechanisms. This module generates spatial attention activation maps that adaptively fuse the predictions from different views, thereby improving pseudo label accuracy. In addition, a multiple cross-view context consistency (CVCC) constraint enforces consistent predictions in overlapping regions by matching their spatial correspondences. This constraint enhances the model's ability to perceive and model cross-view contextual information, mitigating semantic ambiguity caused by view variations. To comprehensively evaluate the proposed method, extensive experiments are conducted under various labeling ratios on the Vaihingen and Potsdam datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS). Results show that the proposed method consistently outperforms state-of-the-art semi-supervised segmentation approaches across these labeling ratios. In particular, in the low-label setting that uses only one labeled image, it achieves mIoU improvements of 6.84% and 12.73% over the supervised baseline on Vaihingen and Potsdam, respectively, validating its strong performance and generalization under limited annotation.
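The overlap-based consistency idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss form, so the mean squared difference used here is an assumption, and the names `Box`, `overlap_box`, `crop_local`, and `overlap_consistency` are hypothetical helpers introduced only for illustration. The sketch shows how two crops of the same image can be matched on their shared pixels before their predictions are compared.

```python
from typing import List, Optional, Tuple

# Hypothetical crop description: (top, left, height, width) in full-image pixels.
Box = Tuple[int, int, int, int]
# pred[r][c] holds the class probabilities of pixel (r, c) inside one view.
Pred = List[List[List[float]]]


def overlap_box(a: Box, b: Box) -> Optional[Box]:
    """Intersection of two crops in full-image coordinates, or None if disjoint."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom = min(a[0] + a[2], b[0] + b[2])
    right = min(a[1] + a[3], b[1] + b[3])
    if bottom <= top or right <= left:
        return None
    return (top, left, bottom - top, right - left)


def crop_local(pred: Pred, view: Box, region: Box) -> Pred:
    """Cut the part of a view's prediction map that covers `region`."""
    r0, c0 = region[0] - view[0], region[1] - view[1]
    return [row[c0:c0 + region[3]] for row in pred[r0:r0 + region[2]]]


def overlap_consistency(pred_a: Pred, box_a: Box, pred_b: Pred, box_b: Box) -> float:
    """Mean squared difference of class probabilities on the shared pixels.

    Assumption: MSE stands in for whatever consistency loss the paper uses.
    """
    region = overlap_box(box_a, box_b)
    if region is None:
        return 0.0  # no shared pixels, so nothing to constrain
    total, count = 0.0, 0
    rows_a = crop_local(pred_a, box_a, region)
    rows_b = crop_local(pred_b, box_b, region)
    for row_a, row_b in zip(rows_a, rows_b):
        for px_a, px_b in zip(row_a, row_b):
            for p, q in zip(px_a, px_b):
                total += (p - q) ** 2
                count += 1
    return total / count


# Toy example: a 2x2 primary view and a 2x2 contextual view sharing one pixel.
main_box, ctx_box = (0, 0, 2, 2), (1, 1, 2, 2)
main_pred = [[[0.9, 0.1], [0.8, 0.2]], [[0.7, 0.3], [1.0, 0.0]]]
ctx_pred = [[[0.0, 1.0], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]
loss = overlap_consistency(main_pred, main_box, ctx_pred, ctx_box)
```

In the toy example the two views disagree completely on their single shared pixel, so the penalty is at its maximum for a two-class probability vector. A real training loop would compute this on tensors and would likely mask low-confidence pixels, neither of which is shown here.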