Semantic Segmentation of Remote Sensing Image Based on Multi-Scale Semantic Encoder-Decoder Network

LIANG Yan; YI Chun-xia; WANG Guang-yu; HU Yue-hui

doi:10.12263/DZXB.20220503


Research article | Updated: 2025-12-08

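- Semantic Segmentation of Remote Sensing Image Based on Multi-Scale Semantic Encoder-Decoder Network
- Acta Electronica Sinica, 2023, Vol. 51, No. 11, pp. 3199-3214
- Author affiliations:
  
  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
  2. Chongqing Key Laboratory of Signal and Information Processing, Chongqing 400065
  3. Engineering Research Center of Mobile Communications, Ministry of Education, Chongqing 400065
- Author biographies:
  
  LIANG Yan, female, born in 1977 in Chongqing. She received her master's degree from Chongqing University of Posts and Telecommunications, where she is now an associate professor and master's supervisor in the School of Communication and Information Engineering. Her research interests include mobile communications, AI for the Internet of Things, and image processing. E-mail: liangyan@cqupt.edu.cn
  YI Chun-xia, female, born in 1996 in Chongqing. She is a master's student in the School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications. Her research interests include computer vision and AI-based image processing. E-mail: 1638362782@qq.com
  WANG Guang-yu, male, born in 1964 in Guizhou. He received his Ph.D. from Kiel University, Germany. He is an overseas distinguished professor at Chongqing University of Posts and Telecommunications and works at the semiconductor company Infineon in Germany. His research interests include 5G/6G mobile communications and artificial intelligence. E-mail: wangguangyu@cqupt.edu.cn
  HU Yue-hui, male, born in 2002 in Chongqing. He is an undergraduate in the School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, and takes part in the university's undergraduate research training program. His research interest is AI-based image processing. E-mail: 2372230575@qq.com
- Funding:
  
  National Natural Science Foundation of China (61702066); Key Science and Technology Research Project of the Chongqing Municipal Education Commission (KJZD-M201900601)
- DOI: 10.12263/DZXB.20220503
  CLC number: TP183
- Received: 2022-05-05
  
  Revised: 2022-11-24
  
  Published in print: 2023-11-25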
LIANG Yan, YI Chun-xia, WANG Guang-yu, et al. Semantic Segmentation of Remote Sensing Image Based on Multi-Scale Semantic Encoder-Decoder Network[J]. Acta Electronica Sinica, 2023, 51(11): 3199-3214. DOI: 10.12263/DZXB.20220503.

Abstract

To address two problems in the semantic segmentation of remote sensing images, namely multi-level information extraction and the contextual dependence of multi-scale feature maps, this paper analyzes existing approaches and proposes a multi-scale semantic encoder-decoder network (MSEDNet) that combines several techniques. MSEDNet consists of an encoding part and a decoding part. In the encoding part, an enhanced MobileNetV3 module with residuals coordinate spatial attention (RCSA) is first proposed to extract semantic information; then a multi-layer enhanced semantic context module (ESCM) is designed to improve the representation ability of the multi-scale structural feature maps. In the decoding part, a strengthen spatial detail information module (SSDIM), which runs multi-kernel convolution in parallel with Focus, is first proposed to enhance the detail and structural information of shallow features; then a triplet iterative multi-scale feature fusion (TIMSFF) strategy is designed to strengthen the multi-scale fusion of deep global semantic information with shallow local detail features, improving segmentation accuracy. The proposed model is validated on the ISPRS Vaihingen and Potsdam datasets. The overall accuracy (OA) reaches 95.699% and 95.534%, respectively; the mean F1-score (mF1) improves by 2.661% and 2.929%, respectively; and the mean intersection over union (mIoU) increases by 3.973% and 4.012%, respectively. The parameter count (Param) drops to 6.77 M.
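The three accuracy figures quoted in the abstract (OA, mean F1-score, and mIoU) are all derived from a per-class confusion matrix. The sketch below shows one common way to compute them. It assumes the standard textbook definitions; the paper's exact evaluation protocol (for example, which classes are excluded or how boundaries are treated) is not stated on this page, and the function name `segmentation_metrics` is hypothetical.

```python
from typing import Dict, List


def segmentation_metrics(conf: List[List[int]]) -> Dict[str, float]:
    """Compute OA, mean F1 and mIoU from a square confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j. Standard definitions are assumed here; this is
    not taken from the paper's evaluation code.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Overall accuracy: correctly labelled pixels over all pixels.
    correct = sum(conf[c][c] for c in range(n))

    f1_scores: List[float] = []
    ious: List[float] = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                        # true c, predicted other
        fp = sum(conf[r][c] for r in range(n)) - tp   # predicted c, true other
        f1_den = 2 * tp + fp + fn
        iou_den = tp + fp + fn
        # A class that never occurs and is never predicted scores 0 here.
        f1_scores.append(2 * tp / f1_den if f1_den else 0.0)
        ious.append(tp / iou_den if iou_den else 0.0)

    return {
        "OA": correct / total,
        "mF1": sum(f1_scores) / n,
        "mIoU": sum(ious) / n,
    }


if __name__ == "__main__":
    # Toy two-class example with made-up pixel counts.
    example = [[8, 2],
               [1, 9]]
    for name, value in segmentation_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Because per-class F1 equals 2·IoU / (1 + IoU), per-class F1 is never lower than per-class IoU, so mF1 is always at least as large as mIoU on the same confusion matrix.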


