Depth Map Super-Resolution Reconstruction Based on Deeply Supervised Cross-Scale Attention Network

LI Tao; DONG Xiu-cheng; LIN Hong-wei

doi:10.12263/DZXB.20210659

Research article | Updated: 2025-07-02

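- Depth Map Super-Resolution Reconstruction Based on Deeply Supervised Cross-Scale Attention Network
- Acta Electronica Sinica (电子学报), 2023, Vol. 51, No. 1, pp. 128-138
- Author affiliations:
  
  1. School of Electrical Engineering and Electronic Information, Xihua University, Chengdu, Sichuan 610039
  2. School of Electrical Engineering, Northwest Minzu University, Lanzhou, Gansu 730000
- Author biographies:
  
  LI Tao, female, was born in August 1983 in Ziyang, Sichuan. She received her bachelor's, master's, and doctoral degrees from Sichuan University in 2005, 2008, and 2017, respectively. She is currently an associate professor and master's supervisor at Xihua University. Her research focuses on digital image processing and computer vision. E-mail: lucia634@163.com
  DONG Xiu-cheng, male, was born in April 1963 in Xianyang, Shaanxi. He received his bachelor's and master's degrees from Chongqing University in 1985 and 1990, respectively. He is currently a professor and master's supervisor at Xihua University. His research focuses on modern control theory and robotics.
  LIN Hong-wei, male, was born in February 1983. He received his doctoral degree from Sichuan University in 2019. He is currently an associate professor and master's supervisor at Northwest Minzu University. His research focuses on digital image processing, video compression, and communications.
- Funding:
  
  National Natural Science Foundation of China (61901392; 62041109); Sichuan Science and Technology Program (2021YJ0109; 2021ZYD0034)
- DOI: 10.12263/DZXB.20210659
  Chinese Library Classification: TP751.1; TP183
- Received: 2021-05-24,
  
  Revised: 2022-05-17,
  
  Published in print: 2023-01-25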

LI Tao, DONG Xiu-cheng, LIN Hong-wei. Depth Map Super-Resolution Reconstruction Based on Deeply Supervised Cross-Scale Attention Network[J]. ACTA ELECTRONICA SINICA, 2023, 51(01): 128-138. DOI: 10.12263/DZXB.20210659.

Abstract

Depth maps captured by consumer depth cameras usually suffer from low spatial resolution, and depth map super-resolution (SR) is an effective way to address this problem. To improve reconstruction performance, this paper proposes a depth map SR reconstruction algorithm based on a deeply supervised cross-scale attention network. The network up-samples stage by stage, and the loss function constrains the output of every stage, which realizes deep supervision. A high-order cross-scale attention block adaptively adjusts multi-scale features by combining their in-scale and cross-scale correlations with the attention mechanism. A bilayer residual block, with inner wide-activated residual learning and outer basic residual learning, serves as the basic building element of the network to strengthen its ability to learn complex non-linear relationships. Experimental results show that the proposed algorithm outperforms current mainstream depth map SR algorithms in both subjective visual quality and objective quality metrics.
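
The deep supervision idea in the abstract, where the loss constrains the output of every up-sampling stage, can be illustrated with a small sketch. It is not the paper's loss: the L1 distance, the equal stage weights, the average-pooling used to build each stage's target, and the names `avg_pool`, `l1`, and `deeply_supervised_loss` are all assumptions made for illustration.

```python
# Sketch only. Assumptions not taken from the abstract: L1 distance,
# equal stage weights, and average pooling to build per-stage targets.

def avg_pool(img, factor):
    """Downsample a 2-D list by averaging non-overlapping factor x factor blocks."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [
        [
            sum(img[i * factor + di][j * factor + dj]
                for di in range(factor) for dj in range(factor)) / factor ** 2
            for j in range(w)
        ]
        for i in range(h)
    ]

def l1(a, b):
    """Mean absolute error between two equally sized 2-D lists."""
    n = len(a) * len(a[0])
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def deeply_supervised_loss(stage_outputs, hr_depth, weights=None):
    """Weighted sum of per-stage losses.

    stage_outputs is ordered coarse to fine; the last entry has the
    same size as hr_depth, the high-resolution ground truth.
    """
    weights = weights or [1.0] * len(stage_outputs)
    total = 0.0
    for out, wgt in zip(stage_outputs, weights):
        factor = len(hr_depth) // len(out)  # how much smaller this stage is
        target = hr_depth if factor == 1 else avg_pool(hr_depth, factor)
        total += wgt * l1(out, target)
    return total

# Toy usage: a two-stage network output against a 4x4 ground truth.
hr = [[1, 1, 3, 3], [1, 1, 3, 3], [5, 5, 7, 7], [5, 5, 7, 7]]
stage1 = [[1, 3], [5, 7]]                       # matches the pooled target
stage2 = [[v + 1 for v in row] for row in hr]   # off by 1 everywhere
loss = deeply_supervised_loss([stage1, stage2], hr)
```

Because every stage contributes its own term, an error at an intermediate scale is penalized directly instead of only through the final output.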
