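1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Chongqing Key Laboratory of Signal and Information Processing, Chongqing 400065, China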
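YANG Zi-yuan, female, was born in Chongqing in May 1998. She is currently a graduate student at the School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications. Her main research interests are image harmonization, computer vision, and machine learning. E-mail: 785459971@qq.com
LI Peng-cheng, male, was born in Chongqing in December 1995. He is currently a doctoral student at the School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications. His main research interests are intelligent medical image analysis, computer vision, and machine learning. E-mail: lipengchengme@163.com
LIU Fang-cen, female, was born in Chongqing in 1995. She is currently pursuing a doctoral degree at Chongqing University of Posts and Telecommunications. Her main research interest is infrared small target detection. E-mail: liufc67@gmail.com
GAO Chen-qiang (corresponding author), male, was born in Chongqing in August 1981. He is currently deputy director of the Talent Work Office, professor, and doctoral supervisor at Chongqing University of Posts and Telecommunications. His main research interests are image processing, video analysis, and machine learning.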
Received: 2022-11-18
Revised: 2023-03-13
Published in print: 2023-07-25
YANG Zi-yuan, LI Peng-cheng, LIU Fang-cen, et al. Image Harmonization Guided by Semantic Information[J]. Acta Electronica Sinica, 2023, 51(07): 1826-1834. DOI: 10.12263/DZXB.20221322.
Image harmonization plays an important role in image processing. It aims to adjust the appearance of the foreground (e.g., illumination, color, and texture) so that it is visually consistent with the background. However, existing deep learning-based methods usually use the feature distribution of the overall image background as the cue for adjusting the foreground and overlook the key role of semantic information in that adjustment, so local regions of the foreground still look visually different from the background. To address this, building on the multi-resolution selective fusion module (MRSFM) and the lightweight convolutional block attention module (CBAM), this paper designs a multi-resolution selective fusion module based on a dual attention mechanism (MRSF-DAM). The module makes the final output feature map rich in semantic information, which guides the network to better understand the correlation between the image foreground and its surrounding scene, lets the network draw more fully from the background the information needed to harmonize the foreground, and ultimately narrows the visual appearance gap between the foreground and background regions.

In addition, this paper designs a new network architecture that selectively fuses shallow and deep feature information. The output feature maps of the first six layers of the decoder and of MRSF-DAM are fused and enhanced at multiple scales, and the resulting enhanced feature maps are fed into the last layer of the decoder. This alleviates the problem that skip connections introduce features unrelated to the foreground content, reduces the loss of spatial feature information caused by the repeated downsampling in the decoder, and further improves the realism of the generated harmonized images. Extensive experiments on the widely used iHarmony4 benchmark dataset verify the effectiveness of the proposed method. Compared with the latest method, SCS-Co (Self-Consistent Style Contrastive learning for image harmonization), the proposed method improves the mean squared error (MSE), foreground mean squared error (fMSE), and peak signal-to-noise ratio (PSNR) over the entire dataset by 4.28, 61.97, and 1 dB, respectively.
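For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how MSE, fMSE, and PSNR are commonly computed. It is not the authors' evaluation code. It assumes images are given as flat sequences of pixel intensities in the range 0 to 255, that the foreground mask is a same-length sequence of 0/1 values, and that fMSE means MSE restricted to the masked foreground pixels. The function names `mse`, `fmse`, and `psnr` are illustrative only.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over all pixel values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def fmse(pred: Sequence[float], target: Sequence[float],
         mask: Sequence[int]) -> float:
    """MSE restricted to pixels where the mask is non-zero (assumed foreground)."""
    if not (len(pred) == len(target) == len(mask)):
        raise ValueError("pred, target and mask must have equal length")
    pairs = [(p, t) for p, t, m in zip(pred, target, mask) if m]
    if not pairs:
        raise ValueError("mask selects no foreground pixels")
    return sum((p - t) ** 2 for p, t in pairs) / len(pairs)


def psnr(pred: Sequence[float], target: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(pred, target)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy example: four pixels, the last two belong to the foreground.
target = [100, 100, 100, 100]
pred = [100, 100, 110, 90]
mask = [0, 0, 1, 1]

print(mse(pred, target))         # 50.0
print(fmse(pred, target, mask))  # 100.0
print(psnr(pred, target))        # PSNR in dB
```

In the toy example the error is confined to the foreground, so fMSE is larger than MSE. This illustrates why fMSE is more sensitive to foreground quality when the foreground occupies a small part of the image. Lower MSE and fMSE and higher PSNR indicate a better result.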