School of Electronic and Information Engineering, South China University of Technology, Guangzhou, Guangdong 510640, China
Received: 15 August 2022
Revised: 20 March 2023
Published: 25 November 2023
YANG Xin (杨鑫), YANG Chun-ling (杨春玲). MAP-Based Multi-Information Flow Gradient Update and Aggregation for Video Compressed Sensing Reconstruction[J]. Acta Electronica Sinica (电子学报), 2023, 51(11): 3320-3330. DOI: 10.12263/DZXB.20220958.
Existing high-performing deep-learning-based distributed compressed video sensing (DCVS) reconstruction algorithms update non-key frames by using the measurements and the reference frames sequentially. They reconstruct well, but without rigorous theoretical guidance (in particular, a solver framework that supports parallel updates) they cannot fully combine the two types of information, which limits further improvement in non-key-frame reconstruction quality. To address this problem, this paper first uses Bayesian theory and maximum a posteriori (MAP) estimation to derive the optimization equation for non-key-frame reconstruction in DCVS, and then derives a solution framework for that equation based on the proximal gradient algorithm; the framework includes a multi-information-flow gradient update and aggregation equation. On this basis, the paper designs a multi-information flow gradient update and aggregation module (MIGA) and constructs a deep multi-information flow gradient update and aggregation network (DMIGAN) for DCVS non-key-frame reconstruction. MIGA uses the measurements and multiple reference frames to perform parallel gradient updates on the current non-key frame and then fuses the results through information interaction, so that multiple information flows are fully combined when the reconstructed frame is updated. MIGA cascaded with a denoising sub-network simulates a single iteration of the proximal gradient algorithm and serves as the basic module (phase); cascading multiple phases yields the deep reconstruction network DMIGAN, which realizes a deep optimization process for frame reconstruction. Experiments show that, compared with the representative traditional iterative optimization algorithm, structural similarity based inter-frame group sparse representation (SSIM-InterF-GSR), DMIGAN improves performance by 8.8 dB and 7.36 dB at low and high sampling rates, respectively; compared with the representative deep learning reconstruction algorithm VCSNet-2, it improves performance by 7.09 dB and 8.78 dB at low and high sampling rates, respectively.
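The phase structure described in the abstract (parallel gradient updates from several information flows, aggregation, then a denoising step, repeated over cascaded phases) can be illustrated with a small NumPy sketch. This is not the paper's network, and every modelling choice here is an assumption made for illustration: frames are flattened vectors, the reference frames are taken to be already motion-compensated predictions `refs`, each flow uses a simple quadratic term, aggregation is a fixed convex combination rather than a learned module, and soft-thresholding stands in for the denoising sub-network. The names `phase`, `reconstruct`, `rho`, `weights`, and `tau` are hypothetical.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||v||_1; a stand-in for the denoising sub-network."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def phase(x, y, Phi, refs, rho, weights, tau):
    """One assumed phase: parallel gradient updates, aggregation, proximal step."""
    # Measurement flow: gradient step on 0.5 * ||y - Phi x||^2.
    flows = [x - rho * Phi.T @ (Phi @ x - y)]
    # Reference flows: gradient step on 0.5 * ||x - r||^2 for each prediction r.
    flows += [x - rho * (x - r) for r in refs]
    # Aggregation: fixed convex combination (assumption; the paper learns this).
    z = weights @ np.stack(flows)
    return soft_threshold(z, tau)


def reconstruct(y, Phi, refs, num_phases=200, tau=2e-3):
    """Cascade identical phases, starting from the back-projection Phi^T y."""
    # Conservative step size based on the spectral norm of Phi.
    rho = 1.0 / max(np.linalg.norm(Phi, 2) ** 2, 1.0)
    # Equal weight for the measurement flow and each reference flow.
    weights = np.full(len(refs) + 1, 1.0 / (len(refs) + 1))
    x = Phi.T @ y
    for _ in range(num_phases):
        x = phase(x, y, Phi, refs, rho, weights, tau)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 64, 16  # signal length and number of measurements (toy sizes)
    x_true = np.zeros(n)
    x_true[rng.choice(n, size=5, replace=False)] = 1.0
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)
    y = Phi @ x_true
    # Two synthetic "reference predictions": the true frame plus small noise.
    refs = [x_true + 0.05 * rng.standard_normal(n) for _ in range(2)]
    x_hat = reconstruct(y, Phi, refs)
    print("initial error:", np.linalg.norm(Phi.T @ y - x_true))
    print("final error:  ", np.linalg.norm(x_hat - x_true))
```

With fixed weights, averaging the per-flow updates is equivalent to a single gradient step on the weighted sum of the flow objectives, so the sketch reduces to ordinary proximal gradient descent. The point of a learned aggregation module, as the abstract describes it, is to replace that fixed combination with an adaptive fusion of the flows.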