A Global Matching Optimization Approach for Optical Flow Estimation Using Joint Depth-Separable Residual Blocks and Multi-Scale Dual-Channel Attention

WANG Zi-xu; CHEN Hong-ye; GE Li-yue; ZHANG Cong-xuan; CHEN Zhen; WANG Zi-ge

doi:10.12263/DZXB.20240818


PAPERS | Updated: 2025-08-18

- ACTA ELECTRONICA SINICA Vol. 53, Issue 5, Pages: 1622-1636 (2025)
- Author affiliations:
  1. School of Instrument Science and Optoelectronic Engineering, Nanchang Hangkong University, Nanchang, Jiangxi 330063
  2. School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710129
  3. School of Information Engineering, Nanchang Hangkong University, Nanchang, Jiangxi 330063
  4. School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100019
- Funding: National Natural Science Foundation of China (62222206; 62272209); Major Research and Development Project of Jiangxi (20232ACC01007); Natural Science Foundation of Jiangxi (20242BAB20048)
- DOI: 10.12263/DZXB.20240818
  CLC: TP391
- Received: 03 September 2024; Revised: 23 February 2025; Published: 25 May 2025
WANG Zi-xu, CHEN Hong-ye, GE Li-yue, et al. A Global Matching Optimization Approach for Optical Flow Estimation Using Joint Depth-Separable Residual Blocks and Multi-Scale Dual-Channel Attention[J]. Acta Electronica Sinica, 2025, 53(05): 1622-1636. DOI: 10.12263/DZXB.20240818

Abstract

With the rapid development of deep learning theory and technology, deep learning-based optical flow estimation methods have improved significantly in accuracy and robustness. However, because standard convolutions have local receptive fields and existing matching cost volume strategies are prone to matching ambiguity, current methods often yield low accuracy and severe motion blur in regions with large displacements and weak textures. To address these problems, this paper proposes a global matching optimization method for optical flow estimation that combines depthwise separable residual blocks with multi-scale dual-channel attention. First, an encoding module is constructed that integrates depthwise separable residual blocks with multi-scale dual-channel attention, extracting more accurate deep features from consecutive frames while balancing parameter count and computational speed. Then, a learnable global matching optimization strategy for optical flow estimation is designed, which excludes occlusions and makes efficient use of global matching information to alleviate the motion blur caused by matching ambiguity. Finally, to improve training stability and generalization, a joint global and local optical flow loss function is proposed to constrain model training. The proposed method is compared with representative existing methods on the MPI-Sintel, KITTI-2015, and Middlebury test datasets. The results show that it achieves the best optical flow estimation accuracy among all compared methods, with better accuracy and robustness in large-displacement and weak-texture regions in particular.
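The abstract states that the encoder balances parameter count against speed but gives no layer sizes. As a rough illustration only, the sketch below compares the weight count of a standard convolution with that of the usual depthwise-plus-pointwise factorization. The channel widths and kernel size are hypothetical values chosen for the example, and the paper's actual residual block may differ.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 128, 128, 3

standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)

# The ratio simplifies to 1 / c_out + 1 / k**2.
ratio = separable / standard

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.3f}")
```

Under these assumed sizes the factorized layer needs roughly one ninth of the weights, which is the general reason such blocks are used when parameter count matters.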


