Acta Electronica Sinica ›› 2023, Vol. 51 ›› Issue (3): 648-657. DOI: 10.12263/DZXB.20211052
FAN Bing-bing1, HE Ting-jian1, ZHANG Cong-xuan1,2, CHEN Zhen1, LI Ming1

Received: 2021-08-05 | Revised: 2021-12-29 | Online: 2023-03-25 | Published: 2023-04-20
Abstract:
To address the limited accuracy and robustness of existing deep-learning optical flow models in scenes with motion occlusion and large displacements, this paper proposes a feature-pyramid optical flow estimation method that jointly applies an occlusion constraint and residual compensation. First, an occlusion-mask-based flow constraint module is constructed: by predicting occlusion-mask feature maps it suppresses edge artifacts in the warped features, overcoming image-edge blurring in motion-occluded regions. Then, a feature-warping strategy is used to build a residual flow compensation module; the residual flow learned by this module refines the initial flow field, improving estimation in large-displacement regions. Finally, a feature-pyramid framework combines the occlusion constraint and residual compensation into a single optical flow network, raising estimation accuracy under large displacement and motion occlusion. The proposed method is comprehensively compared with representative traditional and deep-learning optical flow methods on the MPI-Sintel (Max-Planck Institute and Sintel) and KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) datasets; the experimental results show that it effectively improves the accuracy and robustness of optical flow estimation in scenes with large displacements and motion occlusion.
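The two modules described above can be illustrated with a small sketch: backward-warp the second frame's features by the current flow estimate, multiply by a predicted occlusion mask to suppress artifacts at occluded pixels, and refine the flow by adding a learned residual. This is a NumPy illustration under my own assumptions — the function names and tensor shapes are not from the paper, and in the actual model both the occlusion mask and the residual flow are predicted by CNN branches rather than given:

```python
import numpy as np

def backward_warp(feat, flow):
    """Bilinearly sample feat (C, H, W) at positions shifted by flow (2, H, W).

    Out-of-image samples are treated as zero, as in standard feature warping.
    """
    c, h, w = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sx, sy = xs + flow[0], ys + flow[1]          # sampling coordinates
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    wx, wy = sx - x0, sy - y0                    # bilinear weights
    out = np.zeros_like(feat, dtype=float)
    for xi, yi, wgt in [(x0, y0, (1 - wx) * (1 - wy)),
                        (x0 + 1, y0, wx * (1 - wy)),
                        (x0, y0 + 1, (1 - wx) * wy),
                        (x0 + 1, y0 + 1, wx * wy)]:
        valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
        xc, yc = np.clip(xi, 0, w - 1), np.clip(yi, 0, h - 1)
        out += feat[:, yc, xc] * (wgt * valid)
    return out

def occlusion_constrained_warp(feat2, flow, occ_mask):
    """Warp frame-2 features, zeroing them where the mask marks occlusion.

    occ_mask: (1, H, W) in [0, 1]; values near 0 indicate occluded pixels.
    """
    return backward_warp(feat2, flow) * occ_mask

def residual_refine(flow_init, flow_residual):
    """Residual compensation: a learned residual flow refines the initial field."""
    return flow_init + flow_residual
```

In the paper's pipeline these operations are repeated at every pyramid level, with the refined flow of one level upsampled to initialize the next.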
Bing-bing FAN, Ting-jian HE, Cong-xuan ZHANG, et al. Feature Pyramid Optical Flow Estimation Method Jointing Occlusion Constraint and Residual Compensation[J]. Acta Electronica Sinica, 2023, 51(3): 648-657.
| Method | All /pixel | Matched /pixel | Unmatched /pixel | Time /s |
|---|---|---|---|---|
| JOF | 8.818 | 4.599 | 43.175 | 654 |
| CPM-Flow | 5.960 | 2.990 | 30.177 | 4.3 |
| EpicFlow | 6.285 | 3.060 | 32.564 | 16 |
| FlowNet2 | 6.016 | 2.977 | 30.807 | 0.10 |
| PWC-Net | 5.042 | 2.445 | 26.221 | 0.03 |
| LiteFlowNet2 | 4.686 | 2.248 | 24.571 | 0.04 |
| UnFlow | 10.219 | 6.061 | 44.110 | 0.12 |
| DDFlow | 6.176 | 2.269 | 38.053 | 0.06 |
| VCN | 4.404 | 2.216 | 22.238 | 0.18 |
| Ours | 4.607 | 2.482 | 21.923 | 0.20 |

Table 1  Comparison of optical flow results and time consumption on the MPI-Sintel dataset
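The all/matched/unmatched columns report average endpoint error (AEPE) in pixels, evaluated over all pixels, pixels visible in both frames, and occluded pixels respectively. A minimal sketch of the metric (the function name and mask convention are my own):

```python
import numpy as np

def average_endpoint_error(flow_est, flow_gt, mask=None):
    """AEPE: mean Euclidean distance between estimated and ground-truth flow.

    flow_est, flow_gt: arrays of shape (2, H, W).
    mask: optional boolean (H, W) selecting the evaluation region,
    e.g. matched or unmatched (occluded) pixels.
    """
    epe = np.sqrt(((flow_est - flow_gt) ** 2).sum(axis=0))  # per-pixel error
    return float(epe[mask].mean() if mask is not None else epe.mean())
```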
| Method | d0-10 /pixel | d10-60 /pixel | d60-140 /pixel | s0-10 /ppf | s10-40 /ppf | s40+ /ppf |
|---|---|---|---|---|---|---|
| JOF | 7.049 | 4.617 | 3.131 | 1.170 | 4.196 | 57.923 |
| CPM-Flow | 5.038 | 2.419 | 2.143 | 1.155 | 3.755 | 35.136 |
| EpicFlow | 5.205 | 2.611 | 2.216 | 1.135 | 3.727 | 38.021 |
| FlowNet2 | 5.139 | 2.786 | 2.102 | 1.243 | 4.027 | 34.505 |
| PWC-Net | 4.636 | 2.087 | 1.475 | 0.799 | 2.986 | 31.070 |
| LiteFlowNet2 | 4.048 | 1.899 | 1.473 | 0.811 | 2.433 | 29.375 |
| UnFlow | 8.407 | 5.828 | 4.665 | 1.742 | 6.689 | 60.765 |
| DDFlow | 4.208 | 2.084 | 1.416 | 0.860 | 2.562 | 41.337 |
| VCN | 4.381 | 1.782 | 1.423 | 0.955 | 2.725 | 25.570 |
| Ours | 3.935 | 1.935 | 1.842 | 1.082 | 2.387 | 27.462 |

Table 2  Optical flow results in occlusion-boundary and large-displacement regions of the MPI-Sintel dataset
| Method | KITTI2015 Fl-all | KITTI2015 Fl-fg | KITTI2015 Fl-bg | KITTI2012 Fl-all | KITTI2012 Fl-noc |
|---|---|---|---|---|---|
| CPM-Flow | 22.32 | 22.81 | 22.40 | 5.79 | 13.70 |
| EpicFlow | 25.81 | 28.69 | 26.29 | 7.88 | 17.08 |
| FlowNet2 | 10.41 | 8.75 | 10.75 | 8.80 | 4.82 |
| PWC-Net | 9.60 | 9.31 | 9.66 | 8.10 | 4.22 |
| LiteFlowNet2 | 7.62 | 7.64 | 7.62 | 6.16 | 2.63 |
| UnFlow | 11.11 | 15.93 | 10.15 | 8.42 | 4.28 |
| DDFlow | 14.29 | 20.40 | 13.08 | 7.34 | 4.57 |
| VCN | 6.30 | 8.66 | 5.83 | 5.64 | 2.48 |
| Ours | 5.91 | 8.11 | 5.46 | 5.54 | 2.38 |

Table 3  Comparison of optical flow results on the KITTI dataset (%)
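The Fl columns report the KITTI outlier percentage: a pixel counts as an outlier when its endpoint error exceeds both 3 pixels and 5% of the ground-truth flow magnitude. A hedged sketch of this metric (the function name and `valid`-mask argument are my own):

```python
import numpy as np

def fl_outlier_rate(flow_est, flow_gt, valid=None):
    """KITTI Fl metric: percentage of pixels whose endpoint error exceeds
    both 3 px and 5% of the ground-truth flow magnitude.

    flow_est, flow_gt: (2, H, W); valid: optional boolean (H, W) mask,
    e.g. non-occluded pixels for Fl-noc.
    """
    epe = np.sqrt(((flow_est - flow_gt) ** 2).sum(axis=0))
    mag = np.sqrt((flow_gt ** 2).sum(axis=0))
    outlier = (epe > 3.0) & (epe > 0.05 * mag)
    if valid is not None:
        outlier, count = outlier[valid], valid.sum()
    else:
        count = outlier.size
    return 100.0 * outlier.sum() / count
```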
| Model | Fl-bg /% | Fl-fg /% | Fl-all /% | Training time /h | Test time /s |
|---|---|---|---|---|---|
| Baseline | 5.83 | 8.66 | 6.30 | 227 | 0.18 |
| Base_Occ | 6.13 | 8.60 | 6.54 | 255 | 0.19 |
| Base_ResF | 5.44 | 8.17 | 5.89 | 241 | 0.19 |
| Ours | 5.46 | 8.11 | 5.91 | 279 | 0.20 |

Table 4  Ablation comparison of the proposed method's model variants
[1] WU X J, JU G L. A markerless facial expression capture and reproduce algorithm[J]. Acta Electronica Sinica, 2016, 44(9): 2141-2147. (in Chinese)
[2] JI Y L, YANG Y, SHEN F M, et al. Arbitrary-view human action recognition: A varying-view RGB-D action dataset[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(1): 289-300.
[3] DING X Y, ZHANG X. Visual tracking with salient features and selective mechanism[J]. Acta Electronica Sinica, 2020, 48(1): 118-123. (in Chinese)
[4] ZHANG C X, CHEN Z, LI M. Review of the 3D reconstruction technology based on optical flow of monocular image sequence[J]. Acta Electronica Sinica, 2016, 44(12): 3044-3052. (in Chinese)
[5] HORN B K P, SCHUNCK B G. Determining optical flow[J]. Artificial Intelligence, 1981, 17(1/2/3): 185-203.
[6] MEI L, LAI J H, XIE X H, et al. Illumination-invariance optical flow estimation using weighted regularization transform[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(2): 495-508.
[7] ZHANG C X, GE L Y, CHEN Z, et al. Refined TV-L1 optical flow estimation using joint filtering[J]. IEEE Transactions on Multimedia, 2020, 22(2): 349-364.
[8] SUN D Q, ROTH S, BLACK M J. A quantitative analysis of current practices in optical flow estimation and the principles behind them[J]. International Journal of Computer Vision, 2014, 106(2): 115-137.
[9] ZHANG C X, CHEN Z, WANG M R, et al. Robust non-local TV-L1 optical flow estimation with occlusion detection[J]. IEEE Transactions on Image Processing, 2017, 26(8): 4055-4067.
[10] ZIMMER H, BRUHN A, WEICKERT J. Optic flow in harmony[J]. International Journal of Computer Vision, 2011, 93(3): 368-388.
[11] SEVILLA-LARA L, SUN D Q, JAMPANI V, et al. Optical flow with semantic segmentation and localized layers[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 3889-3898.
[12] ZHANG C X, ZHOU Z K, CHEN Z, et al. Research progress of deep learning based optical flow computation technology[J]. Acta Electronica Sinica, 2020, 48(9): 1841-1849. (in Chinese)
[13] DOSOVITSKIY A, FISCHER P, ILG E, et al. FlowNet: Learning optical flow with convolutional networks[C]//2015 IEEE International Conference on Computer Vision. Boston: IEEE, 2015: 2758-2766.
[14] ILG E, MAYER N, SAIKIA T, et al. FlowNet 2.0: Evolution of optical flow estimation with deep networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Hawaii: IEEE, 2017: 1647-1655.
[15] RANJAN A, BLACK M J. Optical flow estimation using a spatial pyramid network[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 2720-2729.
[16] SUN D Q, YANG X D, LIU M Y, et al. PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8934-8943.
[17] HUI T W, TANG X O, LOY C C. LiteFlowNet: A lightweight convolutional neural network for optical flow estimation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8981-8989.
[18] ZHAO S Y, SHENG Y L, DONG Y, et al. MaskFlownet: Asymmetric feature matching with learnable occlusion mask[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 6277-6286.
[19] LU Y, VALMADRE J, WANG H, et al. Devon: Deformable volume network for learning optical flow[C]//2020 IEEE Winter Conference on Applications of Computer Vision. Snowmass: IEEE Press, 2020: 2705-2713.
[20] YANG G, RAMANAN D. Volumetric correspondence networks for optical flow[C]//33rd International Conference on Neural Information Processing Systems. Vancouver: Curran Associates Inc, 2019: 794-805.
[21] YU J J, HARLEY A W, DERPANIS K G. Back to basics: Unsupervised learning of optical flow via brightness constancy and motion smoothness[C]//European Conference on Computer Vision. Cham: Springer, 2016: 3-10.
[22] MEISTER S, HUR J, ROTH S. UnFlow: Unsupervised learning of optical flow with a bidirectional census loss[C]//AAAI Conference on Artificial Intelligence. New Orleans: AAAI Press, 2018: 7251-7259.
[23] LIU P P, KING I, LYU M R, et al. DDFlow: Learning optical flow with unlabeled data distillation[C]//AAAI Conference on Artificial Intelligence. Hawaii: AAAI Press, 2019, 33: 8770-8777.
[24] LAI W S, HUANG J B, YANG M H. Semi-supervised learning for optical flow with generative adversarial networks[C]//31st International Conference on Neural Information Processing Systems. Long Beach: Curran Associates Inc, 2017: 353-363.
[25] SONG X L, ZHAO Y Y, YANG J Y, et al. FPCR-net: Feature pyramidal correlation and residual reconstruction for optical flow estimation[EB/OL]. (2020-01-17).
[26] KINGMA D, BA J. Adam: A method for stochastic optimization[C]//International Conference for Learning Representations. San Diego: Elsevier Press, 2015: 1-15.
[27] MAYER N, ILG E, HÄUSSER P, et al. A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4040-4048.
[28] BUTLER D J, WULFF J, STANLEY G B, et al. A naturalistic open source movie for optical flow evaluation[C]//European Conference on Computer Vision. Berlin, Heidelberg: Springer, 2012: 611-625.
[29] HU Y L, SONG R, LI Y S. Efficient coarse-to-fine patch match for large displacement optical flow[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 5704-5712.
[30] REVAUD J, WEINZAEPFEL P, HARCHAOUI Z, et al. EpicFlow: Edge-preserving interpolation of correspondences for optical flow[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1164-1172.
[31] MENZE M, GEIGER A. Object scene flow for autonomous vehicles[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3061-3070.