Acta Electronica Sinica ›› 2020, Vol. 48 ›› Issue (8): 1472-1478. DOI: 10.3969/j.issn.0372-2112.2020.08.003

• Research Paper •

Robust Visual Tracking Based on Second-Order Pooling Network

PU Lei1, FENG Xin-xi2, HOU Zhi-qiang3, YU Wang-sheng2

  1. Graduate College, Air Force Engineering University, Xi'an, Shaanxi 710077, China;
  2. Institute of Information and Navigation, Air Force Engineering University, Xi'an, Shaanxi 710077, China;
  3. School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121, China
  • Received: 2019-11-17 Revised: 2020-02-21 Online: 2020-08-25 Published: 2020-08-25
  • Corresponding author: PU Lei
  • About the author: FENG Xin-xi, male, born in October 1964 in Fuping, Shaanxi. He received his Ph.D. from Northwestern Polytechnical University in 1991 and is now a professor and doctoral supervisor at the Institute of Information and Navigation, Air Force Engineering University. His main research interests include information fusion, signal processing, and target tracking. E-mail: fengxinxi2005@aliyun.com
  • Supported by: National Natural Science Foundation of China (No.61571458, No.61703423)

Abstract: To address the problem that targets are easily lost in complex scenes involving low resolution, occlusion, and interference from similar objects, this paper proposes a visual tracking algorithm based on a second-order pooling network. Most existing methods use first-order pooling networks, which discriminate poorly between low-resolution targets and between similar targets. Building on the VGG16 architecture, the last first-order pooling layer of the network is replaced with a second-order covariance pooling layer, and the network is then retrained on the ImageNet and CUB200-2011 datasets. In the tracking stage, to reduce the computational burden, only the fourth-layer convolutional features of the pre-trained network are extracted as the appearance representation of the target. Finally, the extracted features are combined with an existing correlation filter algorithm. Experimental results show that the proposed algorithm achieves excellent tracking precision and success rate.

Key words: visual tracking, second-order pooling network, deep features, correlation filter
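
The difference between first-order and second-order pooling mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `average_pool` and `covariance_pool`, the `eps` ridge term, and the toy feature maps are assumptions made for illustration, and any normalization the paper's network may apply to the covariance matrix is omitted.

```python
import numpy as np


def average_pool(features: np.ndarray) -> np.ndarray:
    """First-order pooling: per-channel mean of a (C, H, W) feature map."""
    return features.mean(axis=(1, 2))


def covariance_pool(features: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Second-order pooling: (C, C) channel covariance of a (C, H, W) map.

    `eps` adds a small ridge to the diagonal for numerical stability;
    it is an illustrative assumption, not a value from the paper.
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w)          # one column per spatial location
    x = x - x.mean(axis=1, keepdims=True)   # center each channel
    cov = x @ x.T / (h * w)                 # channel-by-channel covariance
    return cov + eps * np.eye(c)


# Two toy feature maps with identical channel means but opposite
# channel correlation (hypothetical data, for illustration only).
base = np.arange(16, dtype=float).reshape(4, 4)
f1 = np.stack([base, base])                 # channels move together
f2 = np.stack([base, base[::-1, ::-1]])     # channels move in opposition

same_first_order = np.allclose(average_pool(f1), average_pool(f2))
cov1 = covariance_pool(f1)
cov2 = covariance_pool(f2)
print("first-order descriptors equal:", same_first_order)
print("off-diagonal covariance:", cov1[0, 1], cov2[0, 1])
```

In this toy case the first-order descriptors of the two maps coincide, while the off-diagonal covariance terms have opposite signs. This is the kind of inter-channel information that second-order pooling retains and that average pooling discards.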

CLC Number: