Acta Electronica Sinica ›› 2019, Vol. 47 ›› Issue (10): 2134-2141. DOI: 10.3969/j.issn.0372-2112.2019.10.015
Topics: Machine Learning for Image Processing; Outstanding Papers (2022)
GE Shu-yu, GAO Zi-lin, ZHANG Bing-bing, LI Pei-hua (葛疏雨, 高子淋, 张冰冰, 李培华)
Received: 2018-09-03
Revised: 2019-01-14
Online: 2019-10-25
Published: 2019-10-25
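Abstract: Bilinear convolutional neural networks (Bilinear CNN, B-CNN) are widely used in computer vision tasks. By taking the outer product of the features output by a convolutional layer, B-CNN models the linear correlations between different channels and thereby strengthens the representational power of the convolutional network. Because it does not consider the nonlinear relationships between channels in the feature maps, however, the method cannot fully exploit the richer information that the channels carry. To address this shortcoming, this paper proposes a kernelized bilinear convolutional network that uses kernel functions to model the nonlinear relationships between channels in the feature maps, further strengthening the representational power of the network. The method is evaluated on three widely used fine-grained datasets, CUB-200-2011, FGVC-Aircraft, and Cars; the experiments show that it outperforms comparable methods on all three.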
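The contrast drawn in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say which kernel, normalization, or network layers the paper uses, so the Gaussian RBF kernel, the `gamma` value, the function names, and the random feature matrix below are all assumptions made for illustration. Plain bilinear pooling fills entry (i, j) of the output with the inner product of channels i and j; the kernelized variant replaces that inner product with a kernel value k(x_i, x_j).

```python
import numpy as np


def bilinear_pool(features):
    """Plain bilinear pooling.

    features: array of shape (N, C), N spatial positions and C channels.
    Returns the (C, C) matrix of averaged outer products, whose entry
    (i, j) is the inner product between channel i and channel j.
    """
    n = features.shape[0]
    return features.T @ features / n


def kernelized_bilinear_pool(features, gamma=0.01):
    """Hypothetical kernelized variant using a Gaussian RBF kernel.

    Each channel is treated as an N-dimensional vector, and entry (i, j)
    is exp(-gamma * ||x_i - x_j||^2). The kernel choice and gamma are
    assumptions, not taken from the paper.
    """
    sq_norms = np.sum(features ** 2, axis=0)            # (C,)
    gram = features.T @ features                        # (C, C)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram
    sq_dists = np.maximum(sq_dists, 0.0)                # clip round-off
    return np.exp(-gamma * sq_dists)


# Stand-in for a 7x7 convolutional feature map with 8 channels.
rng = np.random.default_rng(0)
feats = rng.standard_normal((49, 8))

linear = bilinear_pool(feats)
kernel = kernelized_bilinear_pool(feats)
print(linear.shape, kernel.shape)
```

Both outputs are symmetric C×C matrices, so either could feed the same downstream classifier. Only the kernelized version responds to nonlinear relationships between channels.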
GE Shu-yu, GAO Zi-lin, ZHANG Bing-bing, LI Pei-hua. Kernelized Bilinear CNN Models for Fine-Grained Visual Recognition (基于核化双线性卷积网络的细粒度图像分类)[J]. Acta Electronica Sinica, 2019, 47(10): 2134-2141.