Acta Electronica Sinica (电子学报) ›› 2019, Vol. 47 ›› Issue (10): 2134-2141. DOI: 10.3969/j.issn.0372-2112.2019.10.015
Special Topic: Machine Learning for Image Processing; Outstanding Papers (2022)
葛疏雨, 高子淋, 张冰冰, 李培华
Received:
2018-09-03
Revised:
2019-01-14
Online:
2019-10-25
Published:
2019-10-25
Corresponding author:
李培华 (LI Pei-hua)
About the authors:
GE Shu-yu (male, born February 1994 in Suzhou, Anhui) is a master's student at the School of Information and Communication Engineering, Dalian University of Technology. His research interests include computer vision and deep learning. E-mail: gsy@mail.dlut.edu.cn
GAO Zi-lin (female, born June 1995 in Harbin, Heilongjiang) is a master's student at the School of Information and Communication Engineering, Dalian University of Technology. Her research interests include deep learning and computer vision. E-mail: gzl@mail.dlut.edu.cn
ZHANG Bing-bing (female, born May 1990 in Shenyang, Liaoning) is a Ph.D. student at the School of Information and Communication Engineering, Dalian University of Technology. Her research interests include deep learning and video action recognition. E-mail: icyzhang@mail.dlut.edu.cn
Funding:
GE Shu-yu, GAO Zi-lin, ZHANG Bing-bing, LI Pei-hua
Abstract: Bilinear convolutional networks (Bilinear CNN, B-CNN) are widely used in computer vision tasks. By taking the outer product of the features output by a convolutional layer, B-CNN models the linear correlations between different channels, which strengthens the representational power of the network. However, because it does not consider the nonlinear relationships between the channels of the feature maps, this method cannot fully exploit the richer information those channels carry. To address this limitation, this paper proposes a kernelized bilinear convolutional network, which uses kernel functions to effectively model the nonlinear relationships between channels and further enhance the network's representational power. The proposed method is evaluated on three widely used fine-grained datasets, CUB-200-2011, FGVC-Aircraft, and Cars, and experiments show that it outperforms comparable methods on all three.
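The full formulation is not reproduced on this page; as a minimal illustrative sketch (not the authors' exact model), standard bilinear pooling of a C×HW feature map is the averaged channel-wise outer product, and a kernelized variant can replace the plain inner product between channel vectors with, e.g., a polynomial kernel so that nonlinear channel relations are also captured. The function names and hyperparameters below are hypothetical:

```python
import numpy as np

def bilinear_pool(feat):
    """Standard bilinear pooling: averaged channel-wise outer product.

    feat: (C, N) array -- C channels flattened over N = H*W positions.
    Returns a (C, C) matrix of linear channel correlations.
    """
    C, N = feat.shape
    return feat @ feat.T / N

def kernelized_pool(feat, gamma=1.0, c=1.0, p=2):
    """Illustrative kernelized variant (hyperparameters are assumptions):
    apply a polynomial kernel to the inner products between channel
    vectors, modeling nonlinear channel relations.
    """
    linear = feat @ feat.T / feat.shape[1]   # (C, C) averaged inner products
    return (gamma * linear + c) ** p         # element-wise polynomial kernel

# Example: a random "feature map" with 4 channels over a 3x3 spatial grid
X = np.random.randn(4, 9)
B = bilinear_pool(X)       # linear channel correlations, shape (4, 4)
K = kernelized_pool(X)     # kernelized descriptor, shape (4, 4)
```

Both descriptors are symmetric C×C matrices; in the B-CNN setting they are typically vectorized (often with sign square-root and L2 normalization) and fed to the classifier.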
CLC number:
GE Shu-yu, GAO Zi-lin, ZHANG Bing-bing, LI Pei-hua. Kernelized Bilinear CNN Models for Fine-Grained Visual Recognition[J]. Acta Electronica Sinica, 2019, 47(10): 2134-2141.