Acta Electronica Sinica, 2019, Vol. 47, Issue (10): 2134-2141. DOI: 10.3969/j.issn.0372-2112.2019.10.015
Special topics: Machine Learning for Image Processing; Outstanding Papers (2022)
GE Shu-yu, GAO Zi-lin, ZHANG Bing-bing, LI Pei-hua
Received: 2018-09-03
Revised: 2019-01-14
Online: 2019-10-25
Published: 2019-10-25
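Abstract: Bilinear convolutional neural networks (Bilinear CNN, B-CNN) are widely used in computer vision tasks. By taking the outer product of the features output by a convolutional layer, B-CNN models the linear correlations between channels and thereby strengthens the representational power of the network. Because it ignores the nonlinear relationships between the channels of the feature map, however, it cannot fully exploit the richer information those channels carry. To address this shortcoming, this paper proposes a kernelized bilinear convolutional network that uses kernel functions to model the nonlinear relationships between channels of the feature map, further strengthening the representational power of the network. The method is evaluated on three widely used fine-grained datasets, CUB-200-2011, FGVC-Aircraft, and Cars, and the experiments show that it outperforms comparable methods on all three.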
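The contrast the abstract draws between linear and nonlinear channel relationships can be illustrated with a small sketch. The first function below computes the averaged outer product used in bilinear pooling; the second is a hypothetical kernelized counterpart that compares channels with an RBF kernel. The abstract does not say which kernel the paper uses or how it is normalized, so the function names, the RBF choice, the `gamma` parameter, and the division by the number of locations are all assumptions made for illustration.

```python
import math


def bilinear_pool(features):
    """Linear second-order pooling: average outer product over locations.

    `features` is a list of N local descriptors, each a list of C channel
    responses. Entry (i, j) of the result is the mean of
    channel_i * channel_j over the N locations.
    """
    n, c = len(features), len(features[0])
    return [[sum(f[i] * f[j] for f in features) / n for j in range(c)]
            for i in range(c)]


def rbf_kernel_pool(features, gamma=1.0):
    """Hypothetical kernelized variant (assumption, not the paper's formula).

    Each channel is treated as an N-dimensional vector of its responses over
    all locations. Entry (i, j) is exp(-gamma * ||ch_i - ch_j||^2 / N), so the
    inner product between channels is replaced by a nonlinear similarity.
    """
    n, c = len(features), len(features[0])
    channels = [[f[i] for f in features] for i in range(c)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-gamma * sqdist(channels[i], channels[j]) / n)
             for j in range(c)] for i in range(c)]


# Toy feature map: 4 spatial locations, 3 channels.
features = [[1.0, 2.0, 0.0],
            [0.0, 1.0, 1.0],
            [2.0, 0.0, 1.0],
            [1.0, 1.0, 2.0]]

linear = bilinear_pool(features)                # 3 x 3 linear correlations
kernel = rbf_kernel_pool(features, gamma=0.5)   # 3 x 3 nonlinear similarities
```

Both functions return a symmetric C x C matrix, so in this sketch the kernelized matrix could feed the same downstream classifier as the bilinear one. Only the way two channels are compared changes.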
GE Shu-yu, GAO Zi-lin, ZHANG Bing-bing, et al. Kernelized Bilinear CNN Models for Fine-Grained Visual Recognition[J]. Acta Electronica Sinica, 2019, 47(10): 2134-2141.