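1. Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190
2. Key Laboratory of Advanced Underwater Information Technology, Chinese Academy of Sciences, Beijing 100190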
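LI Bao-qi: male, born in Tianjin in 1985, special research assistant at the Institute of Acoustics, Chinese Academy of Sciences. His research focuses on underwater target detection, recognition, and tracking. E-mail: libaoqi@mail.ioa.ac.cn
HUANG Hai-ning (corresponding author): male, born in Hebei in 1969, researcher and doctoral supervisor at the Institute of Acoustics, Chinese Academy of Sciences, and recipient of the State Council Government Allowance. He currently serves as deputy director of the Science and Technology Committee of the CAS Innovation Academy for Marine Information Technology and the Institute of Acoustics, director of the CAS Key Laboratory of Advanced Underwater Information Technology and the Underwater Acoustic Engineering Center, and council member of the Acoustical Society of China. His research focuses on underwater acoustic signal and information processing, target detection, and underwater acoustic communication and networks. E-mail: hhn@mail.ioa.ac.cn
LIU Ji-yuan: male, born in Liaoning in 1963, researcher and doctoral supervisor at the Institute of Acoustics, Chinese Academy of Sciences. He is a member of the Vision and Sensing Technical Committee of the China Society of Image and Graphics and chief expert of the National "863" Program project "Research on Synthetic Aperture Sonar Systems Based on Unmanned Platforms". His research areas include underwater acoustic signal processing and high-resolution underwater imaging. E-mail: ljy@mail.ioa.ac.cn
LI Yu: male, born in Guizhou in 1977, researcher at the Institute of Acoustics, Chinese Academy of Sciences. His research covers underwater acoustic signal processing, underwater acoustic communication and networks, sonar technology for unmanned platforms, active sonar technology, and array signal processing. E-mail: ly@mail.ioa.ac.cn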
Received: 2020-07-14
Revised: 2021-05-17
Published in print: 2021-09-25
LI Bao-qi, HUANG Hai-ning, LIU Ji-yuan, et al. Optical Image-to-Underwater Small Target Synthetic Aperture Sonar Image Translation Algorithm Based on Improved CycleGAN[J]. Acta Electronica Sinica, 2021, 49(09): 1746-1753. DOI: 10.12263/DZXB.20200712.
When the cycle-consistent generative adversarial network CycleGAN (Cycle Generative Adversarial Networks) is used to translate optical images into synthetic aperture sonar (SAS) images of small underwater targets, the generated images are of poor quality and generation is slow. To address these problems, this paper proposes a new feature extraction unit, SDK (Selective Dilated Kernel), and builds a new generator network, SDKNet, by stacking SDK blocks. It also proposes a new cycle-consistency loss, MS-CCLF (Multiscale Cyclic Consistent Loss Function), which adds a Multiscale Structural Similarity Index (MS-SSIM) constraint between input images and reconstructed images. On the self-built image translation dataset OPT-SAS, the proposed SM-CycleGAN (Selective and Multiscale Cycle Generative Adversarial Networks) improves image translation quality, reported as classification accuracy, by 4.64% over the original CycleGAN, reduces the generator parameters by 4.13 MB, and cuts computation time by 0.143 s. The experimental results show that SM-CycleGAN is better suited to translating optical images of small underwater targets into SAS images.
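To make the idea behind MS-CCLF concrete, the following is a minimal sketch of a cycle-consistency loss that combines a pixel-wise L1 term with a multiscale structural similarity term between an input image and its reconstruction. It is an illustration under stated assumptions, not the paper's formulation: the abstract does not give the loss equation, so the weight `alpha`, the number of scales, the equal weighting across scales, and the use of whole-image statistics in place of the windowed, exponent-weighted MS-SSIM are all simplifications chosen here. All function names are hypothetical.

```python
# Simplified stand-in for a multiscale cycle-consistency loss.
# Assumptions (not from the paper): alpha, scales, equal per-scale weights,
# and global (non-windowed) SSIM statistics. Images are lists of rows in [0, 1].

def flatten(img):
    return [p for row in img for p in row]

def mean(values):
    return sum(values) / len(values)

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM from whole-image mean, variance, and covariance (no sliding window)."""
    xs, ys = flatten(x), flatten(y)
    mx, my = mean(xs), mean(ys)
    vx = mean([(a - mx) * (a - mx) for a in xs])
    vy = mean([(b - my) * (b - my) for b in ys])
    cov = mean([(a - mx) * (b - my) for a, b in zip(xs, ys)])
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den

def downsample(img):
    """2x2 average pooling; assumes even height and width."""
    h, w = len(img), len(img[0])
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

def multiscale_ssim(x, y, scales=3):
    """Equal-weight average of global SSIM over `scales` resolutions."""
    total = 0.0
    for s in range(scales):
        if s > 0:
            x, y = downsample(x), downsample(y)
        total += global_ssim(x, y)
    return total / scales

def l1_distance(x, y):
    return mean([abs(a - b) for a, b in zip(flatten(x), flatten(y))])

def ms_cycle_loss(x, x_rec, alpha=0.5, scales=3):
    """(1 - alpha) * L1 + alpha * (1 - multiscale SSIM) between input and reconstruction."""
    structural = 1.0 - multiscale_ssim(x, x_rec, scales)
    return (1.0 - alpha) * l1_distance(x, x_rec) + alpha * structural

if __name__ == "__main__":
    x = [[(i * 8 + j) / 63.0 for j in range(8)] for i in range(8)]
    inverted = [[1.0 - p for p in row] for row in x]
    print(ms_cycle_loss(x, x))         # perfect reconstruction
    print(ms_cycle_loss(x, inverted))  # structurally wrong reconstruction
```

In this sketch a perfect reconstruction drives both terms to zero, while a reconstruction that keeps the average brightness but destroys the structure is penalized mainly through the structural term. In a real training setup the same idea would be expressed with a differentiable, windowed MS-SSIM from a deep learning framework.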