Optical Image-to-Underwater Small Target Synthetic Aperture Sonar Image Translation Algorithm Based on Improved CycleGAN

LI Bao-qi (李宝奇); HUANG Hai-ning (黄海宁); LIU Ji-yuan (刘纪元); LI Yu (李宇)

doi:10.12263/DZXB.20200712

Research article | Updated: 2025-12-08

- Acta Electronica Sinica (电子学报), 2021, Vol. 49, No. 9, pp. 1746-1753
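- Author affiliations:
  1. Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
  2. Key Laboratory of Advanced Underwater Information Technology, Chinese Academy of Sciences, Beijing 100190, China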
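- About the authors:
  LI Bao-qi, male, born in Tianjin in 1985, is a special research assistant at the Institute of Acoustics, Chinese Academy of Sciences. His research focuses on underwater target detection, recognition, and tracking. E-mail: libaoqi@mail.ioa.ac.cn
  HUANG Hai-ning (corresponding author), male, born in Hebei in 1969, is a research professor and doctoral supervisor at the Institute of Acoustics, Chinese Academy of Sciences, and a recipient of the State Council Special Government Allowance. He currently serves as deputy director of the Science and Technology Committee of the CAS Innovation Academy for Marine Information Technology and the Institute of Acoustics, director of the CAS Key Laboratory of Advanced Underwater Information Technology and the Underwater Acoustic Engineering Center, and council member of the Acoustical Society of China. His research covers underwater acoustic signal and information processing, target detection, and underwater acoustic communication and networking. E-mail: hhn@mail.ioa.ac.cn
  LIU Ji-yuan, male, born in Liaoning in 1963, is a research professor and doctoral supervisor at the Institute of Acoustics, Chinese Academy of Sciences. He is a member of the Vision and Sensing Technical Committee of the China Society of Image and Graphics and chief expert of the national "863" Program project "Research on Synthetic Aperture Sonar Systems Based on Unmanned Platforms". His research areas include underwater acoustic signal processing and high-resolution underwater imaging. E-mail: ljy@mail.ioa.ac.cn
  LI Yu, male, born in Guizhou in 1977, is a research professor at the Institute of Acoustics, Chinese Academy of Sciences. His research covers underwater acoustic signal processing, underwater acoustic communication and networking, sonar technology for unmanned platforms, active sonar, and array signal processing. E-mail: ly@mail.ioa.ac.cn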
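- Funding:
  National Natural Science Foundation of China (11904386); Major Project of the National Basic Scientific Research Program (JCKY2016206A003); Youth Innovation Promotion Association of the Chinese Academy of Sciences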
- DOI: 10.12263/DZXB.20200712
  CLC number: TP391
- Received: 2020-07-14
  Revised: 2021-05-17
  Published in print: 2021-09-25
LI Bao-qi, HUANG Hai-ning, LIU Ji-yuan, et al. Optical Image-to-Underwater Small Target Synthetic Aperture Sonar Image Translation Algorithm Based on Improved CycleGAN[J]. Acta Electronica Sinica, 2021, 49(09): 1746-1753. DOI: 10.12263/DZXB.20200712.

Abstract

When the cycle-consistent generative adversarial network CycleGAN (Cycle Generative Adversarial Networks) is used to translate optical images into synthetic aperture sonar (SAS) images of small underwater targets, it produces low-quality images and runs slowly. To address these problems, this paper proposes a new feature-extraction unit, SDK (Selective Dilated Kernel), and stacks SDK blocks to build a new generator network, SDKNet. It also proposes a new cycle-consistency loss, MS-CCLF (Multiscale Cycle Consistent Loss Function), which adds a multiscale structural similarity (MS-SSIM) constraint between input images and reconstructed images. On the authors' self-built image translation dataset OPT-SAS, SM-CycleGAN (Selective and Multiscale Cycle Generative Adversarial Networks) improves image translation quality, reported as classification accuracy, by 4.64% over the original CycleGAN, reduces the generator's parameter size by 4.13 MB, and cuts running time by 0.143 s. The experimental results show that SM-CycleGAN is better suited than CycleGAN to translating optical images of small underwater targets into SAS images.
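
The MS-CCLF idea summarized in the abstract, a cycle-consistency loss with an added multiscale structural similarity term between an input image and its reconstruction, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the weight `lam`, the number of scales, the 2x2 average-pool downsampling, and the use of one global SSIM window averaged over scales (a simplified stand-in for the standard Gaussian-windowed MS-SSIM) are all assumptions made for illustration.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole image as a single window (simplification)."""
    c1 = (0.01 * data_range) ** 2  # usual SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return num / den


def downsample2(img):
    """Halve the resolution with 2x2 average pooling (assumed downsampler)."""
    h, w = img.shape
    h2, w2 = h // 2 * 2, w // 2 * 2
    img = img[:h2, :w2]
    return img.reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))


def multiscale_ssim(x, y, scales=3):
    """Average the global SSIM over several resolutions (hypothetical variant)."""
    values = []
    for _ in range(scales):
        values.append(global_ssim(x, y))
        x, y = downsample2(x), downsample2(y)
    return float(np.mean(values))


def ms_cycle_loss(x, x_rec, lam=0.5, scales=3):
    """L1 cycle loss plus a weighted (1 - multiscale SSIM) penalty.

    x     : input image, 2-D float array in [0, 1]
    x_rec : reconstruction after a full cycle, e.g. optical -> SAS -> optical
    lam   : weight of the structural term (assumed value, not from the paper)
    """
    l1 = np.abs(x - x_rec).mean()
    return float(l1 + lam * (1.0 - multiscale_ssim(x, x_rec, scales)))
```

A perfect reconstruction gives a loss of zero (up to floating-point error), while a reconstruction that matches pixel intensities on average but loses structure is still penalized by the similarity term. In a real training loop the same quantity would be written with a differentiable framework so that gradients reach both generators.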
