Acta Electronica Sinica ›› 2019, Vol. 47 ›› Issue (5): 1058-1064. DOI: 10.3969/j.issn.0372-2112.2019.05.012

Special topic: Outstanding Papers (2022)

• Research Paper •

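Asymmetric Parallel Semantic Segmentation Model Based on Full Convolutional Neural Network

LI Bao-qi (李宝奇), HE Yu-yao (贺昱曜), HE Ling-jiao (何灵蛟), QIANG Wei (强伟)

  1. School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
  • Received: 2018-06-04 Revised: 2019-01-17 Online: 2019-05-25 Published: 2019-05-25
  • Corresponding author: LI Bao-qi
  • About the authors: HE Yu-yao, male, born in 1956 in Fuping, Shaanxi. Professor and doctoral supervisor at Northwestern Polytechnical University. His main research interests are precision guidance and simulation, intelligent control and intelligent optimization theory, and image processing theory and algorithms. E-mail: heyyao@nwpu.edu.cn; HE Ling-jiao, male, born in December 1994 in Huining, Gansu. He is currently pursuing a master's degree at the School of Marine Science and Technology, Northwestern Polytechnical University. His research interests are image enhancement, image semantic segmentation, and object detection and recognition. E-mail: helingjiao@mail.nwpu.edu.cn; QIANG Wei, male, born in December 1986 in Yan'an, Shaanxi. He is currently pursuing a master's degree at the School of Marine Science and Technology, Northwestern Polytechnical University. His research interests are deep learning theory, including image classification, image semantic segmentation, and object detection and recognition. E-mail: xgd2017qw@mail.nwpu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (No. 61271143)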

Abstract: RGB images are rich in color detail, whereas infrared images are highly sensitive to the shape features of a target, such as its contour, size, and boundary. To exploit these complementary properties, a novel semantic segmentation model, APFCN (Asymmetric Parallelism Fully Convolutional Networks), is proposed. The upper branch of APFCN is a five-layer dilated convolutional network with non-uniform kernel sizes, designed to extract high-level contour features of targets from the infrared image. The lower branch uses a classical convolution-plus-pooling network to extract detail features of the RGB image at three scales. At the back end, the high-level infrared features are fused with the three scales of RGB detail features, and the fused features, after 4× upsampling, are used as the semantic segmentation output. The results show that APFCN outperforms FCN (with either an RGB image or an infrared image as input) in PA (Pixel Accuracy), MIoU (Mean Intersection over Union), and other metrics. APFCN is suitable for semantic segmentation of ground targets against backgrounds of consistent color.
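The abstract compares models by PA and MIoU but this page does not spell out how they are computed. The following is a minimal sketch that assumes the standard definitions (PA as the fraction of correctly labeled pixels, MIoU as the per-class intersection over union averaged across classes). The function names and the toy label maps are hypothetical and are not taken from the paper.

```python
def confusion_matrix(pred, truth, num_classes):
    """cm[t][p] = number of pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels that lie on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over the classes that actually occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / union)
    if not ious:
        raise ValueError("no class occurs in either label map")
    return sum(ious) / len(ious)


# Toy 2x3 label maps (made-up data): 0 = background, 1 = target.
truth = [[0, 0, 1],
         [0, 1, 1]]
pred = [[0, 1, 1],
        [0, 1, 0]]

cm = confusion_matrix(pred, truth, num_classes=2)
print("PA   =", pixel_accuracy(cm))
print("MIoU =", mean_iou(cm))
```

Building both metrics from one confusion matrix keeps them consistent with each other, and makes it easy to see why MIoU penalizes errors on small classes more heavily than PA does.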

Key words: semantic segmentation, fully convolutional neural network, asymmetric parallelism fully convolutional networks, dilated convolution, dilation rate
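To make the roles of kernel size and dilation rate concrete, the sketch below computes the effective kernel size of a dilated convolution, k + (k - 1)(d - 1), and the receptive field of a stack of stride-1 dilated layers. The five (kernel size, dilation rate) pairs are purely hypothetical: this page does not list the values used in APFCN, so they only illustrate a stack with non-uniform kernel sizes.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a sequence of (kernel_size, dilation) pairs; each layer
    widens the receptive field by (effective kernel size - 1).
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += effective_kernel_size(kernel_size, dilation) - 1
    return rf


# Hypothetical five-layer configuration, NOT the values from the paper.
hypothetical_layers = [(3, 1), (3, 2), (5, 2), (3, 4), (1, 1)]

for kernel_size, dilation in hypothetical_layers:
    print(kernel_size, dilation, effective_kernel_size(kernel_size, dilation))
print("receptive field:", receptive_field(hypothetical_layers))
```

Because dilation enlarges the receptive field without pooling, a stack like this can see wide context while keeping the feature map at its input resolution, which is the usual motivation for using it to capture contour-level features.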

CLC Number: