Semantic Segmentation Algorithm Based on Improved MobileNetV2

MENG Lu; XU Lei; GUO Jia-yang

doi:10.3969/j.issn.0372-2112.2020.09.015

Research article | Updated: 2025-12-08

- Semantic Segmentation Algorithm Based on Improved MobileNetV2
- Acta Electronica Sinica, 2020, Vol. 48, No. 9, pp. 1769-1776
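- Author affiliations:
  
  1. College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning 110000
  2. Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio 45221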
- Funding:
  
  National Natural Science Foundation of China (No. 61973058); Fundamental Research Funds for the Central Universities, Ministry of Education (No. N2004020)
- DOI: 10.3969/j.issn.0372-2112.2020.09.015
  CLC number: TP391
- Published online: 2020-09-25
  
  Print publication: 2020
MENG Lu, XU Lei, GUO Jia-yang. Semantic Segmentation Algorithm Based on Improved MobileNetV2[J]. Acta Electronica Sinica, 2020, 48(9): 1769-1776. DOI: 10.3969/j.issn.0372-2112.2020.09.015.

Abstract

Semantic segmentation algorithms based on pyramid convolutional neural networks achieve high accuracy, but they consume substantial computing resources, take a long time to execute, and cannot meet real-time requirements. To address these shortcomings, this paper makes the following improvements: (1) replacing the original network structure with MobileNet to reduce computation time and memory consumption; (2) introducing an encoder-decoder structure to increase the resolution of the output image and further refine the segmentation results; (3) designing a multi-level image input method that reduces the time the network needs to run inference on high-resolution images. The method was tested on the VOC 2012 and Cityscapes datasets and compared with semantic segmentation models such as FCN (Fully Convolutional Networks), SegNet, DeepLab, PSPNet, and DFN (Discriminative Feature Network). Experimental results show that the method achieves an mIoU of 76.1% on VOC 2012 and 74.1% on Cityscapes, slightly lower than traditional semantic segmentation algorithms. Processing a 1024×512 image takes 18 ms, less than traditional semantic segmentation algorithms, which meets the real-time requirement and strikes a balance between accuracy and computational resource consumption.
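The accuracy figures in the abstract are reported as mIoU (mean intersection over union). The abstract does not spell out the evaluation procedure, so the following is only a minimal sketch of the standard definition: per-class IoU = TP / (TP + FP + FN), averaged over the classes that occur. The function names `confusion_matrix` and `mean_iou` and the toy label maps are illustrative assumptions, not code from the paper.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    num_classes = len(matrix)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                                  # missed pixels of class c
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp  # wrongly labelled as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical data): 6 pixels, 2 classes, flattened label maps.
labels = [0, 0, 0, 1, 1, 1]
preds = [0, 0, 1, 1, 1, 0]
cm = confusion_matrix(labels, preds, num_classes=2)
print(cm, mean_iou(cm))
```

In this toy case each class has 2 correctly labelled pixels, 1 false positive, and 1 false negative, so both per-class IoU values are 2/4 and the mean is 0.5. Benchmark toolkits may differ in details such as ignored labels, which this sketch does not model.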
