Semantic Segmentation Algorithm Based on Improved MobileNetV2

MENG Lu; XU Lei; GUO Jia-yang

doi:10.3969/j.issn.0372-2112.2020.09.015

Updated: 2025-12-08

- Acta Electronica Sinica, Vol. 48, Issue 9, Pages: 1769-1776 (2020)
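- Author affiliations:
  
  1. College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning 110000
  2. Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio 45221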
- Funding:
  
  National Natural Science Foundation of China (No. 61973058); Fundamental Research Funds for the Central Universities (No. N2004020)
- DOI: 10.3969/j.issn.0372-2112.2020.09.015
  CLC: TP391
- Published Online: 25 September 2020
  
  Published: 2020
MENG Lu, XU Lei, GUO Jia-yang. Semantic Segmentation Algorithm Based on Improved MobileNetV2[J]. Acta Electronica Sinica, 2020, 48(9): 1769-1776. DOI: 10.3969/j.issn.0372-2112.2020.09.015.

Abstract

Semantic segmentation algorithms based on pyramid convolutional neural networks achieve high accuracy, but they consume substantial computing resources, take a long time to execute, and cannot meet real-time requirements. To address these shortcomings, this paper makes the following improvements: (1) replacing the original network structure with MobileNet to reduce computation time and memory consumption; (2) introducing an encoder-decoder structure to raise the resolution of the output image and further refine the segmentation results; (3) designing a multi-level image input method that reduces the time the network needs to infer high-resolution images. The method was tested on the VOC 2012 and Cityscapes datasets and compared with semantic segmentation models such as FCN (Fully Convolutional Networks), SegNet, DeepLab, PSPNet, and DFN (Discriminative Feature Network). Experimental results show that the method achieves an mIoU of 76.1% on the VOC 2012 dataset and 74.1% on the Cityscapes dataset, slightly lower than traditional semantic segmentation algorithms. Processing a 1024×512 image takes 18 ms, less than traditional semantic segmentation algorithms, which meets the real-time requirement and strikes a balance between accuracy and computational resource consumption.
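The accuracy figures in the abstract are reported as mIoU (mean intersection over union). As a rough illustration of how that metric is commonly computed, the following minimal sketch builds a confusion matrix from flattened label maps and averages the per-class IoU. It is not the authors' evaluation code: the function names, the toy data, and the use of 255 as an "ignore" label are assumptions made for this example.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], truth: Sequence[int],
                     num_classes: int, ignore_label: int = 255) -> List[List[int]]:
    """Count pixels: rows index the true class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, truth):
        if t == ignore_label:  # skip unlabeled pixels (assumed convention)
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                         # true class c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp    # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical data): a flattened 2x3 label map with two classes.
truth = [0, 0, 1, 1, 1, 255]
pred = [0, 1, 1, 1, 0, 1]
cm = confusion_matrix(pred, truth, num_classes=2)
print(cm)
print(round(mean_iou(cm), 4))
```

In practice the confusion matrix is accumulated over every image in the validation set before the per-class IoU values are averaged, so that large and small images contribute in proportion to their pixel counts.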
