A Quantification Method of Convolutional Neural Network Based on Data Standard Deviation

HUANG Yun (黄赟); ZHANG Fan (张帆); GUO Wei (郭威); CHEN Li (陈立); YANG Guang (羊光)

doi:10.12263/DZXB.20210691


Research article | Updated: 2025-12-08

- Original title: 一种基于数据标准差的卷积神经网络量化方法
- English title: A Quantification Method of Convolutional Neural Network Based on Data Standard Deviation
- Acta Electronica Sinica, 2023, Vol. 51, No. 3, pp. 639-647
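- Author affiliations:
  1. Information Engineering University, Zhengzhou, Henan 450001, China
  2. National Digital Switching System Engineering and Technological R&D Center, Zhengzhou, Henan 450002, China
  3. Henan Radio and Television Monitoring Center, Zhengzhou, Henan 450002, China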
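- About the authors:
  HUANG Yun, male, born in September 1993 in Xinyu, Jiangxi Province. Master's student at Information Engineering University. His main research interests are quantization and compression of neural network models and endogenous network security. E-mail: yyhuangz@163.com
  ZHANG Fan (corresponding author), male, born in September 1981. Ph.D. Currently an associate researcher and master's supervisor at the National Digital Switching System Engineering and Technological R&D Center. His main research interests are active defense, artificial intelligence, and high-performance computing. Chinese Institute of Electronics member No.: E190013697M.
  GUO Wei, male, born in August 1990. Ph.D. Currently an assistant researcher at the National Digital Switching System Engineering and Technological R&D Center. His main research interests are active defense, artificial intelligence, and high-performance computing. Chinese Institute of Electronics member No.: E190029991M. E-mail: guowjss@126.com
  CHEN Li, male, born in February 1997 in Yiwu, Zhejiang Province. Master's student at Information Engineering University. His main research interest is computer vision. E-mail: 2464863136@qq.com
  YANG Guang, female, born in November 1986 in Zhumadian, Henan Province. Bachelor's degree. Her main research interests are network traffic classification, intrusion detection, and artificial intelligence. E-mail: flyingaki@126.com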
- Funding:
  Foundation for Innovative Research Groups of the National Natural Science Foundation of China (61521003)
- DOI: 10.12263/DZXB.20210691
  CLC number: TP391
- Received: 2021-05-29; revised: 2021-10-18; published in print: 2023-03-25
Cite this article:

HUANG Yun, ZHANG Fan, GUO Wei, et al. A Quantification Method of Convolutional Neural Network Based on Data Standard Deviation[J]. Acta Electronica Sinica, 2023, 51(03): 639-647. DOI: 10.12263/DZXB.20210691.

Abstract

Current convolutional neural network models are large and computationally complex, which makes them difficult to deploy on resource-constrained computing platforms. To address this problem, this paper proposes a logarithmic quantization method based on data standard deviation that is suitable for deployment on field programmable gate arrays (FPGAs). First, according to the characteristics of FPGAs, a logarithmic quantization method is proposed that converts 32-bit floating-point multiplication into integer multiplication and shift operations, which improves computational efficiency. Then, by studying the characteristics of the data distribution, input quantization and mixed-bit weight quantization methods based on data standard deviation are proposed, which effectively reduce the quantization loss. Efficiency and accuracy comparison experiments on networks such as RepVGG and EfficientNet show that 8-bit quantization reduces the accuracy of large neural networks by only about 1%. When the input is quantized to 8 bits and the weights are quantized to 10 bits, the accuracy loss of the model is less than 0.2%, almost matching the accuracy of the floating-point model. The experiments show that the proposed method reduces model size by about 75% and, while largely maintaining the accuracy of the original model, effectively lowers power consumption and improves computational efficiency.
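The abstract does not give the quantization formulas, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: inputs are clipped at a hypothetical threshold of `k` standard deviations and quantized uniformly to signed integers, and weights are rounded to signed powers of two so that each product becomes an integer shift. The function names, the default `k = 3`, the exponent window, and `frac_bits` are all illustrative choices, not values taken from the paper.

```python
import numpy as np

def quantize_input(x, bits=8, k=3.0):
    """Uniform signed quantization with a std-based clipping threshold.

    Assumption: the threshold is k * std(x); the paper's actual rule may differ.
    Returns the integer codes and the scale needed to dequantize them.
    """
    threshold = max(k * float(np.std(x)), 1e-12)  # guard against constant input
    qmax = 2 ** (bits - 1) - 1
    scale = threshold / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def log_quantize_weight(w, min_exp=-7, max_exp=0):
    """Round each weight to sign * 2**exp (logarithmic quantization).

    Assumption: exponents are clipped to [min_exp, max_exp].
    """
    sign = np.sign(w).astype(np.int32)
    mag = np.maximum(np.abs(w), 2.0 ** (min_exp - 1))  # avoid log2(0)
    exp = np.clip(np.round(np.log2(mag)), min_exp, max_exp).astype(np.int32)
    return sign, exp

def shift_multiply(q, sign, exp, frac_bits=7):
    """Compute q * sign * 2**exp with integer shifts only.

    The result is a fixed-point integer with frac_bits fractional bits,
    so the real value is result / 2**frac_bits.
    """
    shift = exp + frac_bits
    assert np.all(shift >= 0), "frac_bits must cover the smallest exponent"
    # Shift magnitudes and restore signs, avoiding shifts of negative values.
    return sign * np.sign(q) * np.left_shift(np.abs(q), shift)

# Small demonstration with weights that are exact powers of two.
q = np.array([4, 8, 2, 16], dtype=np.int32)
w = np.array([0.5, -0.25, 1.0, 0.125])
sign, exp = log_quantize_weight(w)
acc = shift_multiply(q, sign, exp).sum()
print(acc / 2 ** 7)  # 4.0, the same as np.dot(q, w)
```

Because each weight is stored as a sign and a small exponent, the multiply in a dot product reduces to a shift and an add, which is the kind of operation the abstract describes as efficient on FPGAs. Mixed-bit weight quantization is not shown here, since the abstract does not say how bit widths are assigned.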


