Acta Electronica Sinica ›› 2023, Vol. 51 ›› Issue (3): 639-647. DOI: 10.12263/DZXB.20210691
HUANG Yun1, ZHANG Fan2, GUO Wei2, CHEN Li1, YANG Guang3
Received: 2021-05-29
Revised: 2021-10-18
Online: 2023-03-25
Published: 2023-04-20
Abstract:
Current convolutional neural network models are too large and computationally complex to be deployed on resource-constrained computing platforms. To address this, this paper proposes a logarithmic quantization method based on the data standard deviation that is well suited to deployment on Field Programmable Gate Arrays (FPGAs). First, exploiting the characteristics of FPGAs, the logarithmic quantization method converts 32-bit floating-point multiplications into integer multiplications and shift operations, improving computational efficiency. Then, by studying the data distribution, an input quantization method and a mixed-bit weight quantization method, both based on the data standard deviation, are proposed; these effectively reduce quantization loss. In efficiency and accuracy experiments on networks such as RepVGG and EfficientNet, 8-bit quantization lowers the accuracy of large networks by only about 1%; with 8-bit input quantization and 10-bit weight quantization, the accuracy loss is below 0.2%, essentially matching the floating-point model. The experiments show that the proposed quantization method shrinks model size by about 75% while largely preserving the original accuracy, reducing power consumption and improving computational efficiency.
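The core idea above, quantizing values to signed powers of two so that a floating-point multiplication becomes an integer shift, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function names, the exponent-clipping window, and the bit-width handling are this sketch's own choices, not the paper's exact scheme.

```python
import numpy as np

def log_quantize(w, bits=8):
    """Map each weight to sign(w) * 2**round(log2(|w|)) (hypothetical sketch).

    Exponents are clipped to a window of 2**(bits-1)-1 levels below the
    largest exponent, so each weight is stored as a sign plus a small
    integer exponent instead of a 32-bit float.
    """
    sign = np.sign(w)
    mag = np.abs(w)
    # Guard log2(0); zeros keep sign 0 and therefore quantize to 0.
    exp = np.round(np.log2(np.where(mag > 0, mag, 1.0)))
    exp_max = exp.max()
    exp = np.clip(exp, exp_max - (2 ** (bits - 1) - 1), exp_max)
    return sign * np.exp2(exp)

def shift_multiply(x_int, exp):
    """Multiply an integer activation by 2**exp using only shifts."""
    if exp >= 0:
        return x_int << exp
    return x_int >> (-exp)
```

With power-of-two weights, a convolution's multiply-accumulate reduces to shift-accumulate, which maps directly onto FPGA logic without DSP multipliers.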
Yun HUANG, Fan ZHANG, Wei GUO, et al. A Quantification Method of Convolutional Neural Network Based on Data Standard Deviation[J]. Acta Electronica Sinica, 2023, 51(3): 639-647.
| Model | Bit width (w/a/b)/bit | Top-1 acc./% | Top-5 acc./% |
|---|---|---|---|
| Resnet50 | 32/32/32 | 75.202 | 92.194 |
| | 16/32/32 | 75.212 | 92.196 |
| | 16/16/16 | 74.912 | 91.91 |
| | 10/32/32 | 75.204 | 92.196 |
| | 8/32/32 | 74.62 | 91.886 |
| | 6/32/32 | 60.084 | 82.55 |
| | 8/32/8 | 74.526 | 91.866 |
| | 32/8/32 | 74.842 | 92.03 |
| | 8/8/8 | 74.242 | 91.588 |
| | 4/32/32 | 0.094 | 0.512 |
| Inception V3 | 32/32/32 | 77.978 | 93.942 |
| | 16/32/32 | 77.980 | 93.944 |
| | 16/16/16 | 77.982 | 93.944 |
| | 10/32/32 | 77.96 | 93.88 |
| | 8/32/32 | 77.052 | 93.38 |
| | 6/32/32 | 50.302 | 73.446 |
| | 8/32/8 | 76.798 | 93.25 |
| | 32/8/32 | 77.728 | 93.812 |
| | 8/8/8 | 76.584 | 93.204 |
| | 4/32/32 | 0.144 | 0.544 |

Table 1  Logarithmic quantization results for Resnet50 and Inception V3
| Model | Original acc. (Top-1/Top-5)/% | Log quantization (Top-1/Top-5)/% | Std-based input quantization (Top-1/Top-5)/% | Quantization-loss reduction (Top-1/Top-5)/% |
|---|---|---|---|---|
| Resnet50 | 75.202 / 92.194 | 74.842 / 92.03 | 75.164 / 92.038 | 82.56 / 4.88 |
| Inception V3 | 77.978 / 93.942 | 77.728 / 93.812 | 77.914 / 93.838 | 75 / 2 |

Table 2  Input quantization results based on the standard deviation
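The standard-deviation-based input quantization evaluated above might look like the following sketch: the clipping range is derived from the data's mean and standard deviation rather than its min/max, so the limited integer levels cover the dense central part of the distribution. The multiplier `k` and the uniform affine mapping are illustrative assumptions, not the paper's exact rule.

```python
import numpy as np

def std_clip_quantize(x, bits=8, k=3.0):
    """Uniformly quantize x with a clipping range of mean +/- k*sigma.

    Outliers beyond the range saturate; everything else is mapped to
    integers in [0, 2**bits - 1].
    """
    mu, sigma = x.mean(), x.std()
    lo, hi = mu - k * sigma, mu + k * sigma
    scale = (hi - lo) / (2 ** bits - 1)
    q = np.clip(np.round((x - lo) / scale), 0, 2 ** bits - 1)
    return q.astype(np.int32), scale, lo

def dequantize(q, scale, lo):
    """Recover an approximation of the original values."""
    return q * scale + lo
```

Because activations are roughly bell-shaped after batch normalization, a sigma-based range typically wastes far fewer levels on rare outliers than a min/max range does.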
| Model | Original acc. (Top-1/Top-5)/% | Log quantization (Top-1/Top-5)/% | Std-based mixed-bit quantization (Top-1/Top-5)/% |
|---|---|---|---|
| Resnet50 | 75.202 / 92.194 | 74.62 / 91.886 | 73.804* / 91.726* |
| | | | 74.866 / 92.088 |
| Inception V3 | 77.978 / 93.942 | 77.052 / 93.38 | 76.274* / 92.993* |
| | | | 77.284 / 93.566 |

Table 3  Mixed-bit weight quantization results based on the standard deviation
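The mixed-bit idea behind Table 3, spending a wider bit width on weight tensors whose spread makes them harder to quantize, could be sketched as follows. The threshold rule, the candidate bit widths, and the threshold value are assumptions for illustration, not the paper's selection criterion.

```python
import numpy as np

def pick_bitwidth(weights, base_bits=8, high_bits=10, thresh=1.0):
    """Choose a per-layer weight bit width from the weight spread.

    Layers whose weight standard deviation exceeds `thresh` get the
    wider bit width, since a larger dynamic range costs resolution
    at a fixed bit budget.
    """
    return high_bits if np.std(weights) > thresh else base_bits
```

Applied per layer, this keeps most of the model at the narrow bit width while the few wide-distribution layers get the extra bits, which is consistent with the small accuracy gap between the starred and unstarred rows above.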
| Model | Original acc. (Top-1/Top-5)/% | Std-based log quantization (Top-1/Top-5)/% | Quantization error (Top-1/Top-5)/% | Size before/MB | Size after/MB | Compression/% |
|---|---|---|---|---|---|---|
| Resnet50 | 75.202 / 92.194 | 74.406 / 91.862 | 0.796 / 0.332 | 97.84 | 24.51 | 74.95 |
| Inception V3 | 77.978 / 93.942 | 76.984 / 93.484 | 0.994 / 0.458 | 91.23 | 22.82 | 74.98 |
| VGG16 | 70.894 / 89.848 | 70.528 / 89.75 | 0.366 / 0.098 | 527.81 | 131.96 | 74.99 |
| RexNet3 | 82.63 / 96.25 | 82.09 / 96.01 | 0.54 / 0.24 | 132.95 | 33.30 | 74.95 |
| RepVGG A0 | 72.42 / 90.49 | 71.40 / 90.08 | 1.02 / 0.41 | 34.89 | 8.75 | 74.92 |

Table 4  Logarithmic quantization results based on the standard deviation (1)
| Model | Original acc. (Top-1/Top-5)/% | Std-based log quantization (Top-1/Top-5)/% | Quantization error (Top-1/Top-5)/% |
|---|---|---|---|
| Resnet50 | 75.202 / 92.194 | 75.052 / 92.1 | 0.15 / 0.094 |
| Inception V3 | 77.978 / 93.942 | 77.814 / 93.812 | 0.164 / 0.13 |
| VGG16 | 70.894 / 89.848 | 70.742 / 89.778 | 0.152 / 0.07 |
| RexNet3 | 82.63 / 96.25 | 82.49 / 96.16 | 0.14 / 0.11 |
| RepVGG A0 | 72.42 / 90.49 | 72.20 / 90.34 | 0.22 / 0.15 |

Table 5  Logarithmic quantization results based on the standard deviation (2)
| Model | Weight bits/bit | Original acc. (Top-1/Top-5)/% | Std-based log quantization (Top-1/Top-5)/% | Quantization error (Top-1/Top-5)/% | Size before/MB | Size after/MB | Compression/% |
|---|---|---|---|---|---|---|---|
| MobileNet v2 | 8 | 70.126 / 89.532 | 0.266 / 1.184 | / | 13.6 | / | / |
| | 12 | | 68.298 / 88.44 | 1.828 / 1.092 | | / | / |
| | 16 | | 69.84 / 89.31 | 0.286 / 0.222 | | 6.74 | 50.44 |
| EfficientNet-b0 | 8 | 76.63 / 93.025 | 24.854 / 44.864 | / | 20.3 | 8.99 | 55.714 |
| | 12 | | 76.054 / 92.83 | 0.576 / 0.195 | | / | / |
| | 16 | | 76.302 / 92.882 | 0.328 / 0.143 | | 12.76 | 37.143 |

Table 6  Quantization results for lightweight networks