Information Engineering University, Zhengzhou 450001, Henan, China
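FU Qiu-xing, male, born in August 1999 in Shangqiu, Henan Province. He is currently a Ph.D. candidate in cyberspace security at Information Engineering University. His main research interest is fully homomorphic encryption processor design. E-mail: 1415505333@qq.com
LI Wei, male, born in November 1983 in Tianjin. He is currently a professor at Information Engineering University. His main research interests are computer architecture, security chip design, and integrated circuit technology. E-mail: try_1118@163.com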
Received: 2024-08-28
Revised: 2025-03-10
Published in print: 2025-04-25
FU Qiu-xing, LI Wei, BIE Meng-ni, et al. Research on Reconfigurable NTT Arithmetic Unit and Efficient Scheduling Algorithm for Lattice Post-Quantum Cryptography[J]. Acta Electronica Sinica, 2025, 53(04): 1182-1191. DOI:10.12263/DZXB.20240788
To further speed up polynomial multiplication in lattice-based post-quantum cryptography (PQC), and because polynomial multiplication parameters differ across lattice-based schemes, this paper proposes a high-speed reconfigurable number theoretic transform (NTT) arithmetic unit, together with data scheduling schemes that resolve timing conflicts and spatial conflicts. First, the operational characteristics of the NTT in different lattice-based PQC algorithms are analyzed, and a 4×4 reconfigurable arithmetic unit is proposed that supports radix-2/3/4 NTT operations at different bit widths. Second, building on this hardware, a data scheduling scheme for the radix-4 NTT is proposed to resolve timing conflicts in a highly parallel, multi-stage pipelined design. Finally, a multi-bank data storage scheme based on the m-coloring algorithm is proposed to resolve data access conflicts. Experimental results show that the proposed hardware implements the radix-2/3/4 NTT and its inverse, supports multiple lattice-based PQC algorithms including Kyber and Dilithium, and provides a maximum parallelism of 4. To further verify the design, it was implemented on a Xilinx Virtex-7 device, where it runs at 169 MHz, completes an NTT within 0.40 μs, and reduces the area-time product (ATP) by about 42%. Synthesized in a 40 nm CMOS process, it reduces the AT product by 18% to 90% compared with existing designs.
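The multi-bank storage idea in the abstract can be illustrated with a minimal sketch. It is not the paper's scheme: it assumes a plain in-place radix-4 access pattern on n = 16 coefficients, treats every group of four indices read by one butterfly as mutually conflicting, and uses a textbook backtracking m-coloring to assign each index to one of m = 4 banks. The function names `radix4_groups`, `build_conflicts`, and `m_coloring` are hypothetical.

```python
from itertools import combinations


def radix4_groups(n):
    """Index groups read together by radix-4 butterflies, over all stages.

    Assumes n is a power of 4 and a simple strided access pattern
    (stride n/4, n/16, ..., 1); a real design may order stages differently.
    """
    groups = []
    stride = n // 4
    while stride >= 1:
        block = 4 * stride
        for base in range(0, n, block):
            for offset in range(stride):
                groups.append([base + offset + k * stride for k in range(4)])
        stride //= 4
    return groups


def build_conflicts(n, groups):
    """Conflict graph: two indices are adjacent if some butterfly reads both."""
    adj = [set() for _ in range(n)]
    for group in groups:
        for a, b in combinations(group, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def m_coloring(adj, m):
    """Backtracking m-coloring; returns a bank per index, or None if impossible."""
    colors = [-1] * len(adj)

    def place(v):
        if v == len(adj):
            return True
        for c in range(m):
            # A bank is usable if no conflicting index already occupies it.
            if all(colors[u] != c for u in adj[v]):
                colors[v] = c
                if place(v + 1):
                    return True
        colors[v] = -1  # undo before backtracking
        return False

    return colors if place(0) else None


if __name__ == "__main__":
    n, m = 16, 4
    groups = radix4_groups(n)
    banks = m_coloring(build_conflicts(n, groups), m)
    print("bank per index:", banks)
    # Every butterfly must touch four distinct banks (no access conflict).
    print("conflict-free:", all(len({banks[i] for i in g}) == 4 for g in groups))
```

With four banks, each butterfly in this toy pattern can fetch its four operands in a single cycle; with fewer banks no valid assignment exists, since each butterfly group forms a 4-clique in the conflict graph.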