FU Qiu-xing, LI Wei, BIE Meng-ni, et al. Research on Reconfigurable NTT Arithmetic Unit and Efficient Scheduling Algorithm for Lattice Post-Quantum Cryptography[J]. Acta Electronica Sinica, 2025, 53(04): 1182-1191. DOI:10.12263/DZXB.20240788
Research on Reconfigurable NTT Arithmetic Unit and Efficient Scheduling Algorithm for Lattice Post-Quantum Cryptography
To further increase the speed of polynomial multiplication in lattice-based post-quantum cryptography, and to accommodate the different polynomial multiplication parameters used by different lattice-based schemes, this paper proposes a high-speed reconfigurable number theoretic transform (NTT) arithmetic unit, together with a data scheduling scheme that resolves timing conflicts and memory access conflicts. First, we analyze the operational characteristics of the NTT in different lattice-based post-quantum cryptographic algorithms and propose a 4×4 reconfigurable arithmetic unit that supports radix-2/3/4 NTT operations at different bit widths. Second, based on this hardware design, we propose a data scheduling scheme built on the radix-4 NTT algorithm to resolve timing conflicts in the highly parallel, multi-stage pipelined design. Finally, we propose a multi-bank data storage scheme based on an m-coloring algorithm to resolve data access conflicts. Experimental results show that the proposed hardware implements radix-2/3/4 NTT and the corresponding inverse transforms, and supports a variety of lattice-based post-quantum cryptographic algorithms, including Kyber and Dilithium. The maximum degree of parallelism supported by the hardware is 4. To further evaluate the design, it was implemented on a Xilinx Virtex-7 device, where it reaches an operating frequency of up to 169 MHz, completes an NTT within 0.40 μs, and reduces the area-time product (ATP) by about 42%. Synthesized in a 40 nm CMOS process, the design reduces the ATP by 18% to 90% compared with existing designs.
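The idea of using m-coloring for conflict-free multi-bank storage can be illustrated with a small sketch. It is not the scheme from the paper; it only assumes that operands read by the same radix-4 butterfly must sit in different memory banks, models that constraint as a conflict graph, and assigns banks with the textbook backtracking m-coloring procedure. The function names (`radix4_groups`, `build_conflict_graph`, `m_coloring`) and the 16-point transform size are hypothetical choices made for illustration.

```python
from itertools import combinations


def radix4_groups(n):
    """List the 4-address groups read together by each radix-4 butterfly.

    Assumes n is a power of 4 and an in-place transform whose stride
    shrinks by a factor of 4 per stage (an illustrative access pattern).
    """
    groups = []
    stride = n // 4
    while stride >= 1:
        block = 4 * stride
        for base in range(0, n, block):
            for offset in range(stride):
                start = base + offset
                groups.append(tuple(start + k * stride for k in range(4)))
        stride //= 4
    return groups


def build_conflict_graph(n, groups):
    """Connect two addresses if some butterfly reads both in the same cycle."""
    adj = [set() for _ in range(n)]
    for group in groups:
        for a, b in combinations(group, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def m_coloring(adj, m):
    """Backtracking m-coloring: return one bank index per address, or None."""
    n = len(adj)
    colors = [-1] * n

    def solve(v):
        if v == n:
            return True
        used = {colors[u] for u in adj[v] if colors[u] != -1}
        for c in range(m):
            if c not in used:
                colors[v] = c
                if solve(v + 1):
                    return True
        colors[v] = -1  # undo and let the caller try another color
        return False

    return colors if solve(0) else None


if __name__ == "__main__":
    n, banks = 16, 4  # hypothetical sizes: 16-point transform, 4 banks
    groups = radix4_groups(n)
    assignment = m_coloring(build_conflict_graph(n, groups), banks)
    for group in groups:
        print(group, "-> banks", [assignment[a] for a in group])
```

With four banks, every butterfly in this toy model reads its four operands from four distinct banks, so all of them can be fetched in one cycle. With fewer than four banks no valid assignment exists, because each butterfly group forms a clique of size 4 in the conflict graph.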