Acta Electronica Sinica ›› 2018, Vol. 46 ›› Issue (8): 1960-1968. DOI: 10.3969/j.issn.0372-2112.2018.08.022

• Research Article •

IBPU: A Bit Permutation Functional Unit for General-Purpose Processors

MA Chao1, NAN Long-mei2, PAN Da-shan1, LI Wei3, DAI Zi-bin3

  1. National High Performance Integrated Circuit Design Center, Shanghai 201204, China;
  2. State Key Lab of ASIC and System, Fudan University, Shanghai 200433, China;
  3. Information Engineering University, Zhengzhou, Henan 450000, China
  • Received: 2016-07-21 Revised: 2017-11-14 Online: 2018-08-25 Published: 2018-08-25
  • Corresponding author: NAN Long-mei
  • About the author: MA Chao, male, born in 1988 in Xi'an, Shaanxi. Ph.D., National High Performance Integrated Circuit Design Center (Shanghai). His research interests include application-specific processor design, multistage dynamic interconnection networks, and ASIC design.
  • Supported by: National Natural Science Foundation of China (No.61404175)

Abstract: This paper proposes a new routing algorithm for arbitrary bit permutations. The algorithm exploits the self-routing property of the Inverse Butterfly network topology, combines it with a divide-and-conquer strategy, and is suited to high-speed hardware implementation. With it, any N-bit static (fixed) permutation completes in O(lg N) instructions, and any N-bit dynamic permutation completes in O(lg²N) instructions. On this basis, a new bit permutation unit, the Permutation Unit based on Inverse Butterfly (IBPU), is constructed and synthesized in the SMIC 65 nm process. The results show that, compared with previous designs, the IBPU reduces resource consumption by about 32% and latency by nearly 30%. For static permutations, the cost of the functional unit is the lowest, no more than 60% of that of previous designs. For dynamic permutations, the cost is higher, but it grows only slowly as the permutation width N increases, so the unit is more stable and its overall performance advantage is clear.

Key words: Inverse Butterfly network, divide and conquer, permutation routing algorithm, hardware implementation
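
For readers unfamiliar with the network named in the abstract, the following is a minimal software sketch of how an Inverse Butterfly network moves bits once its control bits are known. It is not the paper's routing algorithm or the IBPU hardware: the function names `ibfly_stage` and `inverse_butterfly` and the mask layout (one integer mask per stage, with a set bit at the lower position of each pair to be swapped) are assumptions made for illustration. The sketch assumes the usual stage order for an inverse butterfly, with pair distances 1, 2, 4, ..., N/2, giving lg N stages.

```python
def ibfly_stage(x, mask, d):
    """Swap bit i with bit i + d of x wherever bit i of mask is 1.

    Assumed mask layout: bits may be set only at positions i with
    (i & d) == 0, the lower position of each switch pair.
    """
    t = ((x >> d) ^ x) & mask      # 1 where a selected pair holds different bits
    return x ^ t ^ (t << d)        # flip both bits of each such pair


def inverse_butterfly(x, masks, n):
    """Route the n-bit word x through lg(n) stages with distances 1, 2, ..., n/2."""
    stages = n.bit_length() - 1
    if (1 << stages) != n or len(masks) != stages:
        raise ValueError("n must be a power of two with one mask per stage")
    for s, mask in enumerate(masks):
        x = ibfly_stage(x, mask, 1 << s)
    return x


# Setting every switch to "swap" sends bit i to position i ^ (n - 1),
# which mirrors the 8-bit word.
print(bin(inverse_butterfly(0b11010010, [0x55, 0x33, 0x0F], 8)))  # 0b1001011
```

Each stage is a constant number of word-level operations, so one pass through the network costs lg N stage steps. How the control masks are derived for a given permutation is the subject of the paper's routing algorithm and is not shown here.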

CLC number: