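1. National High Performance IC (Shanghai) Design Center, Shanghai 201204
2. State Key Laboratory of Integrated Circuits, Fudan University, Shanghai 200433
3. Information Engineering University, Zhengzhou, Henan 450000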
Print publication: 2018
MA Chao, NAN Long-mei, PAN Da-shan, et al. IBPU: A Bit Permutation Functional Unit for General-Purpose Processors[J]. Acta Electronica Sinica, 2018, 46(8): 1960-1968. DOI: 10.3969/j.issn.0372-2112.2018.08.022.
This paper exploits the self-routing property of the Inverse Butterfly network topology, combined with a divide-and-conquer strategy, to propose a routing algorithm for arbitrary bit permutations that is suited to high-speed hardware implementation. With this algorithm, any N-bit static (fixed) permutation completes in $O(\lg N)$ instructions, and any N-bit dynamic permutation completes in $O(\lg^2 N)$ instructions. On this basis, a new bit-permutation unit, the Permutation Unit based on Inverse Butterfly (IBPU), is constructed and synthesized in the SMIC 65 nm process. The results show that, compared with previous designs, the IBPU reduces resource consumption by about 32% and latency by nearly 30%. For static permutations, the cost of the functional unit is the lowest, no more than 60% of that of previous designs. For dynamic permutations the cost is higher, but it grows only slowly as the permutation width N increases, so the design remains stable across widths and shows a clear overall performance advantage.
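The abstract does not spell out the routing algorithm, so the following is only a minimal sketch of the structure it builds on: a software model of an N-bit inverse butterfly network. It assumes the common convention in which the network has lg N stages that conditionally swap positions at distances 1, 2, 4, ..., N/2, with one control bit per pair. The names `inverse_butterfly` and `controls` are hypothetical and chosen for illustration; they are not taken from the paper, and the code does not compute the control bits for a given target permutation.

```python
def inverse_butterfly(bits, controls):
    """Pass `bits` through a software model of an inverse butterfly network.

    bits     : list of length N, where N is a power of two
    controls : list of lg N stages; each stage is a list of N/2 booleans,
               one per pair, where True means "swap the pair"
    (Both names are illustrative assumptions, not the paper's notation.)
    """
    n = len(bits)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    out = list(bits)
    distance = 1
    stage = 0
    while distance < n:
        pair = 0
        for i in range(n):
            # Position i is the lower element of a pair when bit `distance`
            # of its index is clear; its partner sits at i + distance.
            if i & distance == 0:
                if controls[stage][pair]:
                    out[i], out[i + distance] = out[i + distance], out[i]
                pair += 1
        distance *= 2
        stage += 1
    return out


# Usage: with every switch set to "swap", index i is sent to i XOR (N - 1),
# which mirrors the whole word.
word = list("abcdefgh")
all_swap = [[True] * 4 for _ in range(3)]
print("".join(inverse_butterfly(word, all_swap)))  # prints "hgfedcba"
```

In this model an 8-bit word needs 3 stages of 4 control bits each, and in general (N/2) lg N control bits configure one pass. Choosing those bits for an arbitrary permutation is the routing problem the paper addresses.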