Acta Electronica Sinica ›› 2012, Vol. 40 ›› Issue (2): 384-388. DOI: 10.3969/j.issn.0372-2112.2012.02.028

• Research Communications •

可重构硬件芯片级故障定位与自主修复方法

郝国锋, 王友仁, 张砦, 袁鹏, 孔德明   

  1. 南京航空航天大学自动化学院,江苏南京 210016
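  • Received: 2011-02-21 Revised: 2011-07-29 Published: 2012-02-25
    • Supported by:
    • National Natural Science Foundation of China (No.60871009); Aeronautical Science Foundation of China (No.2009ZD52045); Special Research Project of the Fundamental Research Funds of Nanjing University of Aeronautics and Astronautics (No.NS2010086)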

In-Chip Fault Localization and Self-Repairing Method for Reconfigurable Hardware

HAO Guo-feng, WANG You-ren, ZHANG Zhai, YUAN Peng, KONG De-ming   

  1. College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China
  • Received: 2011-02-21 Revised: 2011-07-29 Online: 2012-02-25 Published: 2012-02-25

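Abstract (translated from the Chinese): Fault-tolerant reconfigurable hardware systems under external centralized control have complex reconfiguration control algorithms and a large reconfiguration time overhead, and they suffer from a single point of failure. This paper studies chip-level distributed online autonomous fault-tolerance techniques, proposes a new reconfigurable hardware cell-array architecture capable of chip-level self-repair, and describes online fault localization and autonomous repair methods for the interconnect resources. A functional cell circuit and a fault-tolerant switch block circuit are designed. A segmented localization method detects multiplexer faults and open-circuit line faults in the interconnect resources, and multiplexer faults and line faults are repaired by reconfigured routing and line-shift operations, respectively. A 4-bit serial-parallel multiplier circuit is used for experimental verification, and the hardware and time overheads of the fault-tolerant design are analyzed. The experimental results show that the new scheme has a short fault-tolerance time and high resource utilization.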

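Keywords (translated from the Chinese): reconfigurable hardware, chip-level fault tolerance, distributed control, fault localization, autonomous repair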

Abstract: Fault-tolerant Reconfigurable Hardware (RH) systems with a centralized controller suffer from complex reconfiguration algorithms and long reconfiguration times, and the whole system fails when the controller fails. To realize online distributed fault tolerance at the chip level, a new RH cell-array architecture capable of in-chip self-repair is proposed. The methods for fault localization and self-repair of the interconnection circuits between electronic cells in the RH are described in detail. An electronic cell circuit and a Fault-Tolerant Switch Block (FT-SB) are designed. Fault tolerance for the interconnection circuits proceeds in two stages. First, the MUXs in the FT-SBs and the connection lines between FT-SBs in the faulty channels are tested to locate the fault; then re-routing and line-shift operations are used to repair the faulty MUXs and the broken lines, respectively. A 4-bit serial-parallel multiplier is implemented and simulated as a test case. Analysis of the fault-tolerance time and hardware resource consumption shows that the fault-tolerant performance of the interconnection circuits in the new RH is greatly improved.

Key words: reconfigurable hardware, in-chip fault tolerance, distributed control, fault localization, self-repairing
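
The two-stage idea in the abstract (first locate the broken line segment, then shift signals onto a spare line) can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the authors' circuit: the abstract does not give the channel layout, so the list-of-segments representation, the single spare line at the end of the channel, and the names `signal_arrives`, `locate_open_segment`, and `line_shift` are all hypothetical.

```python
from typing import List, Optional


def signal_arrives(segments_ok: List[bool], upto: int) -> bool:
    """Toy probe: a test signal observed after segment `upto` arrives
    only if every segment from 0 through `upto` is intact."""
    return all(segments_ok[: upto + 1])


def locate_open_segment(segments_ok: List[bool]) -> Optional[int]:
    """Probe the line one segment at a time and return the index of the
    first segment at which the test signal is lost (None if healthy)."""
    for i in range(len(segments_ok)):
        if not signal_arrives(segments_ok, i):
            return i
    return None


def line_shift(assignment: List[Optional[str]], faulty_line: int) -> List[Optional[str]]:
    """Move every signal on or after `faulty_line` one line toward the
    spare (assumed to be the last line), leaving the faulty line unused."""
    if assignment[-1] is not None:
        raise ValueError("no spare line available in this channel")
    repaired = list(assignment)
    for i in range(len(repaired) - 1, faulty_line, -1):
        repaired[i] = repaired[i - 1]
    repaired[faulty_line] = None
    return repaired


# Hypothetical channel: 4 lines of 3 segments each; line 1 is open at segment 1.
channel = [
    [True, True, True],
    [True, False, True],
    [True, True, True],
    [True, True, True],
]
assignment = ["a", "b", "c", None]  # last line is the spare

faults = [locate_open_segment(line) for line in channel]
faulty_line = next(i for i, f in enumerate(faults) if f is not None)
repaired = line_shift(assignment, faulty_line)
```

In this model the repair is purely local to the channel: only the signals at or beyond the faulty line move, which is in the spirit of the distributed, controller-free repair the abstract describes. MUX faults, which the abstract says are repaired by re-routing instead, are not modeled here.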

CLC Number: