Network Monitor Based Dynamic Fault-Tolerant Routing for Application-Specific Network on Chip
GE Fen1, WU Ning1, QIN Xiao-lin2, ZHANG Ying1, ZHOU Fang1
1. College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China; 2. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China
Abstract: To address global communication management and reliability design in application-specific Networks on Chip (NoCs), this paper introduces the concept of a network monitor, which collects real-time status information for the whole network and runs the path allocation algorithm. Built on this monitor, a dynamic routing scheme called DyRS-NM is presented. The scheme detects and locates congested and faulty links in the NoC and distinguishes transient link faults from permanent ones. It handles transient faults by retransmission and detours around congested and permanently faulty links by recalculating routing paths. The network monitor and the fault-tolerant router that communicates with it are designed and implemented at the RTL level, and an MPEG4 decoder application is mapped onto a network monitor based 4×4 mesh NoC architecture to evaluate system performance as well as area and power overhead. Compared with static XY routing and the fault-aware dynamic routing scheme FADR, DyRS-NM achieves better performance at an acceptable additional cost.
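The behaviour summarised in the abstract (retransmit on a transient fault, recompute the path around a permanent fault) can be illustrated with a minimal Python sketch. It is not the authors' RTL design or their path allocation algorithm: the `Monitor` class, the `MAX_RETRIES` threshold, the rule that repeated failures mark a link as permanently faulty, and the breadth-first search in `find_path` are all assumptions chosen for illustration. Only the 4×4 mesh size comes from the abstract.

```python
from collections import deque

MESH = 4          # 4x4 mesh, as in the abstract
MAX_RETRIES = 2   # hypothetical threshold, not taken from the paper


def neighbors(node):
    """Yield the mesh neighbours of a node given as (x, y)."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < MESH and 0 <= ny < MESH:
            yield (nx, ny)


def link(a, b):
    """Identify an undirected link by its two endpoints."""
    return frozenset((a, b))


def find_path(src, dst, blocked):
    """Breadth-first search for a shortest path that avoids blocked links.

    BFS is a stand-in for the paper's path allocation algorithm.
    Returns a list of nodes, or None if dst is unreachable.
    """
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in neighbors(node):
            if nxt not in prev and link(node, nxt) not in blocked:
                prev[nxt] = node
                queue.append(nxt)
    return None


class Monitor:
    """Hypothetical global monitor that tracks per-link failures."""

    def __init__(self):
        self.failures = {}    # link -> consecutive failed transfers
        self.blocked = set()  # links treated as permanently faulty

    def report(self, a, b, ok):
        """Record one transfer result; return 'ok', 'retransmit' or 'reroute'."""
        key = link(a, b)
        if ok:
            self.failures.pop(key, None)  # a success clears the fault history
            return "ok"
        count = self.failures.get(key, 0) + 1
        self.failures[key] = count
        if count > MAX_RETRIES:
            self.blocked.add(key)         # assumed rule: persistent means permanent
            return "reroute"
        return "retransmit"
```

In this sketch a router would call `monitor.report(a, b, ok)` after each transfer and, on `"reroute"`, request a new path with `find_path(src, dst, monitor.blocked)`. Congested links could be added to the same `blocked` set, although the abstract does not say how DyRS-NM measures congestion.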