[1] Xiao C,Casseau E,Wang S,et al.Automatic custom instruction identification for application-specific instruction set processors[J].Microprocessors & Microsystems,2014,38(8):1012-1024.
[2] Galuzzi C,Bertels K.The instruction-set extension problem:a survey[J].ACM Transactions on Reconfigurable Technology & Systems,2011,4(2):1-28.
[3] P Faraboschi,et al.Lx:A technology platform for customizable VLIW embedded processing[A].Proceedings of International Symposium on Computer Architecture[C].2000.203-212.
[4] R E Gonzalez.Xtensa:A configurable and extensible processor[J].IEEE Micro,2000,20(2):60-70.
[5] Z A Ye,et al.Chimaera:A high-performance architecture with a tightly-coupled reconfigurable functional unit[A].2000 International Symposium on Computer Architecture[C].Vancouver:IEEE,2000.225-234.
[6] Chen Hu,Chen Shu-ming,Chen Sheng-gang,et al.GISEES:automatic generation of instruction-set extensions for embedded systems[J].Acta Electronica Sinica,2011,39(9):2026-2033.(in Chinese)
[7] Sartor J B,Eeckhout L.MInGLE:An efficient framework for domain acceleration using low-power specialized functional units[J].ACM Transactions on Architecture & Code Optimization,2016,13(2):17.
[8] Wang C,Li X,Zhang H,et al.Hot spots profiling and dataflow analysis in custom dataflow computing SoftProcessors[J].Journal of Systems and Software,2017,125:427-438.
[9] Clark N,Hormati A,Mahlke S,et al.Scalable subgraph mapping for acyclic computation accelerators [A].2006 International Conference on Compilers,Architecture and Synthesis for Embedded Systems [C].Seoul:ACM,2006.147-157.
[10] Cong J,Fan Y,Han G,et al.Application-specific instruction generation for configurable processor architectures[A].2004 ACM International Symposium on Field-Programmable Gate Arrays[C].Monterey:ACM,2004.183-189.
[11] Galuzzi C,Bertels K.The instruction-set extension problem:a survey[A].2008 International Workshop on Reconfigurable Computing:Architectures,Tools and Applications[C].London:Springer,2008.209-220.
[12] Alippi C,Fornaciari W,Pozzi L,et al.A DAG-based design approach for reconfigurable VLIW processors[A].1999 Conference on Design,Automation and Test in Europe[C].Munich:ACM,1999.57.
[13] Galuzzi C,Bertels K,Vassiliadis S.A linear complexity algorithm for the generation of multiple input single output instructions of variable size[A].2007 International Workshop on Embedded Computer Systems[C].Berlin:Springer,2007.283-293.
[14] Galuzzi C.Automatic selection of application-specific instruction-set extensions[A].2007 International Conference on Hardware/software Codesign and System Synthesis[C].Salzburg:IEEE,2007.160-165.
[15] Pozzi L,Atasu K,Ienne P.Exact and approximate algorithms for the extension of embedded processor instruction sets[J].IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2006,25(7):1209-1229.
[16] Xiao C,Casseau E.Exact custom instruction enumeration for extensible processors[J].Integration,the VLSI journal,2012,45(3):263-270.
[17] Chen X,Maskell D L,Sun Y.Fast identification of custom instructions for extensible processors[J].IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2007,26(2):359-368.
[18] Balister P,Gerke S,Gutin G,et al.Algorithms for generating convex sets in acyclic digraphs[J].Journal of Discrete Algorithms,2009,7(4):509-518.
[19] Wang S,Xiao C,Liu W.A faster algorithm for enumerating connected convex subgraphs in acyclic digraphs[J].IEEE Embedded Systems Letters,2017,9(1):9-12.
[20] Atasu K,Luk W,Mencer O,et al.FISH:Fast instruction synthesis for custom processors[J].IEEE Transactions on Very Large Scale Integration Systems,2011,20(1):52-65.
[21] Giaquinta E,Mishra A,Pozzi L.Maximum convex subgraphs under I/O constraint for automatic identification of custom instructions[J].IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2015,34(3):483-494.
[22] Huynh H P,Sim J E,Mitra T.An efficient framework for dynamic reconfiguration of instruction-set customization[J].Design Automation for Embedded Systems,2009,13(1-2):91-113.
[23] Yu P,Mitra T.Scalable custom instructions identification for instruction-set extensible processors[A].2004 International Conference on Compilers,Architecture,and Synthesis for Embedded Systems[C].Washington:ACM,2004.69-78.
[24] Bo Shi,Ge Ning,Lin Xiao-kang.Efficient algorithm for convex connected subgraph enumeration[J].Journal of Software,2010,21(12):3106-3115.(in Chinese)
[25] Yu P,Mitra T.Disjoint pattern enumeration for custom instructions identification[A].2007 International Conference on Field Programmable Logic and Applications[C].Amsterdam:IEEE,2007.273-278.
[26] Xiao C,Casseau E.Efficient custom instruction enumeration for extensible processors[A].2011 IEEE International Conference on Application-Specific Systems,Architectures and Processors[C].Santa Monica:IEEE,2011.211-214.
[27] Prakash A,Clarke C T,Lam S K,et al.Rapid memory-aware selection of hardware accelerators in programmable SoC design[J].IEEE Transactions on Very Large Scale Integration(VLSI)Systems,2017.1-12.
[28] Jordans R,Jóźwiak L,Corporaal H,et al.Automatic instruction-set architecture synthesis for VLIW processor cores in the ASAM project[J].Microprocessors and Microsystems,2017,51:114-133.
[29] Ahn J,Choi K.Isomorphism-aware identification of custom instructions with I/O serialization[J].IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2013,32(1):34-46.
[30] Wang S,Xiao C,Liu W,et al.Selecting most profitable instruction-set extensions using ant colony heuristic[A].2015 Conference on Design and Architectures for Signal and Image Processing[C].Krakow:IEEE,2015.1-7.
[31] Kamal M,Afzali-Kusha A,Safari S,et al.Yield and speedup improvements in extensible processors by allocating extra cycles to some custom instructions[J].ACM Transactions on Design Automation of Electronic Systems,2016,21(2):28.
[32] Xiao C,Wang S,Liu W,et al.Parallel custom instruction identification for extensible processors[J].Journal of Systems Architecture,2017,76(C):149-159.
[33] Severance A,Edwards J,Omidian H,et al.Soft vector processors with streaming pipelines[A].2014 ACM International Symposium on Field-programmable Gate Arrays[C].Monterey:ACM,2014.117-126.
[34] Ham T J,Wu L,Sundaram N,et al.Graphicionado:A high-performance and energy-efficient accelerator for graph analytics[A].2016 International Symposium on Microarchitecture[C].Taipei:IEEE,2016.1-13.
[35] Kulkarni A,Page A,Attaran N,et al.An energy-efficient programmable manycore accelerator for personalized biomedical applications[J].IEEE Transactions on Very Large Scale Integration(VLSI)Systems,2018,26(1):96-109.
[36] González-Álvarez C,Sartor J B,Alvarez C,et al.Automatic design of domain-specific instructions for low-power processors[A].2015 International Conference on Application-specific Systems,Architectures and Processors[C].Toronto:IEEE,2015.1-8.
[37] Mitra T.Handbook of Hardware/Software Codesign[M].Berlin:Springer,2017.377-409.
[38] Xiao C,Casseau E.Improving high-level synthesis effectiveness through custom operator identification[A].2014 IEEE International Symposium on Circuits and Systems[C].Melbourne:IEEE,2014.161-164.
[39] Martin K,Wolinski C,Kuchcinski K,et al.Constraint-driven instructions selection and application scheduling in the DURASE system[A].2009 IEEE International Conference on Application-Specific Systems,Architectures and Processors[C].Boston:IEEE,2009.145-152.
[40] Kastner R,Kaplan A,Memik S O,et al.Instruction generation for hybrid reconfigurable systems[J].ACM Transactions on Design Automation of Electronic Systems(TODAES),2002,7(4):605-627.
[41] Guo Y,Smit G J M,Broersma H,et al.A graph covering algorithm for a coarse grain reconfigurable system[J].ACM SIGPLAN Notices,2003,38(7):199-208.
[42] Kamal M,Afzali-Kusha A,Safari S,et al.OPLE:A heuristic custom instruction selection algorithm based on partitioning and local exploration of application dataflow graphs[J].ACM Transactions on Embedded Computing Systems,2015,14(4):1-23.
[43] Bozorgzadeh E,Memik S O,Kastner R,et al.Pattern selection:customized block allocation for domain-specific programmable systems[A].2002 International Conference on Engineering of Reconfigurable Systems and Algorithms[C].Las Vegas:IEEE,2002.190-196.
[44] Wang S,Xiao C,Liu W,et al.A comparison of heuristic algorithms for custom instruction selection[J].Microprocessors & Microsystems,2016,45:176-186.
[45] Mishra A,Agarwal M,Asati A R,et al.Using graph isomorphism for mapping of data flow applications on reconfigurable computing systems[J].Microprocessors and Microsystems,2017,51:343-355.
[46] Bukchin Y,Raviv T.Constraint programming for solving various assembly line balancing problems[J].Omega,2017(7):1-32.
[47] Cordella L P,Foggia P,Sansone C,et al.A (sub)graph isomorphism algorithm for matching large graphs[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2004,26(10):1367-1372.
[48] Arslan M A,Kuchcinski K.Instruction selection and scheduling for DSP kernels on custom architectures[A].2013 Euromicro Conference on Digital System Design[C].Los Alamitos:IEEE,2013.821-828.
[49] Sisto A,Pilato L,Serventi R,et al.Application specific instruction set processor for sensor conditioning in automotive applications[J].Microprocessors and Microsystems,2016,47:375-384.
[50] A S Eissa,et al.SHA-3 instruction set extension for a 32-bit RISC processor architecture[A].2016 IEEE International Conference on Application-Specific Systems,Architectures and Processors[C].London:IEEE,2016.233-234.
[51] Zhu Fang.Research on key technology of mobile video surveillance on MPSoC[D].Nanjing:Southeast University,2016.1-124.(in Chinese)
[52] Mori J Y,Kautz F,Hübner M.Efficient camera input system and memory partition for a vision soft-processor[A].2016 International Symposium on Applied Reconfigurable Computing[C].Rio de Janeiro:Springer,2016.328-333.
[53] Rakanovic D,Struharik R.Implementation of application specific instruction-set processor for the artificial neural network acceleration using LISA ADL[A].2017 East-West Design & Test Symposium[C].Novi Sad:IEEE,2017.1-6.
[54] Edwards J,Lemieux G G F.Real-time object detection in software with custom vector instructions and algorithm changes[A].2017 IEEE International Conference on Application-Specific Systems,Architectures and Processors[C].Seattle:IEEE,2017.75-82.
[55] Rawat H K,Schaumont P.SIMD instruction set extensions for keccak with applications to SHA-3,Keyak and Ketje[A].2016 Hardware and Architectural Support for Security and Privacy[C].Seoul:ACM,2016.1-7.
[56] Youssef N B H,Youssef W E H,Machhout M,et al.Instruction set extensions of AES algorithms for 32-bit processors[A].2014 International Carnahan Conference on Security Technology[C].Rome:IEEE,2014.1-5.
[57] Hu Jin-jiang,Dou Yong,Ni Shi-ce,et al.A way using common subgraph to customise instruction for encryption algorithms[J].Journal of Computer Research and Development,2012,49(s1):299-304.(in Chinese)
[58] Xia Hui,Yu Jia,Qin Yao,et al.The researches on the ASIP of ECC in embedded domain[J].Chinese Journal of Computers,2017,40(5):1092-1108.(in Chinese)
[59] D Wu,et al.Cloud-based design and manufacturing:A new paradigm in digital manufacturing and design innovation[J].Computer-Aided Design,2015,59:1-14.
[60] Wang S,Xiao C,Liu W.Parallel enumeration of custom instructions based on multi-depth graph partitioning[J].IEEE Embedded Systems Letters,2019,11(1):1-4.
[61] Ananthanarayana T,Lopez S,Lukowiak M.Power analysis of HLS-designed customized instruction set architectures[A].2017 IEEE International Parallel and Distributed Processing Symposium Workshops[C].Orlando:IEEE,2017.207-212.
[62] C González-Álvarez,J B Sartor,C Álvarez,D Jiménez-González,L Eeckhout.Accelerating an application domain with specialized functional units[J].ACM Trans.Archit.Code Optim,2013,10(4):1-25.
[63] C González-Álvarez,et al.MInGLE:An efficient framework for domain acceleration using low-power specialized functional units[J].ACM Trans.on Architecture and Code Optimization,2016,13(2):1-26.
[64] M Karunarathna,Y Tian,C Fidge.Domain-specific application analysis for customized instruction identification[J].Elsevier Microprocessors and Microsystems,2014,38(7):637-648.
[65] A Pulli,C Galuzzi,G Gaydadjiev.Towards domain-specific instruction-set generation[A].2014 International Conference on Field Programmable Logic and Applications[C].Munich:IEEE,2014.1-4.
[66] Wang C,Gong L,Yu Q,et al.DLAU:A scalable deep learning accelerator unit on FPGA[J].IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2016,36(3):513-517.