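1. College of Computer Science and Technology, National University of Defense Technology, Changsha, Hunan 410073, China
2. Key Laboratory of Advanced Microprocessor Chips and Systems, Changsha, Hunan 410073, China
HUANG Li-bo: male, born in 1983 in Shaoyang, Hunan Province. He is a research fellow at the College of Computer Science and Technology, National University of Defense Technology. His main research interest is computer architecture. E-mail: libohuang@nudt.edu.cn
YANG Ling (corresponding author): male, born in 1996 in Leibo County, Sichuan Province. He is a Ph.D. candidate at the College of Computer Science and Technology, National University of Defense Technology. His main research interest is computer architecture. E-mail: yanglingnudt@nudt.edu.cn
YANG Qian-ming: male, born in 1984. He is an assistant research fellow at the College of Computer Science and Technology, National University of Defense Technology. His main research interests are computer microarchitecture and on-chip memory system design. E-mail: yqm21249@nudt.edu.cn
MA Sheng: male, born in 1986 in Yongzhou, Hunan Province. He is an associate research fellow at the College of Computer Science and Technology, National University of Defense Technology. His main research interest is computer architecture. E-mail: masheng@nudt.edu.cn
WANG Yong-wen: male, born in 1977 in Tai'an, Shandong Province. He is a research fellow at the College of Computer Science and Technology, National University of Defense Technology. His main research interest is microprocessor architecture. E-mail: yongwen@nudt.edu.cn
SUI Bing-cai: male, born in 1981 in Yantai, Shandong Province. He is an associate research fellow at the College of Computer Science and Technology, National University of Defense Technology. His main research interest is microprocessor architecture. E-mail: bingcaisui@nudt.edu.cn
SHEN Li: male, born in 1976 in Chengdu, Sichuan Province. He is a professor at the College of Computer Science and Technology, National University of Defense Technology. His main research interests are multi-core/many-core architecture, runtime and compiler optimization, and high-performance computing. E-mail: lishen@nudt.edu.cn
XU Wei-xia: male, born in 1963 in Changde, Hunan Province. He is a research fellow at the College of Computer Science and Technology, National University of Defense Technology. His main research interest is high-performance computer architecture. E-mail: xuweixia@nudt.edu.cn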
Received: 2023-03-06
Revised: 2023-09-14
Published in print: 2023-12-25
HUANG Li-bo, YANG Ling, YANG Qian-ming, et al. Research on Processor Value Prediction[J]. Acta Electronica Sinica, 2023, 51(12): 3591-3618. DOI: 10.12263/DZXB.20230206.
The severe imbalance between processor performance and memory bandwidth and latency limits the overall performance of computing systems. Memory performance is insensitive to process technology, so in the post-Moore era it is difficult to gain further performance through iterations of the integrated circuit manufacturing process, and attention has shifted to architectural innovation as the route to higher-performance computing systems. Processor value prediction can effectively alleviate the memory wall problem without changing the memory system: by speculatively breaking true data dependences, it lets more instructions execute in parallel in an out-of-order processor instead of waiting for long-latency instructions such as memory accesses. Value prediction has made substantial progress in recent years, yet no commercial processor uses it, mainly because it still faces several challenges: the pipeline architectures of existing processors cannot use value prediction directly; the mechanism for delivering predicted values requires additional hardware resources, notably register file read and write ports that are physically challenging to implement; the large storage overhead of value predictors makes them hard to implement on chip; and because the performance penalty of a value misprediction is high, a predictor with low accuracy reduces processor performance. To address these problems, this paper surveys domestic and international research on value prediction along three aspects, namely pipeline architecture, value predictor structure, and misprediction recovery mechanisms, and discusses how each line of work tackles these challenges. Finally, it summarizes the current state of processor value prediction and gives an outlook on future research directions.
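To make the accuracy concern above concrete, the following is a minimal sketch of a last-value predictor guarded by a saturating confidence counter. It is an illustration only and not a design taken from this paper: the class names, the unbounded table indexed by instruction address, and the threshold of 3 are all assumptions chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Entry:
    value: int = 0       # last committed result seen for this instruction
    confidence: int = 0  # saturating counter, reset on any mismatch


@dataclass
class LastValuePredictor:
    # Hypothetical threshold: predict only after this many repeats in a row.
    threshold: int = 3
    # Assumption: an unbounded table keyed by PC (real hardware is finite).
    table: Dict[int, Entry] = field(default_factory=dict)

    def predict(self, pc: int) -> Optional[int]:
        """Return a predicted result, or None when confidence is too low."""
        entry = self.table.get(pc)
        if entry is not None and entry.confidence >= self.threshold:
            return entry.value
        return None

    def train(self, pc: int, actual: int) -> None:
        """Update the entry with the committed result of the instruction."""
        entry = self.table.setdefault(pc, Entry())
        if entry.value == actual:
            entry.confidence = min(entry.confidence + 1, self.threshold)
        else:
            entry.value = actual
            entry.confidence = 0


def run(trace: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Replay (pc, result) pairs; return (predictions used, correct ones)."""
    vp = LastValuePredictor()
    used = correct = 0
    for pc, actual in trace:
        guess = vp.predict(pc)
        if guess is not None:
            used += 1
            correct += int(guess == actual)
        vp.train(pc, actual)
    return used, correct
```

For example, `run([(0x40, 7)] * 6 + [(0x40, 9)])` uses three predictions, of which two are correct: the predictor stays silent until the value 7 has repeated often enough, and the final change to 9 causes one misprediction and resets the counter. Raising the threshold trades coverage for accuracy, which is the trade-off that matters when each misprediction carries a large recovery penalty.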