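1. Microprocessor Research and Development Center, Peking University, Beijing 100871
2. Engineering Research Center of Microprocessor and System, Ministry of Education, Peking University, Beijing 100871
Print publication: 2012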
DANG Xiang-lei, WANG Xiao-yin, TONG Dong, et al. Pre-Execution Directed Prefetching for In-Order Processors[J]. Acta Electronica Sinica, 2012, 40(11): 2145-2151. DOI: 10.3969/j.issn.0372-2112.2012.11.001.
This paper proposes pre-execution directed prefetching (PEDP), a data prefetching method that improves the memory latency tolerance of in-order processors. PEDP uses a stride prefetcher to cover regular access patterns and, when an L2 cache miss occurs, pre-executes the subsequent instructions to generate accurate prefetches for irregular access patterns; combining the two techniques improves prefetch coverage. PEDP also uses the actual memory access information captured early during pre-execution to direct the stride prefetcher's updates. Under this direction, the stride prefetcher can issue prefetches earlier than pre-execution for addresses that pre-execution would generate and that fit a stride pattern, which improves prefetch timeliness. In addition, an update filter removes harmful updates to the stride prefetcher during the directing process, which improves prefetch accuracy. Experimental results show that, on average, PEDP improves performance by 33.0% over the baseline processor, and by 16.2% and 7.3% over stride prefetching alone and pre-execution alone, respectively.
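The interaction between pre-execution and the stride prefetcher can be illustrated with a minimal Python sketch. It is not the paper's design: the abstract does not specify the table organization or the filtering rule, so the PC-indexed table below is a textbook stride predictor, and `update_filter`, the `addr_valid` flag, the trace format, and all other names are assumptions made for illustration. The assumed filter simply drops pre-executed loads whose address could not be computed correctly (for example, because it depends on the data of the missing load).

```python
from dataclasses import dataclass


@dataclass
class Entry:
    last_addr: int
    stride: int = 0
    confident: bool = False


class StridePrefetcher:
    """Textbook PC-indexed stride table (an assumption, not the paper's design)."""

    def __init__(self, degree=1):
        self.table = {}       # load PC -> Entry
        self.degree = degree  # number of strides to prefetch ahead

    def update(self, pc, addr):
        """Train on one load and return the addresses to prefetch."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = Entry(last_addr=addr)
            return []
        stride = addr - entry.last_addr
        # Confident only when the same non-zero stride is seen twice in a row.
        entry.confident = stride == entry.stride and stride != 0
        entry.stride = stride
        entry.last_addr = addr
        if not entry.confident:
            return []
        return [addr + stride * k for k in range(1, self.degree + 1)]


def update_filter(addr_valid):
    """Hypothetical filter: accept a pre-executed load only if its address is valid."""
    return addr_valid


def run(trace, prefetcher, use_filter=True):
    """Feed (mode, pc, addr, addr_valid) records to the prefetcher.

    mode is 'normal' for committed loads and 'pre' for loads seen during
    pre-execution after an L2 miss. Returns all prefetch addresses issued.
    """
    issued = []
    for mode, pc, addr, addr_valid in trace:
        if mode == "pre" and use_filter and not update_filter(addr_valid):
            continue  # harmful update: would corrupt the learned stride
        issued.extend(prefetcher.update(pc, addr))
    return issued
```

With a trace of three normal loads at addresses 1000, 1064, 1128 from one PC, followed by an invalid pre-executed load and a valid pre-executed load at 1192, the filtered run issues prefetches for 1192 and 1256. Without the filter, the invalid address resets the learned stride and only 1192 is issued, which is the kind of harmful update the abstract refers to.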