On the interactions between ILP and TLP with hardware transactional memory, Microprocessors and Microsystems


Microprocessors and Microsystems (IF 2.6), Pub Date: 2023-11-19, DOI: 10.1016/j.micpro.2023.104975
Víctor Nicolás-Conesa, Rubén Titos-Gil, Ricardo Fernández-Pascual, Alberto Ros, Manuel E. Acacio

Hardware implementations of Transactional Memory (HTM) are designed to facilitate efficient thread synchronization in parallel programs, encouraging the use of larger critical sections. By employing optimistic concurrency control to execute transactions speculatively, HTM systems promise to deliver the performance benefits typically associated with fine-grained locks. In doing so, HTM systems must deal with transaction aborts. Although under certain conditions aborts may be caused by the inherent limitations of the hardware structures used to implement TM (e.g., caches), conflicting concurrent accesses to shared memory locations are generally the prevailing cause for squashing the work done by a transaction.
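The abort-and-retry behaviour described above can be illustrated with a small software analogy. The sketch below is not HTM and is not taken from the paper: it assumes a single shared value guarded by a version counter, and the names `VersionedCell`, `run_transaction`, and `demo` are hypothetical. Each "transaction" works speculatively on a snapshot and commits only if no other thread has committed in the meantime; otherwise its work is discarded and it retries.

```python
import threading


class VersionedCell:
    """Shared value plus a version counter (a software stand-in for conflict detection)."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()  # held only for the short commit step

    def try_commit(self, read_version, new_value):
        """Commit if nobody committed since `read_version` was read; else report an abort."""
        with self._commit_lock:
            if self.version != read_version:
                return False  # conflict: the speculative work is squashed
            self.value = new_value
            self.version += 1
            return True


def run_transaction(cell, update):
    """Apply `update` optimistically, retrying after each abort. Returns the abort count."""
    aborts = 0
    while True:
        read_version = cell.version   # read the version first, then the value
        snapshot = cell.value
        new_value = update(snapshot)  # speculative work on a private copy
        if cell.try_commit(read_version, new_value):
            return aborts
        aborts += 1


def demo(num_threads=4, increments=1000):
    """Run concurrent increments; returns (final value, total aborts observed)."""
    cell = VersionedCell(0)
    abort_counts = [0] * num_threads

    def worker(idx):
        for _ in range(increments):
            abort_counts[idx] += run_transaction(cell, lambda v: v + 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value, sum(abort_counts)


if __name__ == "__main__":
    final_value, total_aborts = demo()
    print("final value:", final_value, "aborts:", total_aborts)
```

The final value always equals the number of committed increments, while the abort count varies from run to run with thread timing. That count is the quantity that grows as more threads contend for the same location.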

In this study, we present what is, to the best of our knowledge, the first characterization of how the aggressiveness of processor cores, particularly their ability to exploit instruction-level parallelism (ILP), interacts with the support for optimistic thread-level speculation offered by HTM systems. We have observed that adjusting the size of the structures that facilitate out-of-order and speculative execution alters the number of aborts when executing transactional workloads on best-effort HTM implementations. Our findings indicate that in scenarios with high contention, a smaller number of powerful cores is more suitable, whereas in low-contention scenarios, using a larger number of less aggressive cores is preferable. In addition, HTM systems that employ lazy detection, and those that employ eager detection with requester-stalls resolution, benefit from using simpler cores. In conclusion, abort ratios can be reduced by carefully choosing both processor aggressiveness and HTM design aspects for each application, depending on its contention.


Updated: 2023-11-23
