Automatic Discovery of Collective Communication Patterns in Parallelized Task Graphs
International Journal of Parallel Programming (IF 1.5), Pub Date: 2024-03-22, DOI: 10.1007/s10766-024-00767-y
Fabian Knorr, Philip Salzmann, Peter Thoman, Thomas Fahringer

Collective communication APIs equip MPI vendors with the necessary context to optimize cluster-wide operations on the basis of theoretical complexity models and characteristics of the involved interconnects. Modern HPC runtime systems with a programmability focus can perform dependency analysis to eliminate the need for manual communication entirely. Profiting from optimized collective routines in this context often requires global analysis of the implicit point-to-point communication pattern or tight constraints on the data access patterns allowed inside kernels. The Celerity API provides a high degree of freedom for both runtime implementors and application developers by tying transparent work assignment to data access patterns through user-defined range-mapper functions. Canonically, data dependencies are resolved through an intra-node coherence model and inter-node point-to-point communication. This paper presents Collective Pattern Discovery (CPD), a fully distributed, coordination-free method for detecting collective communication patterns on parallelized task graphs. Through extensive scheduling and communication microbenchmarks as well as a strong scaling experiment on a compute-intensive application, we demonstrate that CPD can achieve substantial performance gains in the Celerity model.
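
To make the idea of an implicit point-to-point pattern concrete, the following toy sketch derives the transfers implied by per-node data ownership and access sets, then checks whether they form an all-gather. It is not the paper's CPD algorithm and does not use the Celerity API: the chunk model, the dictionaries, and the function names `p2p_transfers` and `looks_like_all_gather` are all assumptions made for illustration. The check is global and centralized, which is the kind of analysis the abstract says CPD avoids.

```python
# Toy model (assumption): data is split into chunks, each owned by exactly
# one node, and every node declares which chunks its kernel will read.

def p2p_transfers(owned, needed):
    """Return the (sender, receiver, chunk) triples implied by the accesses."""
    owner_of = {chunk: node for node, chunks in owned.items() for chunk in chunks}
    transfers = set()
    for receiver, chunks in needed.items():
        for chunk in chunks:
            sender = owner_of[chunk]
            if sender != receiver:  # locally owned data needs no transfer
                transfers.add((sender, receiver, chunk))
    return transfers


def looks_like_all_gather(owned, transfers):
    """True if every node sends each chunk it owns to every other node."""
    nodes = list(owned)
    expected = {
        (s, r, c) for s in nodes for r in nodes if s != r for c in owned[s]
    }
    return len(nodes) > 1 and transfers == expected


# Four nodes, node n owns chunk n.
owned = {n: {n} for n in range(4)}
# Every node reads everything: the transfers form an all-gather.
needs_all = {n: set(range(4)) for n in range(4)}
# Every node reads only itself and its ring neighbors: a stencil-like pattern.
needs_neighbors = {n: {(n - 1) % 4, n, (n + 1) % 4} for n in range(4)}

print(looks_like_all_gather(owned, p2p_transfers(owned, needs_all)))        # True
print(looks_like_all_gather(owned, p2p_transfers(owned, needs_neighbors)))  # False
```

In the first case a runtime could replace the twelve individual sends with a single collective call; in the second, point-to-point transfers remain the natural fit.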




Updated: 2024-03-22