The Journal of Supercomputing (IF 3.3), Pub Date: 2024-04-15, DOI: 10.1007/s11227-024-06097-7
Inmaculada Santamaria-Valenzuela, Rocío Carratalá-Sáez, Yuri Torres, Diego R. Llanos, Arturo Gonzalez-Escribano
Much work has been devoted to improving matrix product computation, as it is used in a wide variety of scientific applications across many different fields. In this work, we propose alternative data distribution policies and communication patterns to reduce the elapsed time of triangular matrix products in distributed-memory environments. In particular, we focus on commodity clusters, where the number of nodes is limited, and propose alternatives to the traditional approaches in order to improve the performance of this operation. Our proposal outperforms state-of-the-art libraries such as ScaLAPACK and SLATE, running up to 30% faster.
Title (translated from the Chinese): Improving the performance of triangular matrix products in commodity clusters
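To illustrate why data distribution matters for a triangular matrix product, the sketch below compares how evenly the work is spread across processes under two generic row distributions. It is not the distribution policy proposed in the paper, which the abstract does not describe. The function names, the assumption that the cost of row i is proportional to its i + 1 stored entries in a lower-triangular matrix, and the requirement that the matrix size be divisible by the process count are all assumptions made for this example.

```python
def row_work(n):
    # Assumption: row i of an n x n lower-triangular matrix holds i + 1
    # stored entries, and its share of the product costs proportionally.
    return [i + 1 for i in range(n)]


def block_distribution(n, p):
    """Load per process when each owns n // p contiguous rows.

    Assumes n is divisible by p (hypothetical simplification).
    """
    work = row_work(n)
    size = n // p
    return [sum(work[r * size:(r + 1) * size]) for r in range(p)]


def cyclic_distribution(n, p):
    """Load per process when row i is owned by process i % p."""
    work = row_work(n)
    return [sum(work[r::p]) for r in range(p)]


def imbalance(loads):
    # Ratio of the heaviest load to the mean load; 1.0 is perfect balance.
    return max(loads) / (sum(loads) / len(loads))


if __name__ == "__main__":
    n, p = 8, 4
    for name, dist in (("block", block_distribution),
                       ("cyclic", cyclic_distribution)):
        loads = dist(n, p)
        print(name, loads, round(imbalance(loads), 3))
```

Under these assumptions, contiguous blocks leave the process that owns the last rows with far more work than the one that owns the first rows, while a cyclic assignment narrows the gap. This is the kind of imbalance that a distribution policy for triangular operands has to address, alongside communication cost, which this sketch ignores.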