Newly Released Capabilities in the Distributed-Memory SuperLU Sparse Direct Solver
ACM Transactions on Mathematical Software (IF 2.7), Pub Date: 2023-03-21, DOI: https://dl.acm.org/doi/10.1145/3577197
Xiaoye S. Li, Paul Lin, Yang Liu, Piyush Sao

We present the new features available in the recent release of SuperLU_DIST, Version 8.1.1. SuperLU_DIST is a distributed-memory parallel sparse direct solver. The new features include (1) a 3D communication-avoiding algorithm framework that reduces inter-process communication at the cost of selective memory duplication, (2) multi-GPU support for both NVIDIA and AMD GPUs, and (3) mixed-precision routines that perform single-precision LU factorization followed by double-precision iterative refinement. Apart from the algorithmic improvements, we also modernized the software build system to use the CMake and Spack package installation tools, which simplifies the installation procedure. Throughout the article, we describe in detail the performance-sensitive parameters associated with each new algorithmic feature, show how they are exposed to users, and give general guidance on how to set them. We illustrate that the solver’s performance, in both time and memory, can be greatly improved by systematic tuning of these parameters, depending on the input sparse matrix and the underlying hardware.
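The mixed-precision idea in item (3) can be illustrated with a small dense toy, given here as a minimal sketch only. It is not SuperLU_DIST code and does not use its API: the function names `lu_factor_single`, `lu_solve_single`, and `solve_mixed`, the tolerance, and the iteration cap are all assumptions made for illustration, and NumPy is assumed to be available. The matrix is factored once in single precision, and the solution is then corrected using residuals computed in double precision.

```python
import numpy as np


def lu_factor_single(A):
    """Dense LU with partial pivoting, computed and stored in float32.

    Hypothetical helper for illustration; assumes A is square and nonsingular.
    """
    LU = np.array(A, dtype=np.float32)  # working copy in single precision
    n = LU.shape[0]
    perm = np.arange(n)
    for k in range(n - 1):
        # Pick the largest pivot in column k and swap it into row k.
        p = k + int(np.argmax(np.abs(LU[k:, k])))
        if p != k:
            LU[[k, p]] = LU[[p, k]]
            perm[[k, p]] = perm[[p, k]]
        LU[k + 1:, k] /= LU[k, k]  # multipliers (column of L)
        LU[k + 1:, k + 1:] -= np.outer(LU[k + 1:, k], LU[k, k + 1:])
    return LU, perm


def lu_solve_single(LU, perm, b):
    """Solve with the float32 factors: forward then backward substitution."""
    n = LU.shape[0]
    x = np.asarray(b, dtype=np.float32)[perm]  # permuted copy of b
    for i in range(1, n):  # L has an implicit unit diagonal
        x[i] -= LU[i, :i] @ x[:i]
    for i in range(n - 1, -1, -1):
        x[i] = (x[i] - LU[i, i + 1:] @ x[i + 1:]) / LU[i, i]
    return x


def solve_mixed(A, b, tol=1e-12, max_iter=20):
    """Single-precision factorization plus double-precision refinement."""
    A = np.asarray(A, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    LU, perm = lu_factor_single(A)  # factor once, in float32
    x = lu_solve_single(LU, perm, b).astype(np.float64)
    for _ in range(max_iter):
        r = b - A @ x  # residual in float64
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # Correction reuses the cheap float32 factors.
        x += lu_solve_single(LU, perm, r).astype(np.float64)
    return x
```

For a well-conditioned matrix, a few refinement steps are usually enough to bring the solution close to double-precision accuracy, while the factorization itself works in single precision and stores its factors at half the size. Refinement can stall or diverge on ill-conditioned systems, which is one reason such routines expose tunable parameters.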




Updated: 2023-03-21