Cerberus: Triple Mode Acceleration of Sparse Matrix and Vector Multiplication
ACM Transactions on Architecture and Code Optimization (IF 1.6), Pub Date: 2024-03-17, DOI: 10.1145/3653020
Soojin Hwang, Daehyeon Baek, Jongse Park, Jaehyuk Huh

Sparse matrix-vector multiplication (SpMV) is one of the most widely used kernels in high-performance computing and in machine learning acceleration for sparse neural networks. The design space of SpMV accelerators has two axes: algorithm and matrix representation. Two algorithms, scalar multiplication and dot product, can be combined with two sparse data representations, compressed sparse and bitmap formats, for the matrix and vector. Although prior accelerators each adopted one of the possible designs, it has yet to be investigated which design is best across different hardware resources and workload characteristics. This paper first investigates the impact of design choices with respect to the algorithm and data representation. Our evaluation shows that no single design always outperforms the others across different workloads, but the two best designs (i.e., compressed sparse format and bitmap format with dot product) have complementary performance, with trade-offs determined by the matrix characteristics. Based on this analysis, this study proposes Cerberus, a triple-mode accelerator supporting two sparse operation modes in addition to the base dense mode. To enable such multi-mode operation, it proposes a prediction model based on matrix characteristics under a given hardware configuration, which statically selects the best mode for a given sparse matrix from its dimension and density information. Our experimental results show that Cerberus provides a 12.1× performance improvement over a dense-only accelerator and a 1.5× improvement over a fixed best SpMV design.
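As a rough software illustration of the two designs the abstract singles out, the sketch below computes SpMV with the dot-product algorithm over a compressed sparse layout and over a bitmap layout. It is not the Cerberus hardware design: the function names, the choice of CSR as the compressed sparse format, the row-wise bitmap layout, and the use of a dense input vector are all assumptions made for this example.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def to_csr(dense: Matrix) -> Tuple[List[float], List[int], List[int]]:
    """Compress a dense matrix into CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends inside `values`.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def to_bitmap(dense: Matrix) -> Tuple[List[List[int]], List[float]]:
    """Store one presence bit per element plus the packed nonzero values."""
    bitmap = [[1 if v != 0 else 0 for v in row] for row in dense]
    values = [v for row in dense for v in row if v != 0]
    return bitmap, values


def spmv_csr_dot(values, col_idx, row_ptr, x):
    """Dot product per row, visiting only the stored nonzeros."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def spmv_bitmap_dot(bitmap, values, x):
    """Dot product per row, scanning the bits to locate each nonzero."""
    y = []
    k = 0  # index of the next unread packed value
    for bits in bitmap:
        acc = 0
        for j, bit in enumerate(bits):
            if bit:
                acc += values[k] * x[j]
                k += 1
        y.append(acc)
    return y


if __name__ == "__main__":
    dense = [[0, 2, 0], [1, 0, 3], [0, 0, 0]]
    x = [1, 2, 3]
    print(spmv_csr_dot(*to_csr(dense), x))
    print(spmv_bitmap_dot(*to_bitmap(dense), x))
```

Both routines return the same result vector; they differ in metadata cost. The compressed format stores one column index per nonzero, while the bitmap stores one bit per matrix element regardless of density, which is one plausible reason the better format would depend on the matrix's dimension and density.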




Updated: 2024-03-17