Extension VM: Interleaved Data Layout in Vector Memory
ACM Transactions on Architecture and Code Optimization ( IF 1.6 ) Pub Date : 2023-11-07 , DOI: 10.1145/3631528
Dunbo Zhang 1 , Qingjie Lang 1 , Ruoxi Wang 1 , Li Shen 2

Vector architectures are widely employed in processors for neural networks, signal processing, and high-performance computing; however, their performance is limited by inefficient column-major memory access. This limitation originates from the unsuitable mapping of multidimensional data structures onto two-dimensional vector memory spaces. In addition, the traditional data layout mapping method creates an irreconcilable conflict between row- and column-major accesses. Ideally, both row- and column-major accesses would take advantage of the bank parallelism of vector memory.
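
The conflict described above can be illustrated with a minimal sketch. It assumes a conventional layout in which a matrix is stored row by row and the element at linear address a is placed in bank a mod NUM_BANKS. The bank count, the matrix width, and all function names are illustrative assumptions and are not taken from the paper.

```python
NUM_BANKS = 8  # assumed number of vector-memory banks (illustrative)
COLS = 8       # assumed matrix row width, a multiple of NUM_BANKS

def bank_conventional(row, col):
    """Bank index under a plain row-by-row layout (assumed mapping)."""
    return (row * COLS + col) % NUM_BANKS

def banks_touched(coords):
    """Set of distinct banks used by one vector access."""
    return {bank_conventional(r, c) for r, c in coords}

# One vector access along row 0 and one along column 0.
row_access = [(0, c) for c in range(NUM_BANKS)]
col_access = [(r, 0) for r in range(NUM_BANKS)]

print(len(banks_touched(row_access)))  # 8: every element in a different bank
print(len(banks_touched(col_access)))  # 1: every element in the same bank
```

Under these assumptions the row access uses all eight banks in parallel, while the column access serializes on a single bank, which is the kind of column-major inefficiency the abstract refers to.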

To this end, we propose the Interleaved Data Layout (IDL) method for vector memory, which distributes vector elements into different banks regardless of whether they belong to a row- or column-major access, so that any vector memory access can benefit from bank parallelism. Additionally, we propose an Extension Vector Memory (EVM) architecture to implement IDL in vector memory. EVM can support two data layout methods and vector memory access modes simultaneously. The key idea is to continuously distribute the data to be accessed from main memory to different banks during the loading period. Thus, EVM can provide a higher level of spatial locality through careful programming and extension ISA support.
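
The abstract does not give the actual IDL mapping. As a hedged illustration of the general idea of spreading both rows and columns across banks, the sketch below uses a classic skewed storage scheme, bank = (row + col) mod NUM_BANKS. This mapping, the bank count, and the function names are assumptions chosen for illustration and should not be read as the paper's method.

```python
NUM_BANKS = 8  # assumed number of vector-memory banks (illustrative)

def bank_skewed(row, col):
    """Assumed skewed mapping: each row is rotated by its row index."""
    return (row + col) % NUM_BANKS

def distinct_banks(coords):
    """Number of distinct banks used by one vector access."""
    return len({bank_skewed(r, c) for r, c in coords})

# One vector access along row 3 and one along column 5.
row_access = [(3, c) for c in range(NUM_BANKS)]
col_access = [(r, 5) for r in range(NUM_BANKS)]

print(distinct_banks(row_access))  # 8
print(distinct_banks(col_access))  # 8
```

With this assumed mapping, both the row access and the column access touch all eight banks, so neither access direction is serialized on a single bank.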

The experimental results show that the proposed architecture achieves a 1.43-fold improvement over state-of-the-art vector processors, with an area cost of only 1.73%. Furthermore, energy consumption is reduced by 50.1%.




Updated: 2023-11-07