G-Learned Index: Enabling Efficient Learned Index on GPU
IEEE Transactions on Parallel and Distributed Systems (IF 5.3). Pub Date: 2024-04-02. DOI: 10.1109/tpds.2024.3381214
Jiesong Liu, Feng Zhang, Lv Lu, Chang Qi, Xiaoguang Guo, Dong Deng, Guoliang Li, Huanchen Zhang, Jidong Zhai, Hechen Zhang, Yuxing Chen, Anqun Pan, Xiaoyong Du

AI and GPU technologies have been widely applied to solve Big Data problems. The total data volume worldwide reached 200 zettabytes in 2022, and efficiently indexing the required content among such massive data has become a serious challenge. Recently, a promising learned index has been proposed to address this challenge: it achieves extremely high efficiency while incurring only marginal space overhead. However, we notice that previous learned indexes have mainly targeted CPU architectures, ignoring the advantages of GPUs. Because traditional indexes like B-Tree, LSM, and bitmap have greatly benefited from GPU acceleration, combining a learned index with the GPU has great potential to achieve tremendous speedups. In this paper, we propose a GPU-based learned index, called G-Learned Index, to significantly improve the performance of learned index structures. The primary challenges in developing G-Learned Index lie in exploiting thousands of GPU cores (minimizing synchronization and branch divergence), designing data structures for parallel operations, and utilizing memory bandwidth (limiting memory transactions and exploiting the multi-level memory hierarchy). To overcome these challenges, we develop a series of novel techniques, including efficient thread organization, succinct data structures, and heterogeneous memory hierarchy utilization. Compared to the state-of-the-art learned index, the proposed G-Learned Index achieves an average speedup of 174× (and 107× over its parallel version). Meanwhile, it attains 2× lower query time than the state-of-the-art GPU B-Tree. Our further exploration of range queries shows that G-Learned Index is 17× faster than a CPU multi-dimensional learned index.
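The abstract gives no code, so the following CUDA sketch only illustrates the generic learned-index lookup pattern the paper builds on, not the authors' implementation: each GPU thread evaluates a simple linear model to predict a key's position in a sorted array, then runs a bounded binary search inside the model's error window. All names here (LinearModel, lookup_kernel, the err field) and the synthetic data are hypothetical.

// Minimal sketch of a learned-index lookup on GPU (assumed design, not the
// paper's). Compile with: nvcc -o lookup lookup.cu
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

struct LinearModel {
    double slope;      // fitted over the sorted key array (hypothetical fit)
    double intercept;
    int    err;        // max prediction error observed at training time
};

__global__ void lookup_kernel(const long long *keys, int n,
                              const long long *queries, int *results,
                              int num_queries, LinearModel m) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= num_queries) return;

    long long q = queries[tid];
    // The model predicts an approximate position; clamp to the valid range.
    int pred = (int)(m.slope * (double)q + m.intercept);
    int lo = max(0, pred - m.err);
    int hi = min(n - 1, pred + m.err);

    // Bounded binary search inside the error window. Every thread executes
    // the same short, fixed-pattern loop, which keeps branch divergence low.
    int pos = -1;
    while (lo <= hi) {
        int mid = (lo + hi) >> 1;
        long long k = keys[mid];
        if (k == q)     { pos = mid; break; }
        else if (k < q)   lo = mid + 1;
        else              hi = mid - 1;
    }
    results[tid] = pos;  // -1 if the key is absent
}

int main() {
    const int n = 1 << 20;
    std::vector<long long> h_keys(n);
    for (int i = 0; i < n; ++i) h_keys[i] = 2LL * i;  // keys: 0, 2, 4, ...

    // For perfectly linear data the fit is exact; a small error window
    // mimics the imperfect fits seen on real distributions.
    LinearModel m{0.5, 0.0, 4};

    std::vector<long long> h_q = {0, 2468, 2097150, 7};  // 7 is absent
    int nq = (int)h_q.size();

    long long *d_keys, *d_q; int *d_res;
    cudaMalloc((void**)&d_keys, n * sizeof(long long));
    cudaMalloc((void**)&d_q, nq * sizeof(long long));
    cudaMalloc((void**)&d_res, nq * sizeof(int));
    cudaMemcpy(d_keys, h_keys.data(), n * sizeof(long long), cudaMemcpyHostToDevice);
    cudaMemcpy(d_q, h_q.data(), nq * sizeof(long long), cudaMemcpyHostToDevice);

    lookup_kernel<<<(nq + 255) / 256, 256>>>(d_keys, n, d_q, d_res, nq, m);

    std::vector<int> h_res(nq);
    cudaMemcpy(h_res.data(), d_res, nq * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < nq; ++i)
        printf("query %lld -> position %d\n", h_q[i], h_res[i]);

    cudaFree(d_keys); cudaFree(d_q); cudaFree(d_res);
    return 0;
}

Because the model replaces most of a tree traversal with one arithmetic evaluation plus a search bounded by the error window, threads in a warp follow nearly identical control flow; this is one way the synchronization and branch-divergence concerns raised in the abstract can be addressed.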
