Morphtree: a polymorphic main-memory learned index for dynamic workloads
The VLDB Journal (IF 4.2), Pub Date: 2023-12-01, DOI: 10.1007/s00778-023-00823-y
Yongping Luo, Peiquan Jin, Zhaole Chu, Xiaoliang Wang, Yigui Yuan, Zhou Zhang, Yun Luo, Xufei Wu, Peng Zou

Modern database systems rely on indexes to accelerate data access. The recently proposed learned indexes can offer higher search performance with lower space costs than traditional indexes such as the B+-tree. We observe that existing main-memory learned indexes are particularly optimized for read-heavy workloads. However, such an optimization comes at the cost of model training and of handling out-of-range key insertions, which worsens the overall performance. We argue that workloads are not always read-heavy in real applications, and that it is more important and practical to make learned indexes work efficiently for dynamic workloads with changing access patterns and data distributions. In this paper, we aim to improve the practicality of learned indexes by making them adaptive to dynamic workloads. Specifically, we propose a new polymorphic learned index named Morphtree, which can adaptively change the index structure to provide stable and high performance for dynamic workloads. The novelty of Morphtree lies in three aspects: (1) a decoupled tree structure that separates the inner search tree from the data layer consisting of leaf nodes, (2) a read-optimized learned inner tree that improves the performance of index search, and (3) an evolving data layer that automatically transforms node layouts into read-friendly or write-friendly forms according to workload changes. We evaluate these new ideas of Morphtree on various datasets and workloads. The comparative results with six up-to-date learned indexes, including ALEX, PGM-index, FITing-tree, LIPP, FINEdex, and XIndex, show that Morphtree can achieve, on average, 0.56x and 3x improvements in lookup and insertion performance, respectively.
Moreover, when evaluated on dynamic workloads with changing lookup ratios and data distributions, Morphtree can achieve a sustained high throughput across different real-world datasets and query patterns, owing to its ability to automatically adjust the index structure according to workload changes.
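
The abstract does not say how the evolving data layer decides when to transform a node, so the following is only a rough illustration of the general idea, not Morphtree's actual algorithm. It sketches a toy leaf that keeps its keys sorted (read-friendly, binary search) while lookups dominate and switches to an append-only layout (write-friendly, linear scan) while insertions dominate. The class name `AdaptiveLeaf`, the `window` and `read_threshold` parameters, and the ratio-based switching rule are all assumptions made for this example.

```python
import bisect


class AdaptiveLeaf:
    """Toy leaf node with two layouts (hypothetical, not from the paper)."""

    def __init__(self, window=8, read_threshold=0.5):
        self.keys = []               # sorted in read-friendly mode, append order otherwise
        self.sorted_layout = True    # start in the read-friendly layout
        self.window = window         # assumed: operations observed before re-deciding
        self.read_threshold = read_threshold  # assumed: minimum read ratio for sorted layout
        self.reads = 0
        self.writes = 0

    def _maybe_morph(self):
        """Re-pick the layout once a full window of operations has been seen."""
        total = self.reads + self.writes
        if total < self.window:
            return
        want_sorted = self.reads / total >= self.read_threshold
        if want_sorted and not self.sorted_layout:
            self.keys.sort()         # pay the sorting cost once, at switch time
        self.sorted_layout = want_sorted
        self.reads = self.writes = 0

    def insert(self, key):
        if self.sorted_layout:
            bisect.insort(self.keys, key)   # keeps order, shifts elements
        else:
            self.keys.append(key)           # cheap append, order not maintained
        self.writes += 1
        self._maybe_morph()

    def contains(self, key):
        if self.sorted_layout:
            i = bisect.bisect_left(self.keys, key)
            found = i < len(self.keys) and self.keys[i] == key
        else:
            found = key in self.keys        # linear scan over the unsorted keys
        self.reads += 1
        self._maybe_morph()
        return found
```

With `window=4`, four consecutive inserts move the leaf to the append-only layout, and a following window in which lookups dominate sorts the keys again and restores binary search. A real learned index would replace the binary search with a model-predicted position, which this sketch deliberately leaves out.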




Updated: 2023-12-01