Abstract
Log-Structured Merge Key-Value stores (LSM KVs) are designed to offer good write performance by capturing client writes in memory and only later flushing them to storage. Flushed writes are subsequently compacted into a tree-like data structure on disk to improve read performance and to reduce storage space use. It has been widely documented that compactions severely hamper throughput, and various optimizations have successfully dealt with this problem. These techniques include, among others, rate-limiting flushes and compactions, selecting among candidate compactions for maximum effect, and limiting compactions to the highest level of the tree by so-called fragmented LSMs.
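The write path described above (buffer in memory, flush to sorted runs, compact runs) can be illustrated with a toy sketch; class and method names here are illustrative only and do not reflect any particular LSM KV implementation:

```python
import bisect

class TinyLSM:
    """Toy LSM write path: buffer writes in memory, flush them as sorted
    runs, and compact runs to bound read cost. Illustrative names only."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}              # in-memory buffer for client writes
        self.runs = []                  # "on-disk" sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as a new sorted run (here: a sorted list).
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def compact(self):
        # Merge all runs into one, keeping the newest value for each key.
        merged = {}
        for run in reversed(self.runs):  # oldest first, so newer overwrites
            merged.update(dict(run))
        self.runs = [sorted(merged.items())]

    def get(self, key):
        if key in self.memtable:         # newest data first
            return self.memtable[key]
        for run in self.runs:            # then newest run to oldest
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Without compaction, a read may probe every run; compaction trades write bandwidth now for fewer runs (and thus cheaper reads and less space) later, which is exactly the tension the optimizations above manage.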
In this article, we focus on latencies rather than throughput. We first document the fact that LSM KVs exhibit high tail latencies. The techniques that have been proposed for optimizing throughput do not address this issue and, in some cases, exacerbate it. The root cause of these high tail latencies is interference between client writes, flushes, and compactions. Another major cause of tail latency is the heterogeneous nature of the workloads in terms of operation mix and item sizes, whereby a few computationally heavier requests slow down the vast majority of smaller requests.
We introduce the notion of an Input/Output (I/O) bandwidth scheduler for an LSM-based KV store to reduce tail latency caused by interference of flushing and compactions and by workload heterogeneity. We explore three techniques as part of this I/O scheduler: (1) opportunistically allocating more bandwidth to internal operations during periods of low load, (2) prioritizing flushes and compactions at the lower levels of the tree, and (3) separating client requests by size and by data access path. SILK+ is a new open-source LSM KV that incorporates this notion of an I/O scheduler.
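The first two techniques can be sketched as a simple priority-based bandwidth allocator. This is a minimal illustration, not SILK+'s actual implementation; all names, units, and the fixed-floor policy are assumptions made for the sketch:

```python
import heapq
from dataclasses import dataclass, field

# Lower value = higher priority: flushes first, then low-level
# compactions, then compactions at higher levels of the tree.
FLUSH, LOW_COMPACT, HIGH_COMPACT = 0, 1, 2

@dataclass(order=True)
class IOTask:
    priority: int
    name: str = field(compare=False)
    mb_left: int = field(compare=False)   # remaining I/O for this task, in MB

class IOScheduler:
    """Sketch of an I/O bandwidth scheduler for internal LSM operations."""

    def __init__(self, total_bw_mbps):
        self.total_bw = total_bw_mbps
        self.tasks = []                    # min-heap ordered by priority

    def submit(self, task):
        heapq.heappush(self.tasks, task)

    def internal_budget(self, client_load_mbps, floor_mbps=10):
        # Opportunistic allocation: internal operations get whatever
        # bandwidth clients are not using, but never less than a small
        # floor so that flushes cannot starve under heavy client load.
        return max(self.total_bw - client_load_mbps, floor_mbps)

    def schedule_tick(self, client_load_mbps):
        # Grant this tick's internal budget to tasks in priority order:
        # flushes and low-level compactions run before high-level ones.
        budget = self.internal_budget(client_load_mbps)
        granted = []
        while budget > 0 and self.tasks:
            task = heapq.heappop(self.tasks)
            grant = min(budget, task.mb_left)
            task.mb_left -= grant
            budget -= grant
            granted.append((task.name, grant))
            if task.mb_left > 0:           # unfinished: requeue for later
                heapq.heappush(self.tasks, task)
        return granted
```

For example, with 200 MB/s of total bandwidth and clients consuming 150 MB/s, a pending memtable flush is served ahead of a queued high-level compaction, and the compaction only receives whatever budget remains. The third technique, separating client requests by size and access path, would sit in front of this scheduler and is not shown.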
Index Terms
- SILK+: Preventing Latency Spikes in Log-Structured Merge Key-Value Stores Running Heterogeneous Workloads