research-article
Just Accepted

iSwap: A New Memory Page Swap Mechanism for Reducing Ineffective I/O Operations in Cloud Environments

Online AM: 23 March 2024

Abstract

This paper proposes iSwap, a new memory page swap mechanism that reduces ineffective I/O swap operations and improves QoS for high-priority applications in cloud environments. iSwap works in the OS kernel. It accurately learns the reuse patterns of memory pages and makes swap decisions accordingly to avoid ineffective operations. When memory pressure is high, iSwap compresses pages that belong to latency-critical (LC) applications (or high-priority applications) and keeps them in main memory, avoiding I/O operations for these LC applications to ensure QoS, while it evicts low-priority applications' pages from main memory. iSwap has low overhead and works well for cloud applications with large memory footprints. We evaluate iSwap on Intel x86 and ARM platforms. The experimental results show that, compared with the latest LRU-based approach widely used in modern OSes, iSwap significantly reduces ineffective swap operations (by 8.0% to 19.2%) and improves QoS for LC applications (by 36.8% to 91.3%) when memory pressure is high.
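
The policy outlined in the abstract can be illustrated with a small user-space sketch. This is not the authors' kernel implementation: the `Page` fields, the `Action` names, the `swap_decision` function, and the rule used when pressure is not high (skip pages predicted to be reused soon) are all hypothetical, chosen only to make the described behavior concrete.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "keep"          # leave the page uncompressed in main memory
    COMPRESS = "compress"  # compress the page but keep it in main memory
    SWAP_OUT = "swap_out"  # evict the page to the swap device


@dataclass
class Page:
    latency_critical: bool  # True if owned by an LC (high-priority) application
    reuse_soon: bool        # hypothetical output of a reuse-pattern predictor


def swap_decision(page: Page, pressure_high: bool) -> Action:
    """Decide what to do with a page that reclaim is considering (toy model)."""
    if pressure_high:
        # From the abstract: LC pages are compressed and stay in memory,
        # while low-priority pages are evicted.
        return Action.COMPRESS if page.latency_critical else Action.SWAP_OUT
    # Assumption: outside high pressure, a page predicted to be reused soon
    # is kept, because swapping it out would likely be followed by a swap-in
    # (an ineffective operation).
    return Action.KEEP if page.reuse_soon else Action.SWAP_OUT


if __name__ == "__main__":
    lc_page = Page(latency_critical=True, reuse_soon=False)
    batch_page = Page(latency_critical=False, reuse_soon=True)
    print(swap_decision(lc_page, pressure_high=True).value)
    print(swap_decision(batch_page, pressure_high=True).value)
    print(swap_decision(batch_page, pressure_high=False).value)
```

The sketch separates the two ideas in the abstract: priority decides between compression and eviction under high pressure, and the reuse prediction decides whether a swap-out is worth doing at all.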


    • Published in

      ACM Transactions on Architecture and Code Optimization, Just Accepted
      ISSN: 1544-3566
      EISSN: 1544-3973

      Copyright © 2024 Copyright held by the owner/author(s).

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Online AM: 23 March 2024
      • Accepted: 30 January 2024
      • Revised: 26 December 2023
      • Received: 24 October 2023