Abstract
Although multi-core processors improve performance, estimating the Worst-Case Execution Time (WCET) of a task remains challenging in such systems because of interference in shared resources such as the Last Level Cache (LLC). Cache partitioning has been used to reduce this interference by dividing the shared cache among threads so that each thread is isolated, which eases WCET estimation. However, partitioning prevents information from being shared among parallel threads running on different cores. In this work, we propose a sharing and reuse aware partitioned cache (SRCP) framework that avoids replicating shared information, whether data or instructions, across different partitions of the LLC. We further propose an enhancement to the existing cache replacement policy that avoids evicting cache blocks shared among multiple cores accessing the partitioned last level cache. The proposed framework thereby ensures a tighter WCET as well as improved resource utilization. Experimental results show that SRCP significantly improves the cache hit rate on the PARSEC and SPLASH2 benchmarks compared with the least recently used (LRU) cache replacement policy, and that it outperforms EHC and TA-DRRIP, which are state-of-the-art replacement policies.
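To make the idea of not evicting blocks shared among cores concrete, the following is a minimal sketch of sharing-aware victim selection in a single cache set. It is an illustration under stated assumptions, not the SRCP design itself: the abstract does not specify the mechanism, so the names `Block`, `SharingAwareSet`, and `_victim`, the per-block sharer set, and the rule "evict the least recently used unshared block, falling back to plain LRU" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    tag: int
    last_use: int                               # logical time of the most recent access
    sharers: set = field(default_factory=set)   # ids of cores that touched the block


class SharingAwareSet:
    """One cache set of a toy model; hypothetical, not the SRCP hardware design."""

    def __init__(self, ways: int):
        self.ways = ways
        self.blocks = {}    # tag -> Block
        self.clock = 0

    def access(self, core: int, tag: int) -> bool:
        """Return True on a hit; on a miss, fill the block and return False."""
        self.clock += 1
        block = self.blocks.get(tag)
        if block is not None:
            block.last_use = self.clock
            block.sharers.add(core)
            return True
        if len(self.blocks) >= self.ways:
            del self.blocks[self._victim().tag]
        self.blocks[tag] = Block(tag, self.clock, {core})
        return False

    def _victim(self) -> Block:
        # Assumed rule: prefer the least recently used block touched by a
        # single core; use plain LRU only when every block is shared.
        private = [b for b in self.blocks.values() if len(b.sharers) < 2]
        candidates = private or list(self.blocks.values())
        return min(candidates, key=lambda b: b.last_use)


if __name__ == "__main__":
    s = SharingAwareSet(ways=2)
    s.access(0, 0xA)    # miss: core 0 fills A
    s.access(1, 0xA)    # hit: A is now shared by cores 0 and 1
    s.access(0, 0xB)    # miss: core 0 fills B
    s.access(0, 0xC)    # miss: set is full, B is evicted although A is older
    print(sorted(hex(t) for t in s.blocks))
```

In this sketch, plain LRU would evict block A on the fourth access because it is the oldest; the assumed rule keeps A because two cores have used it, so the later access to A by core 1 still hits.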
Cite this article
Ghosh, S.N., Bhargava, L. & Sahula, V. SRCP: sharing and reuse-aware replacement policy for the partitioned cache in multicore systems. Des Autom Embed Syst 25, 193–211 (2021). https://doi.org/10.1007/s10617-021-09251-z
DOI: https://doi.org/10.1007/s10617-021-09251-z