Camouflage: Utility-Aware Obfuscation for Accurate Simulation of Sensitive Program Traces

Online AM: 29 February 2024

Abstract

Trace-based simulation is a widely used methodology for system design exploration. It relies on realistic traces that represent the range of behaviors to be evaluated. Such traces contain a great deal of information about the application, its inputs, and the underlying system on which they were generated. Consequently, generating traces from real-world executions risks leaking sensitive information. To prevent this, traces can be obfuscated before release. However, obfuscation can undermine their utility, i.e., how realistically they capture program behavior. To address this, we propose Camouflage, a novel obfuscation framework designed with awareness of the architectural properties required to preserve trace utility, while ensuring secrecy of the inputs used to generate the trace. Focusing on memory access traces, our extensive evaluation on various benchmarks shows that camouflaged traces preserve the performance measurements of the original execution, with an average τ correlation of 0.66. We model input secrecy as an input indistinguishability problem and show that the average security loss is 7.8%, which is better than that of traces generated by the state-of-the-art.
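
To make the utility metric concrete, the following is a minimal sketch of how a rank correlation between original and obfuscated results could be computed. It assumes that τ refers to Kendall's rank correlation coefficient and uses the simple tau-a variant; the function name, the choice of miss rate as the performance measurement, and all numbers are hypothetical illustrations, not values or code from the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical miss rates for five cache configurations, measured once with
# the original trace and once with an obfuscated trace (made-up numbers).
original_miss_rates = [0.31, 0.27, 0.22, 0.18, 0.15]
obfuscated_miss_rates = [0.33, 0.25, 0.26, 0.19, 0.14]

tau = kendall_tau(original_miss_rates, obfuscated_miss_rates)
print(f"tau = {tau:.2f}")  # 9 concordant pairs, 1 discordant pair -> 0.80
```

A value of 1 means the obfuscated trace ranks every design configuration in the same order as the original trace, 0 means no rank agreement, and -1 means the order is fully reversed. A rank-based metric suits design exploration because the relative ordering of candidate designs usually matters more than the absolute measurements.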

        • Published in

          ACM Transactions on Architecture and Code Optimization, Just Accepted
          ISSN: 1544-3566
          EISSN: 1544-3973

          Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Online AM: 29 February 2024
          • Accepted: 6 February 2024
          • Revised: 1 February 2024
          • Received: 20 October 2023
          Qualifiers

          • research-article