Abstract
Trace-based simulation is a widely used methodology for system design exploration. It relies on realistic traces that represent the range of behaviors to be evaluated. Such traces contain substantial information about the application, its inputs, and the underlying system on which they were generated. Consequently, generating traces from real-world executions risks leaking sensitive information. To prevent this, traces can be obfuscated before release. However, obfuscation can undermine their utility, i.e., how realistically the program behavior was captured. To address this, we propose Camouflage, a novel obfuscation framework designed with awareness of the architectural properties required to preserve trace utility while ensuring secrecy of the inputs used to generate the trace. Focusing on memory access traces, our extensive evaluation on various benchmarks shows that camouflaged traces preserve the performance measurements of the original execution, with an average τ correlation of 0.66. We model input secrecy as an input indistinguishability problem and show that the average security loss is 7.8%, which is better than that of traces generated by state-of-the-art techniques.
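As a rough illustration of the utility metric, the sketch below computes a rank correlation between performance measurements taken on an original trace and on its obfuscated counterpart. It assumes that τ denotes Kendall's rank correlation and uses the simple tau-a variant. The function name `kendall_tau`, the choice of miss rate as the measurement, and all numbers are hypothetical and are not taken from the paper's evaluation.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # A pair is concordant when both sequences order items i and j the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs count toward neither total in tau-a.
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical miss rates of five cache configurations (illustrative values only).
original_miss_rates = [0.12, 0.09, 0.15, 0.07, 0.11]
obfuscated_miss_rates = [0.13, 0.10, 0.14, 0.09, 0.08]

tau = kendall_tau(original_miss_rates, obfuscated_miss_rates)
print(f"tau = {tau:.2f}")
```

A value near 1 would mean the obfuscated trace ranks the design points almost identically to the original trace, which is the property a designer comparing configurations relies on. A value near 0 would mean the ranking is largely lost.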
Index Terms
- Camouflage: Utility-Aware Obfuscation for Accurate Simulation of Sensitive Program Traces