Abstract
We in Google's various networking teams would like to increase our collaborations with academic researchers on data-driven networking research. There are some significant constraints on our ability to share data directly, which are not always widely understood in the academic community; this document provides a brief summary. We describe some models that can work, primarily interns and visiting scientists working temporarily as employees, which simplifies the handling of some confidentiality and privacy issues. We also describe some specific areas where we would welcome proposals to work within those models.
Index Terms
- Data-driven networking research: models for academic collaboration with industry (a Google point of view)