research-article

Topology and Geometry of the Third-Party Domains Ecosystem: Measurement and Applications

Published: 19 December 2022

Abstract

Over the years, web content has evolved from simple text and static images hosted on a single server to complex, interactive, multimedia-rich content hosted on different servers. As a result, a modern website, while loading, fetches content not only from its owner's domain but also from a range of third-party domains that provide additional functionality and services. Here, we infer the network of third-party domains by observing the domains' interactions within the browsers of users from all over the globe. We find that this network possesses structural properties commonly found in complex networks, such as a power-law degree distribution, strong clustering, and the small-world property. These properties imply that a hyperbolic geometry underlies the ecosystem's topology. We use statistical inference methods to find the domains' coordinates in this geometry, which abstract how popular and similar the domains are. The hyperbolic map we obtain is meaningful, revealing the large-scale organization of the ecosystem. Furthermore, we show that it possesses predictive power, providing the likelihood that third-party domains are co-hosted, belong to the same legal entity, or will merge under the same entity in the future through company acquisition. We also find that complementarity, rather than similarity, is the dominant force driving future domain mergers. These results provide a new perspective on understanding the ecosystem's organization and on performing related inferences and predictions.
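To make the idea of a hyperbolic map concrete, the following is a minimal sketch of how a distance between two embedded domains could be computed. It assumes the common convention in the hyperbolic network literature that each node has polar coordinates (r, theta) in a hyperbolic plane of curvature -1, with the radial coordinate reflecting popularity and the angular coordinate reflecting similarity; the abstract does not state this convention explicitly. The function name and the example coordinates are hypothetical and are not taken from the paper or its artifacts.

```python
import math


def hyperbolic_distance(r1, theta1, r2, theta2):
    """Distance between two points of the hyperbolic plane (curvature -1).

    Points are given in polar coordinates (r, theta). This is the standard
    hyperbolic law of cosines, not code from the paper.
    """
    # Angular separation folded into the range [0, pi].
    dtheta = math.pi - abs(math.pi - abs(theta1 - theta2) % (2 * math.pi))
    if dtheta == 0.0:
        # Same angle: the points lie on one ray, so distance is radial.
        return abs(r1 - r2)
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    # Guard against rounding pushing the argument slightly below 1.
    return math.acosh(max(arg, 1.0))


# Hypothetical coordinates for three domains (illustration only).
coords = {
    "tracker-a.example": (2.0, 0.10),
    "tracker-b.example": (6.0, 0.15),
    "cdn-c.example": (6.0, 3.00),
}

ra, ta = coords["tracker-a.example"]
for name in ("tracker-b.example", "cdn-c.example"):
    r, t = coords[name]
    print(name, round(hyperbolic_distance(ra, ta, r, t), 3))
```

Under this assumed convention, a smaller distance would correspond to a higher likelihood of a relationship between two domains, which is the kind of quantity the predictions described in the abstract could build on.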




Published in

ACM SIGCOMM Computer Communication Review, Volume 52, Issue 4
October 2022
30 pages
ISSN: 0146-4833
DOI: 10.1145/3577929

Copyright © 2022 Copyright is held by the owner/author(s)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States
