research-article

Protocol Responsibility Offloading to Improve TCP Throughput in Virtualized Environments

Published: 01 August 2013

Abstract

Virtualization is a key technology that powers cloud computing platforms such as Amazon EC2. Virtual machine (VM) consolidation, where multiple VMs share a physical host, has seen rapid adoption in practice, with increasingly large numbers of VMs per machine and per CPU core. Our investigations, however, suggest that the increasing degree of VM consolidation has serious negative effects on the VMs’ TCP performance. When multiple VMs share a given CPU, the scheduling latencies, which can be on the order of tens of milliseconds, substantially increase the typically submillisecond round-trip times (RTTs) of TCP connections in a datacenter, causing significant degradation in throughput. In this article, we propose a lightweight solution, called vPRO, that (a) offloads the VM’s TCP congestion control function to the driver domain to improve TCP transmit performance; and (b) offloads TCP acknowledgment functionality to the driver domain to improve TCP receive performance. Our evaluation of a vPRO prototype on Xen suggests that vPRO substantially improves TCP receive and transmit throughput with minimal per-packet CPU overhead. We further show that the higher TCP throughput leads to improved application-level performance, via experiments with Apache Olio, a Web 2.0 cloud application, and the Intel MPI benchmark.
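The effect of RTT inflation on throughput can be illustrated with the standard window-limited bound, throughput ≤ window / RTT. The sketch below is only a back-of-the-envelope illustration: the 64 KiB window, the 0.5 ms base RTT, and the 30 ms scheduling delay are assumed values chosen to match the orders of magnitude mentioned in the abstract, not measurements from the article, and the function name is hypothetical.

```python
def window_limited_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput (Mbit/s) when at most one window is in flight per RTT."""
    return window_bytes * 8 / rtt_seconds / 1e6


# All three values below are assumptions for illustration only.
WINDOW_BYTES = 64 * 1024      # assumed fixed window of 64 KiB
BASE_RTT_S = 0.0005           # assumed submillisecond datacenter RTT (0.5 ms)
SCHED_DELAY_S = 0.030         # assumed VM scheduling latency (30 ms)

base = window_limited_throughput_mbps(WINDOW_BYTES, BASE_RTT_S)
inflated = window_limited_throughput_mbps(WINDOW_BYTES, BASE_RTT_S + SCHED_DELAY_S)

print(f"bound without scheduling delay: {base:.1f} Mbit/s")
print(f"bound with scheduling delay:    {inflated:.1f} Mbit/s")
print(f"reduction factor:               {base / inflated:.0f}x")
```

Under these assumptions the bound shrinks by the same factor by which the RTT grows. Real TCP behavior is more involved, since the congestion window itself also grows more slowly when acknowledgments are delayed, so this should be read as a rough intuition rather than a model of the system described here.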



    • Published in

      ACM Transactions on Computer Systems  Volume 31, Issue 3
      August 2013
      94 pages
      ISSN:0734-2071
      EISSN:1557-7333
      DOI:10.1145/2518037

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 August 2013
      • Accepted: 1 March 2013
      • Revised: 1 October 2012
      • Received: 1 April 2012


      Qualifiers

      • research-article
      • Research
      • Refereed
