Systemizing Interprocedural Static Analysis of Large-scale Systems Code with Graspan

Published: 29 July 2021

Abstract

Static analysis has been used for more than a decade to find bugs in systems such as Linux. Most of the existing static analyses developed for these systems are simple checkers that find bugs by pattern matching. Although many sophisticated interprocedural analyses exist, few of them have been employed to improve checkers for systems code, because their implementations are complex and they scale poorly.

In this article, we revisit the scalability problem of interprocedural static analysis from a “Big Data” perspective. That is, we turn sophisticated code analysis into Big Data analytics and leverage novel data processing techniques to solve this traditional programming language problem. We propose Graspan, a disk-based parallel graph system that uses an edge-pair centric computation model to compute dynamic transitive closures on very large program graphs. We develop two backends for Graspan, namely, Graspan-C running on CPUs and Graspan-G running on GPUs, and present their designs in the article. Graspan-C can analyze large-scale systems code on any commodity PC; when GPUs are available, Graspan-G can be readily used to achieve orders-of-magnitude speedups by harnessing a GPU’s massive parallelism.
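To make the idea of an edge-pair centric transitive closure concrete, the following is a minimal in-memory sketch, not Graspan's implementation. It assumes the analysis is given as grammar rules of the form `C ::= A B`: whenever an edge labeled `A` from `u` to `v` is followed by an edge labeled `B` from `v` to `w`, an edge labeled `C` from `u` to `w` is added, and the process repeats until no new edge appears. The function name `edge_pair_closure`, the rule encoding, and the worklist strategy are all illustrative assumptions; disk-based partitioning and CPU/GPU parallelism are deliberately omitted.

```python
from collections import defaultdict, deque


def edge_pair_closure(edges, binary_rules):
    """Grammar-guided transitive closure over labeled edges (illustrative sketch).

    edges:        iterable of (src, dst, label) tuples
    binary_rules: dict mapping (left_label, right_label) -> set of result labels,
                  encoding productions of the form  result ::= left right
    Returns the set of all edges, original and derived.
    """
    out_edges = defaultdict(set)  # src -> {(dst, label)}
    in_edges = defaultdict(set)   # dst -> {(src, label)}
    seen = set()
    worklist = deque()

    def add(src, dst, label):
        edge = (src, dst, label)
        if edge not in seen:
            seen.add(edge)
            out_edges[src].add((dst, label))
            in_edges[dst].add((src, label))
            worklist.append(edge)

    for src, dst, label in edges:
        add(src, dst, label)

    while worklist:
        src, dst, label = worklist.popleft()
        # Pair the popped edge (as the left edge) with every edge leaving dst.
        # list(...) takes a snapshot because add() may grow the same set.
        for nxt, right in list(out_edges[dst]):
            for result in binary_rules.get((label, right), ()):
                add(src, nxt, result)
        # Pair the popped edge (as the right edge) with every edge entering src.
        for prev, left in list(in_edges[src]):
            for result in binary_rules.get((left, label), ()):
                add(prev, dst, result)
    return seen


if __name__ == "__main__":
    # Plain reachability is the special case with a single rule E ::= E E.
    rules = {("E", "E"): {"E"}}
    closure = edge_pair_closure([("a", "b", "E"), ("b", "c", "E")], rules)
    print(sorted(closure))
```

In this sketch the edge set grows while it is being processed, which is the sense in which the closure is dynamic. A pointer or dataflow analysis would supply a richer rule table in place of the single reachability rule.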

We have implemented fully context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases written in different languages, such as Linux and Apache Hadoop, demonstrates that their Graspan implementations are language-independent, scale to millions of lines of code, and are much simpler than their original implementations. Moreover, we show that these analyses can be used to uncover many real-world bugs in large-scale systems code.

  158. Zhiqiang Zuo, John Thorpe, Yifei Wang, Qiuhong Pan, Shenming Lu, Kai Wang, Harry Xu, Linzhang Wang, and Xuandong Li. 2019. Grapple: A graph system for static finite-state property checking of large-scale systems code. In EuroSys. ACM.Google ScholarGoogle ScholarDigital LibraryDigital Library
  159. Zhiqiang Zuo, Yiyu Zhang, Qiuhong Pan, Shenming Lu, Yue Li, Linzhang Wang, Xuandong Li, and Guoqing Harry Xu. 2021. Chianina: An evolving graph system for flow- and context-sensitive analyses of million lines of C code. In PLDI. Association for Computing Machinery, New York, NY. DOI:DOI:http://doi.org/10.1145/3453483.3454085Google ScholarGoogle Scholar

          • Published in

            ACM Transactions on Computer Systems  Volume 38, Issue 1-2
            May 2020
            178 pages
            ISSN:0734-2071
            EISSN:1557-7333
            DOI:10.1145/3474395
            Copyright © 2021 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 29 July 2021
            • Accepted: 1 May 2021
            • Revised: 1 February 2021
            • Received: 1 August 2020
            Qualifiers

            • research-article
            • Research
            • Refereed
