Systemizing Interprocedural Static Analysis of Large-scale Systems Code with Graspan

Authors:
Zhiqiang Zuo, State Key Laboratory for Novel Software Technology, Nanjing University, China
Kai Wang, University of California, Los Angeles, USA
Aftab Hussain, University of California, Irvine, USA
Ardalan Amiri Sani, University of California, Irvine, USA
Yiyu Zhang, State Key Laboratory for Novel Software Technology, Nanjing University, China
Shenming Lu, State Key Laboratory for Novel Software Technology, Nanjing University, China
Wensheng Dou, University of Chinese Academy of Sciences and State Key Lab of Computer Sciences, Institute of Software, Chinese Academy of Sciences, China
Linzhang Wang, State Key Laboratory for Novel Software Technology, Nanjing University, China
Xuandong Li, State Key Laboratory for Novel Software Technology, Nanjing University, China
Chenxi Wang, University of California, Los Angeles, USA
Guoqing Harry Xu, University of California, Los Angeles, USA

ACM Transactions on Computer Systems, Volume 38, Issue 1-2, Article No. 4, pp. 1–39. https://doi.org/10.1145/3466820

Published: 29 July 2021

Abstract

There is more than a decade-long history of using static analysis to find bugs in systems such as Linux. Most of the existing static analyses developed for these systems are simple checkers that find bugs based on pattern matching. Despite the presence of many sophisticated interprocedural analyses, few of them have been employed to improve checkers for systems code due to their complex implementations and poor scalability.

In this article, we revisit the scalability problem of interprocedural static analysis from a “Big Data” perspective. That is, we turn sophisticated code analysis into Big Data analytics and leverage novel data processing techniques to solve this traditional programming language problem. We propose Graspan, a disk-based parallel graph system that uses an edge-pair centric computation model to compute dynamic transitive closures on very large program graphs. We develop two backends for Graspan, namely, Graspan-C running on CPUs and Graspan-G on GPUs, and present their designs in the article. Graspan-C can analyze large-scale systems code on any commodity PC, while, if GPUs are available, Graspan-G can be readily used to achieve orders of magnitude speedup by harnessing a GPU’s massive parallelism.
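To make the idea of computing a transitive closure by joining pairs of edges concrete, the following is a minimal in-memory sketch in Python. It is not Graspan's implementation: it ignores disk partitioning, parallelism, and the CPU/GPU backends, and the function name `edge_pair_closure` and the rule format (a dictionary mapping a pair of edge labels to the label of the derived edge) are assumptions made only for illustration.

```python
from collections import defaultdict


def edge_pair_closure(edges, rules):
    """Illustrative edge-pair closure (hypothetical helper, not Graspan's API).

    edges: iterable of (src, dst, label) tuples.
    rules: dict mapping (label1, label2) -> derived label; whenever
           u -label1-> v and v -label2-> w both exist, the edge
           u -derived-> w is added.
    Returns the set of all edges once no new edge can be derived.
    """
    closure = set(edges)
    out_edges = defaultdict(set)  # node -> {(dst, label)}
    in_edges = defaultdict(set)   # node -> {(src, label)}
    for src, dst, label in closure:
        out_edges[src].add((dst, label))
        in_edges[dst].add((src, label))

    worklist = list(closure)
    while worklist:
        src, dst, label = worklist.pop()
        derived = []
        # Current edge is the first edge of a pair: src -> dst -> nxt.
        for nxt, nxt_label in out_edges[dst]:
            new_label = rules.get((label, nxt_label))
            if new_label is not None:
                derived.append((src, nxt, new_label))
        # Current edge is the second edge of a pair: prev -> src -> dst.
        for prev, prev_label in in_edges[src]:
            new_label = rules.get((prev_label, label))
            if new_label is not None:
                derived.append((prev, dst, new_label))
        # Index the newly derived edges only after both scans finish,
        # so the adjacency sets are never modified while being iterated.
        for edge in derived:
            if edge not in closure:
                closure.add(edge)
                out_edges[edge[0]].add((edge[1], edge[2]))
                in_edges[edge[1]].add((edge[0], edge[2]))
                worklist.append(edge)
    return closure


if __name__ == "__main__":
    chain = [(1, 2, "E"), (2, 3, "E"), (3, 4, "E")]
    # Single rule E E -> E gives plain transitive closure of the chain.
    result = edge_pair_closure(chain, {("E", "E"): "E"})
    print(sorted(result))
```

With the single rule `("E", "E") -> "E"` the sketch reduces to ordinary transitive closure; supplying more label pairs lets the same loop derive edges for a richer set of relations. The closure is "dynamic" in the sense that each derived edge is fed back into the worklist and may enable further derivations.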

We have implemented fully context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases written in multiple languages, such as Linux and Apache Hadoop, demonstrates that their Graspan implementations are language-independent, scale to millions of lines of code, and are much simpler than their original implementations. Moreover, we show that these analyses can be used to uncover many real-world bugs in large-scale systems code.

Published in
ACM Transactions on Computer Systems Volume 38, Issue 1-2
May 2020
178 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/3474395
Editor:
Michael Swift
University of Wisconsin, USA
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 29 July 2021
- Accepted: 1 May 2021
- Revised: 1 February 2021
- Received: 1 August 2020
Author Tags
disk-based systems
graph processing
static analysis
Qualifiers
- research-article
- Research
- Refereed