Abstract
Static analysis has been used for more than a decade to find bugs in systems such as Linux. Most existing static analyses developed for these systems are simple checkers that find bugs by pattern matching. Although many sophisticated interprocedural analyses exist, few have been employed to improve checkers for systems code, owing to their complex implementations and poor scalability.
In this article, we revisit the scalability problem of interprocedural static analysis from a “Big Data” perspective. That is, we turn sophisticated code analysis into Big Data analytics and leverage novel data processing techniques to solve this traditional programming language problem. We propose Graspan, a disk-based parallel graph system that uses an edge-pair centric computation model to compute dynamic transitive closures on very large program graphs. We develop two backends for Graspan, namely Graspan-C, which runs on CPUs, and Graspan-G, which runs on GPUs, and present their designs in the article. Graspan-C can analyze large-scale systems code on any commodity PC; when GPUs are available, Graspan-G can be readily used to achieve orders-of-magnitude speedups by harnessing a GPU’s massive parallelism.
We have implemented fully context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases written in multiple languages, such as Linux and Apache Hadoop, demonstrates that their Graspan implementations are language-independent, scale to millions of lines of code, and are much simpler than their original implementations. Moreover, we show that these analyses can be used to uncover many real-world bugs in large-scale systems code.
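To make the idea of an edge-pair centric transitive closure concrete, the following is a minimal in-memory sketch in Python. It is not Graspan's implementation: it ignores disk-based partitioning, parallelism, and the GPU backend, and it assumes the analysis is given as a grammar whose productions have at most two symbols on the right-hand side. The function name `edge_pair_closure` and the `unary` and `binary` production tables are hypothetical names chosen for illustration. Whenever two consecutive edges carry labels that match a production, a new edge is added, and added edges can in turn trigger further additions until a fixed point is reached.

```python
from collections import defaultdict


def edge_pair_closure(edges, unary, binary):
    """Grammar-guided transitive closure over labeled edges (illustrative sketch).

    edges:  iterable of (src, dst, label) tuples
    unary:  dict mapping label A to a set of labels C, for productions C ::= A
    binary: dict mapping (A, B) to a set of labels C, for productions C ::= A B
    Returns the set of all original and derived edges.
    """
    result = set()
    out_edges = defaultdict(set)  # src -> {(dst, label)}
    in_edges = defaultdict(set)   # dst -> {(src, label)}
    worklist = list(edges)

    while worklist:
        edge = worklist.pop()
        if edge in result:
            continue  # already processed
        src, dst, label = edge
        result.add(edge)
        out_edges[src].add((dst, label))
        in_edges[dst].add((src, label))

        # Unary productions: C ::= label
        for c in unary.get(label, ()):
            worklist.append((src, dst, c))

        # New edge is the first of a pair: src -label-> dst -b-> w
        for w, b in list(out_edges[dst]):
            for c in binary.get((label, b), ()):
                worklist.append((src, w, c))

        # New edge is the second of a pair: p -a-> src -label-> dst
        for p, a in list(in_edges[src]):
            for c in binary.get((a, label), ()):
                worklist.append((p, dst, c))

    return result


# Plain reachability expressed as a grammar: T ::= e | T e
chain = [(1, 2, "e"), (2, 3, "e"), (3, 4, "e")]
closed = edge_pair_closure(
    chain,
    unary={"e": {"T"}},
    binary={("T", "e"): {"T"}},
)
reachable = sorted((s, d) for s, d, lab in closed if lab == "T")
print(reachable)
```

In this toy grammar, a `T` edge from `s` to `d` means that `d` is reachable from `s`. A pointer or dataflow analysis would, under the same assumptions, swap in a different set of productions while leaving the closure loop unchanged.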