Abstract
We give an efficient perfect sampling algorithm for weighted, connected induced subgraphs (or graphlets) of rooted, bounded-degree graphs. Our algorithm uses a vertex-percolation process with a carefully chosen rejection filter and works under a percolation subcriticality condition. We show that this condition is optimal in the following sense: when it does not hold, the task of (approximately) sampling weighted rooted graphlets becomes impossible in finite expected time for infinite graphs and intractable for finite graphs. We apply our sampling algorithm as a subroutine to give near-linear-time perfect sampling algorithms for polymer models and for weighted non-rooted graphlets in finite graphs, two widely studied yet very different problems. The new perfect sampling algorithm for polymer models gives improved sampling algorithms for spin systems at low temperatures on expander graphs and unbalanced bipartite graphs, among other applications.
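The abstract does not spell out the rejection filter, so the following is only a minimal sketch of how a percolation-plus-rejection sampler of this kind could look, not the paper's algorithm. It assumes a finite graph given as an adjacency dictionary, a target distribution in which a connected set S containing the root has weight proportional to λ^|S| with λ = p(1-p)^(Δ-2), and an acceptance probability built from the elementary bound that a connected set of k vertices in a graph of maximum degree Δ has at most (Δ-2)k + 2 boundary vertices. The names `sample_rooted_graphlet`, `adj`, `p`, and `max_degree` are hypothetical.

```python
import random
from collections import deque


def sample_rooted_graphlet(adj, root, p, max_degree, rng=random):
    """Sketch: sample a connected vertex set containing `root`.

    Assumptions (not taken from the source): the root is always included,
    every other explored vertex is open independently with probability p,
    and `max_degree` is a true upper bound on the degrees in `adj`.
    Under these assumptions the returned set S has probability
    proportional to (p * (1 - p) ** (max_degree - 2)) ** len(S).
    """
    while True:
        cluster = {root}        # open vertices connected to the root
        closed = set()          # revealed closed vertices (outer boundary)
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u in cluster or u in closed:
                    continue    # state of u was already revealed
                if rng.random() < p:
                    cluster.add(u)
                    queue.append(u)
                else:
                    closed.add(u)
        # The percolation cluster has probability proportional to
        # p**|S| * (1-p)**|boundary|. Accepting with (1-p)**slack
        # replaces the boundary exponent by (max_degree-2)*|S| + 2.
        slack = (max_degree - 2) * len(cluster) + 2 - len(closed)
        if rng.random() < (1 - p) ** slack:
            return cluster
```

The sketch takes p as given and does not check any subcriticality condition; on a finite graph it terminates with probability 1 for any p < 1, but the running-time guarantees described in the abstract are not something this code establishes.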