A maximal-clique-based set-covering approach to overlapping community detection

  • Original Paper
  • Optimization Letters

Abstract

An inherent limitation of many popular community detection methods, such as the walktrap and spin glass algorithms, is that they do not allow vertices to have membership in more than one community. Clique percolation remedies this limitation by allowing overlapping communities but does not necessarily produce solutions in accordance with the standard definition of ‘community’ (i.e., a dense subgraph of the network), often fails to assign all vertices to at least one community and presents formidable model selection challenges. In this paper, we propose a set-covering approach to overlapping community detection that enables overlapping communities to be assembled from maximal cliques, or from candidate communities formed from k-1 adjacent cliques. The promise of this new approach is demonstrated via comparison to clique percolation in a simulation experiment, as well as through application to an empirical psychological network.
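
As a rough illustration of the idea described in the abstract, the Python sketch below enumerates the maximal cliques of a small graph with a basic Bron-Kerbosch recursion and then selects a subset of them that covers every vertex. The greedy selection rule, the function names (`maximal_cliques`, `greedy_clique_cover`) and the toy graph are assumptions made for illustration only; they are not the authors' formulation and not their MATLAB implementation.

```python
from typing import Dict, FrozenSet, List, Set

Graph = Dict[int, Set[int]]  # adjacency sets of an undirected graph


def maximal_cliques(adj: Graph) -> List[FrozenSet[int]]:
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    found: List[FrozenSet[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        if not p and not x:
            found.append(frozenset(r))  # r cannot be extended, so it is maximal
            return
        for v in sorted(p):  # iterate over a snapshot; p is rebound below
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later branches

    expand(set(), set(adj), set())
    return found


def greedy_clique_cover(adj: Graph) -> List[FrozenSet[int]]:
    """Hypothetical greedy heuristic (an assumption, not the paper's model):
    repeatedly pick the maximal clique that covers the most uncovered vertices."""
    cliques = sorted(maximal_cliques(adj), key=sorted)  # deterministic tie-breaking
    uncovered = set(adj)
    chosen: List[FrozenSet[int]] = []
    while uncovered:
        best = max(cliques, key=lambda c: len(c & uncovered))
        chosen.append(best)
        uncovered -= best
    return chosen


# Toy graph: two triangles sharing vertex 3, plus a pendant edge 5-6.
toy: Graph = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4, 5},
    4: {3, 5},
    5: {3, 4, 6},
    6: {5},
}

cover = greedy_clique_cover(toy)
print(sorted(sorted(c) for c in cover))
```

In this toy graph every vertex ends up in at least one selected clique, and vertices 3 and 5 each belong to two of them, which is the kind of overlap that a partitioning method cannot represent.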

Data Availability

MATLAB code for implementing the proposed method and replicating the simulation experiment is available from the Figshare repository at: https://figshare.com/articles/software/MATLAB_OCD_Files/21647615. The k-clique.m program for clique percolation is available from the MATLAB file exchange at: https://www.mathworks.com/matlabcentral/fileexchange/34202-k-clique-algorithm. The maximalCliques.m program for the Bron-Kerbosch algorithm is available from the MATLAB file exchange at: https://www.mathworks.com/matlabcentral/fileexchange/30413-bron-kerbosch-maximal-clique-finding-algorithm.

Notes

  1. A clique is a complete subgraph of vertices. That is, a subset of vertices is a clique if there are edges connecting all pairs of vertices in the subset.

  2. The density of a graph (or subgraph) is the number of edges in the graph divided by the number of possible edges. If there are n vertices in the graph (or subgraph), then (n(n-1))/2 is the number of possible edges.

  3. Two cliques, \({C}_{l}^{1}\) and \({C}_{k}^{2}\), where l > k, are also said to be k-adjacent if they share k-α common vertices.

  4. As noted in [22], some of the edge weights in a k-clique might be less than the intensity threshold, yet the k-clique is still retained if its intensity exceeds I.

  5. We assume a connected graph, which is almost always the case in our application domain. If there are isolates in the network, then we keep the largest connected component of the graph.
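
The definitions in Notes 1, 2, 4 and 5 can be sketched in a few lines of Python. All function names and the toy data are hypothetical. The `intensity` function assumes that the intensity of a k-clique is the geometric mean of its edge weights, which is the usual definition in weighted clique percolation but is not stated in the notes above. Returning a density of 0 for fewer than two vertices is likewise a convention chosen here.

```python
from itertools import combinations
from math import prod
from typing import Dict, FrozenSet, Iterable, Set

Graph = Dict[int, Set[int]]  # adjacency sets of an undirected graph


def is_clique(adj: Graph, nodes: Iterable[int]) -> bool:
    """Note 1: every pair of vertices in the subset is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(list(nodes), 2))


def density(adj: Graph, nodes: Iterable[int]) -> float:
    """Note 2: edges present divided by the n(n-1)/2 possible edges."""
    members = sorted(set(nodes))
    n = len(members)
    if n < 2:
        return 0.0  # assumed convention: no possible edges
    edges = sum(1 for u, v in combinations(members, 2) if v in adj[u])
    return edges / (n * (n - 1) / 2)


def intensity(weights: Dict[FrozenSet[int], float], nodes: Iterable[int]) -> float:
    """Note 4 (assumed definition): geometric mean of the clique's edge weights."""
    ws = [weights[frozenset(pair)] for pair in combinations(list(nodes), 2)]
    return prod(ws) ** (1 / len(ws))


def largest_component(adj: Graph) -> Set[int]:
    """Note 5: vertex set of the largest connected component."""
    seen: Set[int] = set()
    best: Set[int] = set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:  # depth-first search from start
            u = stack.pop()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    stack.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


# Toy data: a triangle {1, 2, 3}, a pendant vertex 4, and an isolate 5.
toy: Graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}, 5: set()}
w = {frozenset({1, 2}): 0.9, frozenset({1, 3}): 0.8, frozenset({2, 3}): 0.1}

print(is_clique(toy, [1, 2, 3]), is_clique(toy, [1, 2, 3, 4]))
print(density(toy, [1, 2, 3, 4]))
print(intensity(w, [1, 2, 3]))
print(largest_component(toy))
```

With a hypothetical intensity threshold of I = 0.3, the triangle's weakest edge (0.1) falls below the threshold while its geometric-mean intensity stays above it, which is the situation Note 4 describes.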

References

  1. Cramer, A.O., Waldorp, L.J., van der Maas, H.L., Borsboom, D.: Complex realities require complex theories: refining and extending the network approach to mental disorders. Behav. Brain Sci. 33, 178–193 (2010)

  2. Brusco, M.J., Steinley, D., Watts, A.L.: A comparison of logistic regression methods for Ising model estimation. Behav. Res. Meth. https://doi.org/10.3758/s13428-022-01976-4 (2022)

  3. van Borkulo, C.D., Borsboom, D., Epskamp, S., Blanken, T.F., Boschloo, L., Schoevers, R.A., Waldorp, L.J.: A new method for constructing networks from binary data. Sci. Rep. 4, 1–8 (2014)

  4. Williams, D.R., Rast, P.: Back to the basics: rethinking partial correlation network methodology. Brit. J. Math. Stat. Psych. 73, 187–212 (2020)

  5. Friedman, J.H., Hastie, T., Tibshirani, R.: Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–441 (2008)

  6. Brusco, M.J., Steinley, D., Watts, A.L.: On maximization of the modularity index in network psychometrics. Behav. Res. Meth. https://doi.org/10.3758/s13428-022-01975-5 (2022)

  7. Epskamp, S., Fried, E.I.: A tutorial on regularized partial correlation networks. Psych. Meth. 23, 617–634 (2018)

  8. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004)

  9. Aloise, D., Cafieri, S., Caporossi, G., Hansen, P., Perron, S., Liberti, L.: Column generation algorithms for exact modularity maximization in networks. Phys. Rev. E 82, 046112 (2010)

  10. Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008. https://doi.org/10.1088/1742-5468/2008/10/P10008 (2008)

  11. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99, 7821–7826 (2002)

  12. Miyauchi, A., Sukegawa, N.: Redundant constraints in the standard formulation for the clique partitioning problem. Optim. Lett. 9, 199–207 (2015)

  13. Miyauchi, A., Sukegawa, N.: Maximizing Barber’s bipartite modularity is also hard. Optim. Lett. 9, 897–913 (2015)

  14. Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys. Rev. E 69, 066133 (2004)

  15. Newman, M.E.J.: Analysis of weighted networks. Phys. Rev. E 70, 056131 (2004)

  16. Bhasker, J., Samad, T.: The clique partitioning problem. Comp. Math. Appl. 22, 1–11 (1991)

  17. Pons, P., Latapy, M.: Computing communities in large networks using random walks. J. Graph Alg. Appl. 10, 191–218 (2006)

  18. Reichardt, J., Bornholdt, S.: Statistical mechanics of community detection. Phys. Rev. E 74, 016110. https://doi.org/10.1103/PhysRevE.74.016110 (2006)

  19. Gates, K.M., Henry, T., Steinley, D., Fair, D.A.: A Monte Carlo evaluation of weighted community detection algorithms. Front. Neuroinformatics 10, 45 (2016). https://doi.org/10.3389/fninf.2016.00045

  20. Hoffman, M., Steinley, D., Gates, K.M., Prinstein, M.J., Brusco, M.J.: Detecting clusters/communities in social networks. Mult. Behav. Res. 53, 57–73 (2018)

  21. Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005)

  22. Farkas, I.J., Ábel, D., Palla, G., Vicsek, T.: Weighted network modules. New J. Phys. 9, 1–18 (2007)

  23. Fortunato, S.: Community detection in graphs. Phys. Rep. 486, 75–174 (2010)

  24. Toregas, C., ReVelle, C.S.: Optimal location under time or distance constraints. Papers Reg. Sci. Assoc. 28, 133–144 (1972)

  25. Toregas, C., Swain, R., ReVelle, C.S., Bergman, L.: The location of emergency service facilities. Oper. Res. 19, 1363–1373 (1971)

  26. Adamcsek, B., Palla, G., Farkas, I.J., Derényi, I., Vicsek, T.: CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics 22, 1021–1023 (2006)

  27. Harary, F., Ross, I.C.: A procedure for clique detection using the group matrix. Sociometry 20, 205–215 (1957)

  28. Bomze, I.M., Budinich, M., Pardalos, P.M., Pelillo, M.: The maximum clique problem. In: Du, D.-Z., Pardalos, P.M. (eds.) Handbook of Combinatorial Optimization, vol. 4, pp. 1–74. Kluwer, Boston (1999)

  29. Pardalos, P.M., Xue, J.: The maximum clique problem. J. Glob. Opt. 4, 301–328 (1994)

  30. Vogiatzis, C., Veremyev, A., Pasiliao, E.L., Pardalos, P.M.: An integer programming approach for finding the most and the least central cliques. Optim. Lett. 9, 615–633 (2015)

  31. Wildman, J.: Bron-Kerbosch maximal clique finding algorithm. MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/30413-bron-kerbosch-maximal-clique-finding-algorithm. Retrieved April 10, 2023

  32. Bron, C., Kerbosch, J.: Algorithm 457: finding all cliques of an undirected graph. Comm. ACM 16, 575–577 (1973)

  33. Nguyen, A.-D.: k-clique algorithm. MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/34202-k-clique-algorithm. Retrieved November 22, 2022

  34. MATLAB: version 9.8.0 (R2020a). The MathWorks Inc., Natick, Massachusetts (2020)

  35. Lancichinetti, A., Fortunato, S.: Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys. Rev. E 80, 016118 (2009)

  36. Collins, L.M., Dent, C.W.: Omega: a general formulation of the Rand index of cluster recovery suitable for non-disjoint solutions. Mult. Behav. Res. 23, 231–242 (1988)

  37. Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2, 191–212 (1985)

  38. Grant, B.F., Goldstein, R.B., Saha, T.D., Chou, S.P., Jung, J., Zhang, H., Pickering, R.P., Ruan, W.J., Smith, S.M., Huang, B., Hasin, D.S.: Epidemiology of DSM-5 alcohol use disorder: results from the national epidemiologic survey on alcohol and related conditions III. JAMA Psych. 72, 757–766 (2015)

  39. Csardi, G., Nepusz, T.: The igraph software package for complex network research. Inter. J. Complex Sys. 1695, 1–9 (2006)

  40. Epskamp, S., Cramer, A.O., Waldorp, L.J., Schmittmann, V.D., Borsboom, D.: qgraph: network visualizations of relationships in psychometric data. J. Stat. Soft. 48, 1–18 (2012)

  41. Lange, J.: R package “CliquePercolation”: clique percolation for networks. (https://cran.r-project.org/web/packages/CliquePercolation/CliquePercolation.pdf). Retrieved 4/12/2023 (2013)

Author information

Corresponding author

Correspondence to Michael J. Brusco.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Brusco, M.J., Steinley, D. & Watts, A.L. A maximal-clique-based set-covering approach to overlapping community detection. Optim Lett (2023). https://doi.org/10.1007/s11590-023-02054-0

  • DOI: https://doi.org/10.1007/s11590-023-02054-0
