Abstract
An inherent limitation of many popular community detection methods, such as the walktrap and spin glass algorithms, is that they do not allow vertices to have membership in more than one community. Clique percolation remedies this limitation by allowing overlapping communities, but it does not necessarily produce solutions in accordance with the standard definition of ‘community’ (i.e., a dense subgraph of the network), often fails to assign all vertices to at least one community, and presents formidable model selection challenges. In this paper, we propose a set-covering approach to overlapping community detection that enables overlapping communities to be assembled from maximal cliques, or from candidate communities formed from k-1 adjacent cliques. The promise of this new approach is demonstrated via comparison to clique percolation in a simulation experiment, as well as through application to an empirical psychological network.
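To make the set-covering idea concrete, the following is a minimal Python sketch, not the formulation used in the paper. It assumes the candidate communities (for example, maximal cliques) are already available as vertex sets, and it swaps in a simple greedy heuristic for whatever optimization model the authors solve. The function name `greedy_cover` and the toy data are hypothetical.

```python
def greedy_cover(vertices, candidates):
    """Greedy set-cover heuristic (an illustrative stand-in, not the paper's model).

    vertices:   iterable of vertex labels that should each be covered.
    candidates: list of sets, each a candidate community (e.g., a maximal clique).
    Returns the chosen candidates and any vertices no candidate could cover.
    """
    uncovered = set(vertices)
    chosen = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered vertices.
        best = max(candidates, key=lambda c: len(uncovered & c))
        if not uncovered & best:
            break  # the remaining vertices appear in no candidate
        chosen.append(best)
        uncovered -= best
    return chosen, uncovered


# Hypothetical toy input: hand-specified candidate communities on 7 vertices.
toy_vertices = range(1, 8)
toy_candidates = [{1, 2, 3, 4}, {4, 5, 6}, {6, 7}, {2, 3}]
communities, leftover = greedy_cover(toy_vertices, toy_candidates)
```

Because a vertex may belong to several chosen candidates (vertices 4 and 6 in the toy input), the selected cover is naturally an overlapping community structure, and every vertex that appears in some candidate is assigned to at least one community.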
Data Availability
MATLAB code for implementing the proposed method and replicating the simulation experiment is available from the Figshare repository at: https://figshare.com/articles/software/MATLAB_OCD_Files/21647615. The k-clique.m program for clique percolation is available from the MATLAB Central File Exchange at: https://www.mathworks.com/matlabcentral/fileexchange/34202-k-clique-algorithm. The maximalCliques.m program for the Bron-Kerbosch algorithm is available from the MATLAB Central File Exchange at: https://www.mathworks.com/matlabcentral/fileexchange/30413-bron-kerbosch-maximal-clique-finding-algorithm.
Notes
A clique is a complete subgraph of vertices. That is, a subset of vertices is a clique if there are edges connecting all pairs of vertices in the subset.
The density of a graph (or subgraph) is the number of edges in the graph divided by the number of possible edges. If there are \(n\) vertices in the graph (or subgraph), then \(n(n-1)/2\) is the number of possible edges.
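The clique and density definitions can be checked with a short Python sketch. It assumes an undirected graph given as a list of vertex pairs; the function names and the toy edge list are hypothetical, and returning 0.0 for a subgraph with fewer than two vertices is a convention chosen here, not one stated in the text.

```python
from itertools import combinations


def count_edges(vertices, edges):
    """Count undirected edges whose two endpoints both lie in `vertices`."""
    edge_set = {frozenset(e) for e in edges}
    pairs = combinations(sorted(set(vertices)), 2)
    return sum(1 for pair in pairs if frozenset(pair) in edge_set)


def density(vertices, edges):
    """Edges present divided by the n(n-1)/2 possible edges."""
    n = len(set(vertices))
    possible = n * (n - 1) // 2
    if possible == 0:
        return 0.0  # assumed convention for fewer than two vertices
    return count_edges(vertices, edges) / possible


def is_clique(vertices, edges):
    """A vertex subset is a clique when every pair of its vertices is joined."""
    n = len(set(vertices))
    return count_edges(vertices, edges) == n * (n - 1) // 2


# Hypothetical toy graph: a triangle on 1, 2, 3 plus a pendant vertex 4.
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
triangle_is_clique = is_clique([1, 2, 3], toy_edges)
whole_graph_density = density([1, 2, 3, 4], toy_edges)
```

Under these definitions a clique is exactly a subgraph of density 1, which is why cliques are natural building blocks for dense communities.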
Two cliques, \({C}_{l}^{1}\) and \({C}_{k}^{2}\), where \(l > k\), are also said to be \(k-\alpha\) adjacent if they share \(k-\alpha\) common vertices.
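As an illustration of this adjacency notion, the sketch below computes the value of α for a pair of cliques: it takes k as the size of the smaller clique and returns k minus the number of shared vertices. The function names are hypothetical, and allowing cliques of equal size is an assumption made only so that the helper is defined for any pair.

```python
def shared_vertices(clique_a, clique_b):
    """Vertices that belong to both cliques."""
    return set(clique_a) & set(clique_b)


def adjacency_alpha(clique_a, clique_b):
    """Return alpha such that the cliques share k - alpha vertices,
    where k is the size of the smaller clique."""
    k = min(len(set(clique_a)), len(set(clique_b)))
    return k - len(shared_vertices(clique_a, clique_b))


# Hypothetical example: a 4-clique and a 3-clique sharing vertices 2 and 3.
alpha = adjacency_alpha({1, 2, 3, 4}, {2, 3, 5})
```

In this example k = 3 and the cliques share two vertices, so α = 1 and the pair is k-1 adjacent.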
As noted in [22], some of the edge weights in a k-clique might be less than the intensity threshold, yet the k-clique is still retained if its intensity exceeds I.
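The point of this note can be seen in a small numeric sketch. It assumes that the intensity of a k-clique is the geometric mean of its edge weights and that all weights are positive; neither assumption is stated in the note itself. The function name, the weight dictionary layout, and the threshold value are hypothetical.

```python
import math
from itertools import combinations


def clique_intensity(clique, weights):
    """Geometric mean of the edge weights inside `clique` (assumed definition).

    weights: dict mapping frozenset({u, v}) to a positive edge weight.
    The clique must contain at least two vertices.
    """
    pairs = [frozenset(p) for p in combinations(sorted(set(clique)), 2)]
    # Average the logs and exponentiate, which equals the geometric mean.
    log_sum = sum(math.log(weights[p]) for p in pairs)
    return math.exp(log_sum / len(pairs))


# Hypothetical triangle with one weak edge and an assumed threshold I = 0.3.
toy_weights = {
    frozenset({1, 2}): 0.9,
    frozenset({1, 3}): 0.8,
    frozenset({2, 3}): 0.1,
}
threshold = 0.3
intensity = clique_intensity({1, 2, 3}, toy_weights)
retained = intensity > threshold
```

Here the edge between vertices 2 and 3 has weight 0.1, which is below the threshold, yet the clique as a whole has an intensity of roughly 0.42 and would still be retained.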
We assume a connected graph, which is almost always the case in our application domain. If there are isolates in the network, then we keep the largest connected component of the graph.
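A minimal Python sketch of this preprocessing step is given below, assuming an undirected graph supplied as a vertex list and an edge list. It uses a breadth-first search to find the connected components and keeps the largest one; the function name and the toy data are hypothetical, and ties between equally large components are broken in favor of the one found first.

```python
from collections import deque


def largest_component(vertices, edges):
    """Return the vertex set of the largest connected component."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    best = set()
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search from an unvisited vertex collects one component.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        if len(component) > len(best):
            best = component
    return best


# Hypothetical toy network: a path 1-2-3, a pair 4-5, and the isolate 6.
toy_vertices = [1, 2, 3, 4, 5, 6]
toy_edges = [(1, 2), (2, 3), (4, 5)]
kept = largest_component(toy_vertices, toy_edges)
```

In the toy network, the isolate 6 and the small pair {4, 5} are dropped, leaving the component {1, 2, 3} for community detection.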
References
Cramer, A.O., Waldorp, L.J., van der Maas, H.L., Borsboom, D.: Complex realities require complex theories: refining and extending the network approach to mental disorders. Behav. Brain Sci. 33, 178–193 (2010)
Brusco, M.J., Steinley, D., Watts, A.L.: A comparison of logistic regression methods for Ising model estimation. Behav. Res. Meth. https://doi.org/10.3758/s13428-022-01976-4 (2022)
van Borkulo, C.D., Borsboom, D., Epskamp, S., Blanken, T.F., Boschloo, L., Schoevers, R.A., Waldorp, L.J.: A new method for constructing networks from binary data. Sci. Rep. 4, 1–8 (2014)
Williams, D.R., Rast, P.: Back to the basics: rethinking partial correlation network methodology. Brit. J. Math. Stat. Psych. 73, 187–212 (2020)
Friedman, J.H., Hastie, T., Tibshirani, R.: Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–441 (2008)
Brusco, M.J., Steinley, D., Watts, A.L.: On maximization of the modularity index in network psychometrics. Behav. Res. Meth. https://doi.org/10.3758/s13428-022-01975-5 (2022)
Epskamp, S., Fried, E.I.: A tutorial on regularized partial correlation networks. Psych. Meth. 23, 617–634 (2018)
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004)
Aloise, D., Cafieri, S., Caporossi, G., Hansen, P., Perron, S., Liberti, L.: Column generation algorithms for exact modularity maximization in networks. Phys. Rev. E 82, 046112 (2010)
Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008. https://doi.org/10.1088/1742-5468/2008/10/P10008 (2008)
Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99, 7821–7826 (2002)
Miyauchi, A., Sukegawa, N.: Redundant constraints in the standard formulation for the clique partitioning problem. Optim. Lett. 9, 199–207 (2015)
Miyauchi, A., Sukegawa, N.: Maximizing Barber’s bipartite modularity is also hard. Optim. Lett. 9, 897–913 (2015)
Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys. Rev. E 69, 066133 (2004)
Newman, M.E.J.: Analysis of weighted networks. Phys. Rev. E 70, 056131 (2004)
Bhasker, J., Samad, T.: The clique partitioning problem. Comp. Math. Appl. 22, 1–11 (1991)
Pons, P., Latapy, M.: Computing communities in large networks using random walks. J. Graph Alg. Appl. 10, 191–218 (2006)
Reichardt, J., Bornholdt, S.: Statistical mechanics of community detection. Phys. Rev. E 74, 016110. https://doi.org/10.1103/PhysRevE.74.016110 (2006)
Gates, K.M., Henry, T., Steinley, D., Fair, D.A.: A Monte Carlo evaluation of weighted community detection algorithms. Front. Neuroinformatics 10, 45 (2016). https://doi.org/10.3389/fninf.2016.00045
Hoffman, M., Steinley, D., Gates, K.M., Prinstein, M.J., Brusco, M.J.: Detecting clusters/communities in social networks. Mult. Behav. Res. 53, 57–73 (2018)
Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005)
Farkas, I.J., Ábel, D., Palla, G., Vicsek, T.: Weighted network modules. New J. Phys. 9, 1–18 (2007)
Fortunato, S.: Community detection in graphs. Phys. Rep. 486, 75–174 (2010)
Toregas, C., ReVelle, C.S.: Optimal location under time or distance constraints. Papers Reg. Sci. Assoc. 28, 133–144 (1972)
Toregas, C., Swain, R., ReVelle, C.S., Bergman, L.: The location of emergency service facilities. Oper. Res. 19, 1363–1373 (1971)
Adamcsek, B., Palla, G., Farkas, I.J., Derényi, I., Vicsek, T.: CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics 22, 1021–1023 (2006)
Harary, F., Ross, I.C.: A procedure for clique detection using the group matrix. Sociometry 20, 205–215 (1957)
Bomze, I.M., Budinich, M., Pardalos, P.M., Pelillo, M.: The maximum clique problem. In: Du, D.-Z., Pardalos, P.M. (eds.) Handbook of Combinatorial Optimization, vol. 4, pp. 1–74. Kluwer, Boston (1999)
Pardalos, P.M., Xue, J.: The maximum clique problem. J. Glob. Opt. 4, 301–328 (1994)
Vogiatzis, C., Veremyev, A., Pasiliao, E.L., Pardalos, P.M.: An integer programming approach for finding the most and the least central cliques. Optim. Lett. 9, 615–633 (2015)
Wildman, J.: Bron-Kerbosch maximal clique finding algorithm, (https://www.mathworks.com/matlabcentral/fileexchange/30413-bron-kerbosch-maximal-clique-finding-algorithm), MATLAB Central File Exchange. Retrieved April 10, (2023)
Bron, C., Kerbosch, J.: Algorithm 457: finding all cliques of an undirected graph. Comm. ACM 16, 575–577 (1973)
Nguyen, A.-D.: k-clique algorithm, (https://www.mathworks.com/matlabcentral/fileexchange/34202-k-clique-algorithm), MATLAB Central File Exchange. Retrieved November 22, (2022)
MATLAB.: version 9.8.0 (R2020a). Natick, Massachusetts: The MathWorks Inc. (2020)
Lancichinetti, A., Fortunato, S.: Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys. Rev. E 80, 016118 (2009)
Collins, L.M., Dent, C.W.: Omega: a general formulation of the Rand index of cluster recovery suitable for non-disjoint solutions. Mult. Behav. Res. 23, 231–242 (1988)
Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2, 191–212 (1985)
Grant, B.F., Goldstein, R.B., Saha, T.D., Chou, S.P., Jung, J., Zhang, H., Pickering, R.P., Ruan, W.J., Smith, S.M., Huang, B., Hasin, D.S.: Epidemiology of DSM-5 alcohol use disorder: results from the national epidemiologic survey on alcohol and related conditions III. JAMA Psych. 72, 757–766 (2015)
Csardi, G., Nepusz, T.: The igraph software package for complex network research. Inter. J. Complex Sys. 1695, 1–9 (2006)
Epskamp, S., Cramer, A.O., Waldorp, L.J., Schmittmann, V.D., Borsboom, D.: qgraph: network visualizations of relationships in psychometric data. J. Stat. Soft. 48, 1–18 (2012)
Lange, J.: R package “CliquePercolation”: clique percolation for networks. (https://cran.r-project.org/web/packages/CliquePercolation/CliquePercolation.pdf). Retrieved 4/12/2023 (2013)
Ethics declarations
Conflict of interest
None.
About this article
Cite this article
Brusco, M.J., Steinley, D. & Watts, A.L. A maximal-clique-based set-covering approach to overlapping community detection. Optim Lett (2023). https://doi.org/10.1007/s11590-023-02054-0
DOI: https://doi.org/10.1007/s11590-023-02054-0