Clustering ensemble extraction: a knowledge reuse framework

  • Regular Article
  • Advances in Data Analysis and Classification

Abstract

A clustering ensemble combines several base clusterings through a consensus function to produce a final clustering without access to the data features. The performance of the consensus function depends on the quality and diversity of a large library of base clusterings. When such a large and diverse library is not available, the consensus function produces results of lower quality than the base clusterings themselves. Expanding the set of diverse clusters in the library to improve consensus performance, especially when neither specific data features nor assumptions about the data distribution are available, remains an open problem. The approach proposed in this paper, Clustering Ensemble Extraction, applies a similarity criterion at the cluster level and places the most similar clusters in the same group. It then extracts new clusters with the Extracting Clusters Algorithm. Finally, two new consensus functions, the Cluster-based extracted partitioning algorithm and the Meta-cluster extracted algorithm, are defined and applied to the new clusters to create a high-quality clustering. Empirical experiments show that the new consensus function obtained by the proposed method outperforms previously proposed methods in clustering quality and efficiency.
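
To make the idea of working at the cluster level concrete, the following is a minimal sketch of a generic meta-cluster style consensus that uses only cluster labels, never the data features. It is not the paper's Extracting Clusters Algorithm or either of its consensus functions, whose details are not given in the abstract. The choice of Jaccard similarity, the threshold value, the single-link grouping, and all function names are assumptions made for illustration.

```python
from itertools import combinations


def clusters_of(labels):
    """Turn one base clustering (a list of labels) into sets of object indices."""
    out = {}
    for i, lab in enumerate(labels):
        out.setdefault(lab, set()).add(i)
    return list(out.values())


def jaccard(a, b):
    """Jaccard similarity between two non-empty clusters (assumed similarity criterion)."""
    return len(a & b) / len(a | b)


def group_clusters(clusters, threshold):
    """Single-link grouping: clusters with similarity >= threshold share a group."""
    parent = list(range(len(clusters)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(clusters)), 2):
        if jaccard(clusters[i], clusters[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(clusters)):
        groups.setdefault(find(i), []).append(clusters[i])
    return list(groups.values())


def consensus(base_clusterings, threshold=0.5):
    """Assign each object to the group whose member clusters contain it most often."""
    n = len(base_clusterings[0])
    all_clusters = [c for labels in base_clusterings for c in clusters_of(labels)]
    groups = group_clusters(all_clusters, threshold)
    result = []
    for obj in range(n):
        # Fraction of clusters in each group that contain this object.
        votes = [sum(obj in c for c in g) / len(g) for g in groups]
        result.append(votes.index(max(votes)))
    return result


# Three hypothetical base clusterings of six objects; label names are arbitrary.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
final = consensus(base)
```

In this toy input the second clustering uses swapped label names and the third disagrees on one object. Grouping similar clusters first sidesteps the label-correspondence problem, and the per-object vote then resolves the disagreement.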

Notes

  1. http://www.ics.uci.com/mlearn/MLRespository.html.

Author information

Corresponding author

Correspondence to Ebrahim Akbari.

Cite this article

Sedghi, M., Akbari, E., Motameni, H. et al. Clustering ensemble extraction: a knowledge reuse framework. Adv Data Anal Classif (2024). https://doi.org/10.1007/s11634-024-00588-4

  • DOI: https://doi.org/10.1007/s11634-024-00588-4
