What’s in a distance? Exploring the interplay between distance measures and internal cluster validity in multi-objective clustering

Natural Computing

Abstract

The problem of cluster analysis eludes a unique mathematical definition. Instead, a variety of different instantiations of the problem can be defined using specific measures of internal cluster validity. In turn, such internal cluster validity measures rely on quantifying dissimilarity between entities. This article explores the interaction between dissimilarity measures and internal cluster validity techniques in the context of multi-objective clustering. It does so by contrasting two conceptually different approaches to multi-objective clustering: the multi-criterion clustering algorithm \(\Delta\)-MOCK, designed to optimise different measures of internal cluster validity over a single dissimilarity space, and the multi-view clustering algorithm MVMC, designed to optimise a single measure of internal cluster validity over distinct dissimilarity spaces. Our comparison highlights the interchangeable roles of distance functions and measures of internal cluster validity, which paves the way for the future design of a flexible, dual-purpose approach to multi-objective clustering.
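The contrast drawn above can be illustrated with a minimal sketch. It is not an implementation of \(\Delta\)-MOCK or MVMC: the function names, the toy data, and the choice of criteria (mean within-cluster dissimilarity and a nearest-neighbour connectivity penalty) and distances (Euclidean and cosine) are all assumptions made for illustration. The sketch only shows how the same candidate partition yields an objective vector either by varying the validity criterion over one dissimilarity space or by varying the dissimilarity space under one criterion.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # Assumes non-zero vectors; a zero vector would cause a division by zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def within_cluster_dissimilarity(points, labels, dist):
    """Mean dissimilarity over pairs sharing a cluster (lower is better)."""
    pairs = [(i, j) for i, j in combinations(range(len(points)), 2)
             if labels[i] == labels[j]]
    if not pairs:
        return 0.0
    return sum(dist(points[i], points[j]) for i, j in pairs) / len(pairs)

def connectivity(points, labels, dist, k=1):
    """Penalty of 1/rank for each of a point's k nearest neighbours
    that lies in a different cluster (lower is better)."""
    penalty = 0.0
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        for rank, j in enumerate(others[:k], start=1):
            if labels[j] != labels[i]:
                penalty += 1.0 / rank
    return penalty

def multi_criterion_objectives(points, labels):
    # Different validity criteria, one dissimilarity space (Euclidean).
    return (within_cluster_dissimilarity(points, labels, euclidean),
            connectivity(points, labels, euclidean))

def multi_view_objectives(points, labels):
    # One validity criterion, different dissimilarity spaces.
    return tuple(within_cluster_dissimilarity(points, labels, d)
                 for d in (euclidean, cosine_distance))

# Hypothetical toy data: two groups of points lying along two directions.
points = [(1.0, 0.1), (2.0, 0.2), (0.1, 1.0), (0.2, 2.0)]
labels_a = [0, 0, 1, 1]  # groups points by direction
labels_b = [0, 1, 0, 1]  # mixes the two directions

for name, labels in (("a", labels_a), ("b", labels_b)):
    print(name, multi_criterion_objectives(points, labels),
          multi_view_objectives(points, labels))
```

In both settings a multi-objective optimiser would receive a vector of objectives per candidate partition; only the source of the second objective differs, which is the interchangeability the abstract refers to.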

Author information

Corresponding author

Correspondence to Adán José-García.

About this article

Cite this article

José-García, A., Handl, J. What’s in a distance? Exploring the interplay between distance measures and internal cluster validity in multi-objective clustering. Nat Comput 22, 259–270 (2023). https://doi.org/10.1007/s11047-022-09909-y

  • DOI: https://doi.org/10.1007/s11047-022-09909-y
