Clustering ensemble extraction: a knowledge reuse framework
Advances in Data Analysis and Classification (IF 1.6), Pub Date: 2024-03-27, DOI: 10.1007/s11634-024-00588-4
Mohaddeseh Sedghi, Ebrahim Akbari, Homayun Motameni, Touraj Banirostam

Clustering ensemble combines several base clusterings through a consensus function to produce a final clustering without access to the data features. The quality and diversity of the library of base clusterings influence the performance of the consensus function. When a large library of diverse clusterings is not available, the consensus function produces results of lower quality than those of the base clusterings. Expanding the set of diverse clusters in the library to improve consensus performance, especially when there is no access to specific data features or assumptions about the data distribution, remains an open problem. The approach proposed in this paper, Clustering Ensemble Extraction, applies a similarity criterion at the cluster level and places the most similar clusters in the same group. It then extracts new clusters with the Extracting Clusters Algorithm. Finally, two new consensus functions, the Cluster-based extracted partitioning algorithm and the Meta-cluster extracted algorithm, are defined and applied to the new clusters to create a high-quality clustering. The empirical experiments conducted in this study showed that the new consensus function obtained by our proposed method outperformed methods previously proposed in the literature in both clustering quality and efficiency.
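
The first step described in the abstract, comparing clusters across base clusterings and grouping the most similar ones, can be illustrated with a minimal sketch. This is not the paper's Extracting Clusters Algorithm or either of its consensus functions. The Jaccard similarity, the fixed threshold, and the function names `clusters_from_labels`, `jaccard`, and `group_similar_clusters` are all assumptions chosen for illustration. Like the setting in the abstract, the sketch uses only cluster memberships and never the data features.

```python
from itertools import combinations


def clusters_from_labels(labelings):
    """Turn each base clustering (a list of labels) into sets of object indices."""
    clusters = []
    for labels in labelings:
        members = {}
        for idx, label in enumerate(labels):
            members.setdefault(label, set()).add(idx)
        clusters.extend(frozenset(m) for m in members.values())
    return clusters


def jaccard(a, b):
    """Assumed cluster-level similarity: shared objects over all objects."""
    return len(a & b) / len(a | b)


def group_similar_clusters(clusters, threshold=0.5):
    """Group clusters linked by similarity >= threshold (union-find).

    The threshold is a hypothetical parameter, not taken from the paper.
    """
    parent = list(range(len(clusters)))

    def find(i):
        # Follow parent pointers to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(clusters)), 2):
        if jaccard(clusters[i], clusters[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, cluster in enumerate(clusters):
        groups.setdefault(find(i), []).append(cluster)
    return list(groups.values())


# Three base clusterings of six objects; label names differ between them.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
groups = group_similar_clusters(clusters_from_labels(base))
for group in groups:
    print([sorted(c) for c in group])
```

Because the comparison is made on sets of object indices, it does not depend on the label names each base clustering happens to use, which is why the third clustering above lines up with the first despite its swapped labels.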




Updated: 2024-03-27