Sparse FCM-Based Map-Reduce Framework for Distributed Parallel Data Clustering in E-Khool Learning Platform
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems (IF 1.5), Pub Date: 2023-02-27, DOI: 10.1142/s0218488523500034
A. Suki Antely 1 , P. Jegatheeswari 1 , M. Bibin Prasad 1 , V. Vinolin 1 , S. Vinusha 1 , B. R. Rajakumar 1 , D. Binu 1

Parallel clustering serves as a platform for handling big data. The literature reports a number of clustering algorithms built on the Map-Reduce framework, but they do not guarantee effective clusters, which makes knowledge extraction difficult. To provide a more effective clustering method for analyzing big data arriving from distributed systems, this paper proposes the BatDolphin-based Sparse Fuzzy C-Means (BatDol-Sparse FCM) clustering algorithm, which enables optimal selection of the cluster centroids. The distributed big data is managed with a Map-Reduce framework that embeds the BatDol-Sparse FCM algorithm, so that both local and global clustering are executed. The proposed algorithm is implemented using data from the E-Khool Learning Management System (LMS) and three medical datasets from the UCI repository. The analysis uses clustering accuracy (CA) and Dice coefficient (DC) as metrics. The simulation results show that the proposed parallel clustering scheme outperforms the existing algorithms, with values of 0.96 for CA and 0.9667 for DC.
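
The local and global clustering described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the BatDolphin optimizer or the sparsity term, so plain fuzzy c-means with a farthest-first initialization stands in for both. The function names (`fcm`, `map_phase`, `reduce_phase`, `assign`, `dice_coefficient`), the fuzzifier `m = 2`, and the choice to pool local centroids in the reduce step are all assumptions made for illustration. The Dice coefficient is given in its standard set form; how the paper applies it to clusters is not stated in the abstract.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, c):
    """Deterministic initialization (an assumed stand-in for the paper's
    BatDolphin-based centroid selection, which the abstract does not detail)."""
    chosen = [points[0]]
    while len(chosen) < c:
        chosen.append(max(points, key=lambda p: min(sq_dist(p, v) for v in chosen)))
    return chosen


def fcm(points, c, m=2.0, iters=50):
    """Plain fuzzy c-means (no sparsity term); returns c centroids."""
    dim = len(points[0])
    centroids = farthest_first(points, c)
    for _ in range(iters):
        num = [[0.0] * dim for _ in range(c)]
        den = [0.0] * c
        for p in points:
            # Membership u_k is proportional to D_k^(-1/(m-1)), D = squared distance.
            d = [max(sq_dist(p, v), 1e-12) for v in centroids]
            inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
            total = sum(inv)
            for k in range(c):
                w = (inv[k] / total) ** m  # u_k^m weights the centroid update
                den[k] += w
                for j in range(dim):
                    num[k][j] += w * p[j]
        centroids = [[num[k][j] / den[k] for j in range(dim)] for k in range(c)]
    return centroids


def map_phase(partition, c):
    """Local clustering: each mapper clusters only its own data partition."""
    return fcm(partition, c)


def reduce_phase(local_centroid_sets, c):
    """Global clustering: the reducer clusters the pooled local centroids."""
    pooled = [v for centroids in local_centroid_sets for v in centroids]
    return fcm(pooled, c)


def assign(points, centroids):
    """Hard label for each point: index of the nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def dice_coefficient(a, b):
    """Standard Dice coefficient 2|A ∩ B| / (|A| + |B|) on two index sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Two hypothetical partitions, as if held on two nodes of a distributed system.
    partitions = [
        [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]],
        [[1.0, 1.0], [0.0, 0.5], [0.5, 0.0], [9.0, 9.0], [10.0, 9.0], [9.0, 10.0]],
    ]
    local = [map_phase(part, 2) for part in partitions]
    global_centroids = reduce_phase(local, 2)
    print(global_centroids)
    print(assign(partitions[0], global_centroids))
```

In this sketch only the local centroids travel from mappers to the reducer, which keeps communication small; a real Map-Reduce deployment would run `map_phase` on separate workers rather than in a list comprehension.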




Updated: 2023-03-01