T2FM: A novel hashtable based type-2 fuzzy frequent itemsets mining, Journal of Intelligent & Fuzzy Systems



Journal of Intelligent & Fuzzy Systems (IF 2), Pub Date: 2024-02-14, DOI: 10.3233/jifs-232918
M. Jeya Sutha ₁, F. Ramesh Dhanaseelan ₁, M. Felix Nes Mabel ₂, V.T. Vijumon ₃


Association rule mining (ARM) is an important research issue in the field of data mining that aims to find relations among different items in binary databases. Conventional ARM algorithms consider only the frequency of items in binary databases, which is not sufficient for real-time applications. In this paper, a novel hash table based Type-2 fuzzy mining algorithm (T2FM) with an efficient pruning strategy is presented for discovering multiple fuzzy frequent itemsets from quantitative databases. The algorithm employs a hash table based structure for efficient storage and retrieval of items and itemsets, which reduces the search cost to O(1), i.e. constant time. Previously proposed type-2 fuzzy frequent itemset mining approaches based on Apriori and FP-growth required large amounts of computation and generated and processed a larger number of candidates. In contrast, the proposed approach avoids a large amount of computation by finding the common keys before the actual intersection operation takes place. An efficient pruning strategy is proposed to avoid unpromising candidates and so speed up the computation. Several experiments are carried out to verify the efficiency of the approach in terms of runtime and memory for different minimum support thresholds, and the results show that the designed approach provides better performance than the state-of-the-art algorithms.
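The abstract does not give the algorithm itself, so the following is only a minimal Python sketch of the general idea it describes: keep each fuzzy item in a hash table keyed by transaction ID, find the common keys first, and only then combine membership degrees. It is not the T2FM implementation. The names `join`, `support` and `is_promising`, the sample data, the use of a single membership degree per entry (rather than a full type-2 representation), the minimum operator for fuzzy intersection, and the upper-bound pruning rule are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical structure: transaction ID -> membership degree of a fuzzy item.
FuzzyList = Dict[int, float]


def join(a: FuzzyList, b: FuzzyList) -> FuzzyList:
    """Build the fuzzy list of the itemset formed from two smaller itemsets."""
    # Find the common transaction IDs first, scanning the smaller table;
    # each membership test on a dict is O(1) on average.
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    common = [tid for tid in small if tid in large]
    # Assumption: fuzzy intersection is the minimum of the two degrees.
    return {tid: min(a[tid], b[tid]) for tid in common}


def support(fuzzy_list: FuzzyList) -> float:
    """Fuzzy support taken as the sum of membership degrees (an assumption)."""
    return sum(fuzzy_list.values())


def is_promising(a: FuzzyList, b: FuzzyList, min_support: float) -> bool:
    """Illustrative pruning check, not the paper's strategy.

    With the minimum operator, the joined support can never exceed the
    support of either parent, so a pair whose smaller parent support is
    already below the threshold can be skipped without joining.
    """
    return min(support(a), support(b)) >= min_support


# Made-up sample data: two fuzzy items over four transactions.
milk_high: FuzzyList = {1: 0.5, 2: 0.75, 4: 0.25}
bread_low: FuzzyList = {2: 0.5, 3: 1.0, 4: 0.5}

if is_promising(milk_high, bread_low, min_support=0.5):
    joined = join(milk_high, bread_low)
    print(joined, support(joined))
```

Scanning the smaller table keeps the number of lookups proportional to the shorter list, which is one plausible reading of "finding the common keys before the actual intersection operation".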


Updated: 2024-02-16
