Knowledge Distillation via Token-Level Relationship Graph Based on the Big Data Technologies
Big Data Research (IF 3.3) Pub Date: 2024-02-12, DOI: 10.1016/j.bdr.2024.100438
Shuoxi Zhang, Hanpeng Liu, Kun He

In the big data era, characterized by vast volumes of complex data, the efficiency of machine learning models is of utmost importance, particularly in the context of intelligent agriculture. Knowledge distillation (KD), a technique aimed at both model compression and performance enhancement, serves as a pivotal solution by distilling knowledge from an elaborate model (the teacher) to a lightweight, compact counterpart (the student). However, the true potential of KD has not been fully explored. Existing approaches built on big data technologies primarily focus on transferring instance-level information, overlooking the valuable information embedded in token-level relationships, which may be particularly affected by long-tail effects. To address these limitations, we propose Knowledge Distillation with a Token-level Relationship Graph (TRG), a novel method that leverages token-wise relationships to enhance the performance of knowledge distillation. By employing TRG, the student model can effectively emulate higher-level semantic information from the teacher model, resulting in improved performance and mobile-friendly efficiency. To further enhance the learning process, we introduce a dynamic temperature adjustment strategy that encourages the student model to capture the topological structure of the teacher model more effectively. We conduct experiments comparing the proposed method against several state-of-the-art approaches. Empirical results demonstrate the superiority of TRG across various visual tasks, including those involving imbalanced data. Our method consistently outperforms the existing baselines, establishing a new state of the art in KD based on big data technologies.
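The abstract does not spell out the loss formulation, but the general idea of graph-based token-relation distillation can be sketched. Below is a minimal illustration, assuming the relationship graph is a pairwise cosine-similarity matrix over token embeddings (e.g., vision-transformer patch features) and the dynamic temperature is a simple linear anneal; the function names (`trg_loss`, `dynamic_temperature`) and all hyperparameters are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only -- NOT the authors' published implementation.
import torch
import torch.nn.functional as F

def trg_loss(student_tokens, teacher_tokens, temperature=4.0):
    """Match pairwise token-similarity graphs between student and teacher.

    student_tokens, teacher_tokens: (batch, num_tokens, dim) token features.
    """
    s = F.normalize(student_tokens, dim=-1)
    t = F.normalize(teacher_tokens, dim=-1)
    # Pairwise cosine-similarity graphs over tokens: (batch, N, N).
    g_s = torch.bmm(s, s.transpose(1, 2)) / temperature
    g_t = torch.bmm(t, t.transpose(1, 2)) / temperature
    # KL divergence between row-wise relation distributions, so the
    # student mimics how each teacher token relates to all others.
    return F.kl_div(
        F.log_softmax(g_s, dim=-1),
        F.softmax(g_t, dim=-1),
        reduction="batchmean",
    )

def dynamic_temperature(epoch, total_epochs, t_max=8.0, t_min=1.0):
    # A simple linear anneal: the student first sees a softened teacher
    # graph, then progressively sharper relational structure.
    return t_max - (t_max - t_min) * epoch / max(total_epochs - 1, 1)
```

Under this reading, the student is trained to match the teacher's relation distributions over tokens rather than individual token features, which is what would allow it to inherit the teacher's token-level topology.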
