TIE: Fast Experiment-Driven ML-Based Configuration Tuning for In-Memory Data Analytics
IEEE Transactions on Computers (IF 3.7), Pub Date: 2024-02-14, DOI: 10.1109/tc.2024.3365937
Chao Chen, Jinhan Xin, Zhibin Yu

Recently, experiment-driven machine-learning (ML) based configuration tuning for in-memory data analytics frameworks such as Apache Spark has become popular because it can achieve high speedups. However, experiment-driven ML-based approaches inherently require a large number of iterations, and each iteration generates a configuration with a probabilistic strategy and executes the program on a real cluster with that configuration. Optimizing the performance of an in-memory data analytics program therefore takes a long time, which hinders these approaches from being widely used in practice. To address this issue, we propose a novel yet simple approach dubbed Terminating-It-Early (TIE) that reduces the time needed for experiment executions while achieving speedups similar to those obtained by experiment-driven ML-based approaches. The key idea is that, during the search for the optimal configuration that yields the shortest execution time for a program, we terminate an experiment execution with a trial configuration as soon as we find that its execution time exceeds a predefined threshold (e.g., the shortest execution time observed so far). In contrast, traditional experiment-driven ML-based approaches always run all experiment executions to completion. We employ 19 Apache Spark programs running on a physical cluster as well as a virtual cluster to evaluate TIE. We compare the tuning time used to find the optimal configuration of a program and the optimized execution time of a program obtained by TIE against those obtained by CherryPick and a reinforcement learning (RL) based approach. The experimental results show that on physical machines, TIE reduces the tuning time used by CherryPick and the RL-based approach by factors of $2.39\times$ and $1.68\times$ on average, respectively. On virtual machines, the corresponding factors are $2.79\times$ and $1.71\times$. Moreover, the average optimized execution time of the 19 programs tuned by TIE is slightly shorter than that of the programs tuned by CherryPick and the RL-based approach.
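The early-termination idea can be illustrated with a short sketch. The following Python code is a minimal illustration under assumptions, not the authors' implementation: run_experiment and propose_configuration are hypothetical helpers, app.jar is a placeholder application, and the probabilistic search strategy (e.g., Bayesian optimization as in CherryPick, or an RL policy) is left abstract. Each trial is capped at the shortest execution time observed so far; a trial that hits the cap is stopped early because it can no longer be the optimum.

```python
import math
import subprocess
import time


def run_experiment(config: dict, timeout: float | None) -> float:
    """Launch the Spark program with `config` and return its wall-clock time.

    `timeout` models TIE's threshold: subprocess.run raises
    subprocess.TimeoutExpired when the run exceeds it.
    """
    conf_args = []
    for key, value in config.items():
        conf_args += ["--conf", f"{key}={value}"]
    cmd = ["spark-submit", *conf_args, "app.jar"]  # "app.jar" is a placeholder
    start = time.time()
    subprocess.run(cmd, check=True, timeout=timeout)
    return time.time() - start


def tune(propose_configuration, n_iterations: int = 50):
    """Search for the configuration with the shortest execution time."""
    best_config, best_time = None, math.inf
    for _ in range(n_iterations):
        # Probabilistic search step (BO, RL, ...) -- abstracted away here.
        config = propose_configuration(best_config, best_time)
        try:
            # Key idea of TIE: cap each experiment at the best time so far
            # instead of always running it to completion.
            cap = None if math.isinf(best_time) else best_time
            elapsed = run_experiment(config, timeout=cap)
        except subprocess.TimeoutExpired:
            # The trial already took longer than the current best, so its
            # exact duration is irrelevant to finding the optimum.
            continue
        if elapsed < best_time:
            best_config, best_time = config, elapsed
    return best_config, best_time
```

In this sketch a terminated trial is simply skipped; in practice an ML-based tuner would still feed the censored observation back into its model, but the time saved per terminated experiment is the same.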

Updated: 2024-02-14