Faster MIL-based Subgoal Identification for Reinforcement Learning by Tuning Fewer Hyperparameters
ACM Transactions on Autonomous and Adaptive Systems (IF 2.7), Pub Date: 2024-04-20, DOI: 10.1145/3643852
Saim Sunel 1 , Erkin Çilden 2 , Faruk Polat 1

Various methods have been proposed in the literature for identifying subgoals in discrete reinforcement learning (RL) tasks. Once subgoals are discovered, task decomposition methods can be employed to improve the learning performance of agents. In this study, we classify the prominent subgoal identification methods for discrete RL tasks into three categories: graph-based, statistics-based, and multi-instance learning (MIL)-based. Our contributions are threefold. First, we introduce a new MIL-based subgoal identification algorithm called EMDD-RL and experimentally compare it with a previous MIL-based method. The previous approach adapts MIL's Diverse Density (DD) algorithm, whereas our method builds on Expectation-Maximization Diverse Density (EMDD). The advantage of EMDD over DD is that, thanks to the expectation-maximization algorithm, it can yield more accurate results with less computational demand. EMDD-RL modifies some of the algorithmic steps of EMDD to identify subgoals in discrete RL problems. Second, we evaluate the methods on several RL tasks with respect to the hyperparameter tuning overhead they incur. Third, we propose a new RL problem called key-room and compare the subgoal identification performance of the methods on this new task. Experimental results show that, in practice, MIL-based subgoal identification methods may be preferable to the algorithms of the other two categories.
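
To make the Diverse Density idea mentioned in the abstract more concrete, the following is a minimal sketch of the standard noisy-or DD score from the MIL literature, applied to toy grid trajectories. It is not EMDD-RL and not the authors' implementation. The bag construction (successful trajectories as positive bags, failed ones as negative bags), the grid coordinates, the `scale` parameter, and all function names are assumptions made for illustration.

```python
import math


def instance_prob(instance, target, scale=1.0):
    """Similarity of one visited state to a candidate target, in (0, 1].

    Uses exp(-scale * squared Euclidean distance); `scale` is a
    hypothetical tuning knob, not a value taken from the paper.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(instance, target))
    return math.exp(-scale * d2)


def diverse_density(target, positive_bags, negative_bags, scale=1.0):
    """Noisy-or Diverse Density of `target`.

    High when every positive bag has an instance near `target`
    and no negative bag instance is near it.
    """
    dd = 1.0
    for bag in positive_bags:
        # Probability that no instance in this positive bag matches the target.
        miss = 1.0
        for inst in bag:
            miss *= 1.0 - instance_prob(inst, target, scale)
        dd *= 1.0 - miss
    for bag in negative_bags:
        # Every instance in a negative bag should be far from the target.
        for inst in bag:
            dd *= 1.0 - instance_prob(inst, target, scale)
    return dd


def best_candidate(positive_bags, negative_bags, scale=1.0):
    """Score every state seen in a positive bag and return the best one."""
    candidates = sorted({inst for bag in positive_bags for inst in bag})
    return max(
        candidates,
        key=lambda t: diverse_density(t, positive_bags, negative_bags, scale),
    )


# Toy data (assumed): all successful trajectories pass through cell (2, 2),
# which plays the role of a doorway; failed trajectories never reach it.
positive_bags = [
    [(0, 0), (1, 1), (2, 2), (3, 3)],
    [(0, 2), (1, 2), (2, 2), (3, 2)],
    [(2, 0), (2, 1), (2, 2), (2, 3)],
]
negative_bags = [
    [(0, 0), (0, 1), (0, 2)],
    [(3, 3), (3, 2)],
]

subgoal = best_candidate(positive_bags, negative_bags)
```

In this sketch the exhaustive scan over candidate states stands in for the optimization step of DD. EMDD, as described in the abstract, replaces that step with an expectation-maximization procedure, which is not shown here.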




Updated: 2024-04-20