Transferable dynamics models for efficient object-oriented reinforcement learning
Artificial Intelligence (IF 14.4) Pub Date: 2024-01-26, DOI: 10.1016/j.artint.2024.104079
Ofir Marom , Benjamin Rosman

The Reinforcement Learning (RL) framework offers a general paradigm for constructing autonomous agents that can make effective decisions when solving tasks. An important area of study within the field of RL is transfer learning, where an agent utilizes knowledge gained from solving previous tasks to solve a new task more efficiently. While transfer learning is conceptually appealing, in practice not all RL representations are amenable to it. Moreover, much of the research on transfer learning in RL is purely empirical. Previous research has shown that object-oriented representations are suitable for the purposes of transfer learning with theoretical efficiency guarantees. Such representations leverage the notion of object classes to learn lifted rules that apply to grounded object instantiations. In this paper, we extend previous research on object-oriented representations and introduce two formalisms: the first is based on deictic predicates and is used to learn a transferable transition dynamics model; the second is based on propositions and is used to learn a transferable reward dynamics model. In addition, we extend previously introduced efficient learning algorithms for object-oriented representations to our proposed formalisms. These formalisms are then combined into a single efficient algorithm that learns transferable transition and reward dynamics models across a domain of related tasks. We illustrate our proposed algorithm empirically on an extended version of the Taxi domain, as well as the more difficult Sokoban domain, showing the benefits of our approach with regard to efficient learning and transfer.
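
To make the object-oriented idea concrete, the following is a minimal, hypothetical sketch of a lifted transition rule conditioned on a deictic predicate in a Taxi-like grid world. All names here (Obj, touch_north, LiftedRule, apply_rule) and the rule structure are illustrative assumptions for exposition, not the paper's actual formalism.

# Hypothetical sketch: a lifted rule over object classes, gated by a
# deictic predicate, in the spirit of object-oriented RL on a Taxi-like grid.
from dataclasses import dataclass, field


@dataclass
class Obj:
    cls: str                                    # object class, e.g. "taxi" or "wall"
    attrs: dict = field(default_factory=dict)   # grounded attributes, e.g. x, y


def touch_north(a: Obj, b: Obj) -> bool:
    """Deictic predicate: is object b directly north of object a?"""
    return a.attrs["x"] == b.attrs["x"] and a.attrs["y"] + 1 == b.attrs["y"]


@dataclass
class LiftedRule:
    """A rule stated over object classes, not instances: if the predicate
    holds for some grounding, the action's effect on the actor is blocked."""
    action: str
    actor_cls: str
    blocker_cls: str
    predicate: callable
    effect: dict                                # attribute deltas when unblocked


def apply_rule(rule: LiftedRule, action: str, objects: list) -> None:
    if action != rule.action:
        return
    for actor in (o for o in objects if o.cls == rule.actor_cls):
        blocked = any(
            rule.predicate(actor, b)
            for b in objects if b.cls == rule.blocker_cls
        )
        if not blocked:
            for attr, delta in rule.effect.items():
                actor.attrs[attr] += delta


if __name__ == "__main__":
    taxi = Obj("taxi", {"x": 0, "y": 0})
    wall = Obj("wall", {"x": 0, "y": 1})
    move_north = LiftedRule("north", "taxi", "wall", touch_north, {"y": +1})

    apply_rule(move_north, "north", [taxi, wall])   # blocked by the wall
    print(taxi.attrs)                               # {'x': 0, 'y': 0}

    wall.attrs["x"] = 5                             # move the wall away
    apply_rule(move_north, "north", [taxi, wall])   # now the taxi moves
    print(taxi.attrs)                               # {'x': 0, 'y': 1}

Because the rule refers to object classes rather than specific instances, the same learned rule applies under any grounding of those classes, which is what makes such representations transferable across tasks with different layouts.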


