Investigating the properties of neural network representations in reinforcement learning
Artificial Intelligence (IF 14.4). Pub Date: 2024-03-01. DOI: 10.1016/j.artint.2024.104100
Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White

In this paper we investigate the properties of representations learned by deep reinforcement learning systems. Much of the early work on representations for reinforcement learning focused on designing fixed-basis architectures to achieve properties thought to be desirable, such as orthogonality and sparsity. In contrast, the idea behind deep reinforcement learning methods is that the agent designer should not encode representational properties; rather, the data stream should determine the properties of the representation—good representations emerge under appropriate training schemes. In this paper we bring these two perspectives together, empirically investigating the properties of representations that support transfer in reinforcement learning. We introduce and measure six representational properties over more than 25,000 agent-task settings. We consider Deep Q-learning agents with different auxiliary losses in a pixel-based navigation environment, with source and transfer tasks corresponding to different goal locations. We develop a method to better understand why some representations work better for transfer, through a systematic approach that varies task similarity and measures and correlates representational properties with transfer performance. We demonstrate the generality of the methodology by investigating representations learned by a Rainbow agent that successfully transfers across Atari 2600 game modes.
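As a rough illustration of what such measurements can look like, the sketch below computes two of the properties the abstract names, sparsity and orthogonality, over a feature matrix whose rows are a network's last-layer activations for a batch of observations. These are common generic formulations, not necessarily the paper's exact definitions, and the `phi` matrix is a hypothetical toy example.

```python
import numpy as np

# Illustrative sketch (not the paper's exact definitions): two commonly
# measured representation properties over a feature matrix phi, where
# each row phi[i] is the network's last-layer activation for one
# observation.

def sparsity(phi: np.ndarray, eps: float = 1e-8) -> float:
    """Fraction of (near-)zero activations, averaged over all entries."""
    return float(np.mean(np.abs(phi) < eps))

def orthogonality(phi: np.ndarray, eps: float = 1e-8) -> float:
    """One minus the mean absolute cosine similarity between distinct
    feature vectors; values near 1 mean near-orthogonal features."""
    norms = np.linalg.norm(phi, axis=1, keepdims=True)
    unit = phi / np.maximum(norms, eps)      # row-normalize
    cos = unit @ unit.T                      # pairwise cosine similarities
    n = phi.shape[0]
    off_diag = cos[~np.eye(n, dtype=bool)]   # drop self-similarities
    return float(1.0 - np.mean(np.abs(off_diag)))

# Toy ReLU-like activations: 4 observations, 6 features, no overlap in
# active features, so the rows are mutually orthogonal.
phi = np.array([
    [0.0, 1.0, 0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 4.0, 0.0, 0.0],
])
print(f"sparsity:      {sparsity(phi):.3f}")       # 18 of 24 entries are zero
print(f"orthogonality: {orthogonality(phi):.3f}")  # no active features shared
```

In a study like the one described, scores such as these would be computed per agent-task setting and then correlated with transfer performance across settings.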

Updated: 2024-03-01