Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities, ACM Computing Surveys


Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities
ACM Computing Surveys (IF 16.6), Pub Date: 2024-04-24, DOI: 10.1145/3648472
Jasmina Gajcin, Ivana Dusparic

While AI algorithms have shown remarkable success in various fields, their lack of transparency hinders their application to real-life tasks. Although explanations targeted at non-experts are necessary for user trust and human-AI collaboration, the majority of explanation methods for AI are aimed at developers and expert users. Counterfactual explanations are local explanations that advise users on what can be changed in the input to change the output of the black-box model. Counterfactuals are user-friendly and provide actionable advice for achieving the desired output from the AI system. While counterfactuals have been extensively researched in supervised learning, few methods apply them to reinforcement learning (RL). In this work, we explore the reasons for the underrepresentation of this powerful explanation method in RL. We start by reviewing the current work on counterfactual explanations in supervised learning. We then explore the differences between counterfactual explanations in supervised learning and RL and identify the main challenges that prevent the adoption of methods from supervised learning in RL. Finally, we redefine counterfactuals for RL and propose research directions for implementing counterfactuals in RL.
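
To make the supervised-learning notion of a counterfactual concrete, the following is a minimal sketch and not the authors' method. It assumes a hypothetical two-feature loan model (`black_box`) and a brute-force search (`counterfactual`) over small integer changes, picking the closest input, by L1 distance, that the model maps to the desired output. All names, features, and thresholds are illustrative assumptions.

```python
from itertools import product


def black_box(features):
    """Hypothetical stand-in for an opaque model: 1 = loan approved, 0 = rejected."""
    income, debt = features
    return 1 if income - 2 * debt >= 50 else 0


def counterfactual(model, x, target, steps=range(-20, 21)):
    """Return the input closest to x (L1 distance) that the model maps to target.

    Exhaustive search over integer offsets per feature; this only scales
    to tiny inputs and is meant purely as an illustration.
    """
    best, best_cost = None, None
    for delta in product(steps, repeat=len(x)):
        candidate = tuple(xi + di for xi, di in zip(x, delta))
        if model(candidate) != target:
            continue  # this change does not produce the desired output
        cost = sum(abs(di) for di in delta)
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


x = (60, 10)  # income 60, debt 10: score 60 - 2 * 10 = 40, so rejected
cf, cost = counterfactual(black_box, x, target=1)
print(f"original {x} -> {black_box(x)}, counterfactual {cf} -> {black_box(cf)}")
```

Under these assumptions the search suggests lowering debt from 10 to 5, which reads as actionable advice: "had your debt been 5 lower, the loan would have been approved." The sketch covers only the single-input, single-output supervised setting; it does not attempt the RL redefinition the abstract refers to.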

Updated: 2024-04-24
