Safety-Aware Causal Representation for Trustworthy Offline Reinforcement Learning in Autonomous Driving
IEEE Robotics and Automation Letters (IF 5.2), Pub Date: 2024-03-20, DOI: 10.1109/lra.2024.3379805
Haohong Lin, Wenhao Ding, Zuxin Liu, Yaru Niu, Jiacheng Zhu, Yuming Niu, Ding Zhao
In the domain of autonomous driving, offline Reinforcement Learning (RL) approaches exhibit notable efficacy in addressing sequential decision-making problems from offline datasets. However, maintaining safety across diverse safety-critical scenarios remains a significant challenge because long-tailed and unforeseen scenarios are absent from offline datasets. In this letter, we introduce the saFety-aware strUctured Scenario representatION (FUSION), a pioneering representation learning method in offline RL that facilitates the learning of a generalizable end-to-end driving policy by leveraging structured scenario information. FUSION capitalizes on the causal relationships among the decomposed reward, cost, state, and action spaces, constructing a framework for structured sequential reasoning in dynamic traffic environments. We conduct extensive evaluations in two typical real-world settings of distribution shift in autonomous vehicles, demonstrating a favorable balance between safety cost and utility reward compared with current state-of-the-art safe RL and imitation learning (IL) baselines. Empirical evidence across various driving scenarios attests that FUSION significantly enhances the safety and generalizability of autonomous driving agents, even in challenging and unseen environments. Furthermore, our ablation studies reveal noticeable improvements from integrating the causal representation into the offline safe RL algorithm. Our code implementation is available on the project website.
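To make the "causal relationships among decomposed reward, cost, state, and action spaces" concrete, the following minimal sketch (not the paper's implementation; the data, thresholds, and helper `fit_and_mask` are hypothetical) shows one simple way to recover a binary causal mask over state factors for separate reward and cost heads from an offline dataset:

```python
import numpy as np

# Hypothetical illustration of a decomposed causal structure: in this toy
# offline dataset, the reward depends only on state factors 0 and 1, and the
# safety cost only on factor 3. We fit each head and threshold the weights
# into a binary causal mask over state factors.
rng = np.random.default_rng(0)

S = rng.normal(size=(500, 4))                                   # state factors
reward = 2.0 * S[:, 0] - 1.0 * S[:, 1] + 0.01 * rng.normal(size=500)
cost = 3.0 * S[:, 3] + 0.01 * rng.normal(size=500)

def fit_and_mask(X, y, thresh=0.1):
    """Least-squares fit, then threshold weights into a binary causal mask."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return (np.abs(w) > thresh).astype(int)

reward_mask = fit_and_mask(S, reward)   # which factors influence reward
cost_mask = fit_and_mask(S, cost)       # which factors influence cost
print(reward_mask, cost_mask)           # → [1 1 0 0] [0 0 0 1]
```

Masking each head to its causal parents is one way such structure can improve generalization under distribution shift: changes in factors irrelevant to the cost head cannot perturb its predictions.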
