A Safe Preference Learning Approach for Personalization With Applications to Autonomous Vehicles, IEEE Robotics and Automation Letters


IEEE Robotics and Automation Letters (IF 5.2), Pub Date: 2024-03-11, DOI: 10.1109/lra.2024.3375626
Ruya Karagulle ₁, Nikos Aréchiga ₂, Andrew Best ₂, Jonathan DeCastro ₂, Necmiye Ozay ₁

This letter introduces a preference learning method that ensures adherence to given specifications, with an application to autonomous vehicles. Our approach incorporates the priority ordering of Signal Temporal Logic (STL) formulas describing traffic rules into a learning framework. By leveraging Parametric Weighted Signal Temporal Logic (PWSTL), we formulate the problem of safety-guaranteed preference learning based on pairwise comparisons and propose an approach to solve this learning problem. Our approach finds a feasible valuation for the weights of the given PWSTL formula such that, with these weights, preferred signals have weighted quantitative satisfaction measures greater than their non-preferred counterparts. The feasible valuation of weights given by our approach leads to a weighted STL formula that can be used in correct-and-custom-by-construction controller synthesis. We demonstrate the performance of our method with a pilot human subject study in two different simulated driving scenarios involving a stop sign and a pedestrian crossing. Our approach yields competitive results compared to existing preference learning methods in terms of capturing preferences and notably outperforms them when safety is considered.
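The feasibility idea in the abstract (find weights under which every preferred signal scores higher than its non-preferred counterpart) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the letter: each signal is reduced to a tuple of per-rule satisfaction margins, the weighted score of a conjunction is taken to be the minimum of the weighted margins, and the search is a plain grid enumeration. The names `weighted_score` and `find_feasible_weights` and all numbers are hypothetical; this is not the PWSTL semantics or the authors' solution method.

```python
from itertools import product


def weighted_score(margins, weights):
    """Toy weighted satisfaction measure for a conjunction of rules.

    Assumption: the score is the minimum of the weighted per-rule margins.
    With strictly positive weights the sign of the score matches the sign
    of the unweighted minimum, so weights cannot turn a rule violation
    into a satisfaction.
    """
    return min(w * m for w, m in zip(weights, margins))


def find_feasible_weights(pairs, grid):
    """Return the first weight tuple on the grid for which every preferred
    signal scores strictly higher than its non-preferred counterpart,
    or None if no grid point is feasible."""
    n_rules = len(pairs[0][0])
    for weights in product(grid, repeat=n_rules):
        if all(weighted_score(pref, weights) > weighted_score(other, weights)
               for pref, other in pairs):
            return weights
    return None


# Hypothetical data: each signal is summarized by two rule margins
# (for example a stopping-distance margin and a speed margin).
# All margins are positive, so every signal satisfies both rules.
pairs = [
    ((2.0, 0.5), (0.6, 3.0)),  # (preferred, non-preferred)
    ((1.5, 1.0), (1.0, 1.2)),
]
grid = [0.25, 0.5, 1.0, 2.0, 4.0]  # strictly positive candidate weights

weights = find_feasible_weights(pairs, grid)
print(weights)
```

With uniform weights the first pair is ranked the wrong way round in this toy data, which is why a search over weights is needed at all. Because all candidate weights are positive, the reweighting changes only the ranking among signals, not whether a signal satisfies the rules.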


Updated: 2024-03-11
