iCORPP: Interleaved commonsense reasoning and probabilistic planning on robots
Robotics and Autonomous Systems (IF 4.3), Pub Date: 2024-01-10, DOI: 10.1016/j.robot.2023.104613
Shiqi Zhang, Piyush Khandelwal, Peter Stone

Sequential decision-making in the real world is challenging for robots because they must reason about the current world state and dynamics while planning actions to accomplish complex tasks. On the one hand, declarative languages and reasoning algorithms support representing and reasoning with commonsense knowledge, but they are not good at planning actions toward maximizing cumulative reward over a long, unspecified horizon. On the other hand, probabilistic planning frameworks such as Markov decision processes (MDPs) and partially observable MDPs (POMDPs) support planning to achieve long-term goals under uncertainty, but they are ill-equipped to represent or reason about knowledge that is not directly related to actions. In this article, we present an algorithm, called iCORPP, that simultaneously estimates the current world state, reasons about world dynamics, and constructs task-oriented controllers. In this process, robot decision-making problems are decomposed into two smaller, interdependent subproblems that focus on reasoning to “understand the world” and planning to “achieve the goal”, respectively. The algorithm has been implemented and evaluated both in simulation and on real robots using everyday service tasks such as indoor navigation and dialog management. Results show significant improvements in scalability, efficiency, and adaptiveness compared to competitive baselines, including handcrafted action policies.
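The decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the reasoning language or the planner, so the example below substitutes a hand-written Python rule function for the reasoning step and plain value iteration on a tiny fully observable MDP for the planning step. All names (`reason_about_world`, `plan`, the `area_crowded` and `floor_wet` facts) and all numeric values are hypothetical.

```python
# Toy sketch only: hypothetical names and numbers, not taken from the paper.

def reason_about_world(facts):
    """Reasoning step ("understand the world").

    Maps commonsense facts to one parameter of the world dynamics:
    the probability that a 'forward' action moves the robot one cell.
    """
    p_move = 0.9  # assumed default success probability
    if facts.get("area_crowded", False):
        p_move -= 0.3  # assumed penalty: crowds block the corridor
    if facts.get("floor_wet", False):
        p_move -= 0.2  # assumed penalty: wet floor causes slipping
    return p_move


def q_values(s, values, p_move, goal, gamma, step_cost, goal_reward):
    """Action values for a non-goal cell s in a 1-D corridor."""
    arrive = goal_reward if s + 1 == goal else 0.0
    stay = step_cost + gamma * values[s]
    forward = (p_move * (step_cost + arrive + gamma * values[s + 1])
               + (1.0 - p_move) * stay)
    return {"forward": forward, "wait": stay}


def plan(p_move, n_cells=4, gamma=0.95, step_cost=-1.0,
         goal_reward=10.0, tol=1e-9):
    """Planning step ("achieve the goal"): value iteration on the corridor.

    The last cell is an absorbing goal with value 0.
    Returns (values, policy), where policy covers the non-goal cells.
    """
    goal = n_cells - 1
    values = [0.0] * n_cells
    while True:
        delta = 0.0
        for s in range(goal):
            q = q_values(s, values, p_move, goal, gamma,
                         step_cost, goal_reward)
            best = max(q.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = []
    for s in range(goal):
        q = q_values(s, values, p_move, goal, gamma, step_cost, goal_reward)
        policy.append(max(q, key=q.get))
    return values, policy


if __name__ == "__main__":
    for facts in ({}, {"area_crowded": True, "floor_wet": True}):
        p = reason_about_world(facts)
        values, policy = plan(p)
        print(facts, round(p, 2), [round(v, 2) for v in values], policy)
```

The point of the split is that the planner only sees the dynamics parameter produced by the reasoner, so facts that are not directly tied to actions can still change the resulting controller without enlarging the planning model.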




Updated: 2024-01-10