OVAL-Prompt: Open-Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance-Grounding
arXiv - CS - Robotics, Pub Date: 2024-04-17, DOI: arxiv-2404.11000
Edmond Tong, Anthony Opipari, Stanley Lewis, Zhen Zeng, Odest Chadwicke Jenkins

In order for robots to interact with objects effectively, they must understand the form and function of each object they encounter. Essentially, robots need to understand which actions each object affords, and where those affordances can be acted on. Robots are ultimately expected to operate in unstructured human environments, where the set of objects and affordances is not known to the robot before deployment (i.e. the open-vocabulary setting). In this work, we introduce OVAL-Prompt, a prompt-based approach for open-vocabulary affordance localization in RGB-D images. By leveraging a Vision Language Model (VLM) for open-vocabulary object part segmentation and a Large Language Model (LLM) to ground each part-segment-affordance, OVAL-Prompt demonstrates generalizability to novel object instances, categories, and affordances without domain-specific finetuning. Quantitative experiments demonstrate that without any finetuning, OVAL-Prompt achieves localization accuracy that is competitive with supervised baseline models. Moreover, qualitative experiments show that OVAL-Prompt enables affordance-based robot manipulation of open-vocabulary object instances and categories.
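The two-stage idea in the abstract (a VLM proposes part segments, then an LLM decides which part carries the requested affordance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `build_prompt` and `localize_affordance`, the prompt wording, and the two callables `segment_parts` and `ask_llm` are all hypothetical stand-ins, and the sketch assumes part names are lowercase strings mapped to binary masks.

```python
from typing import Any, Callable, Dict, List, Optional

Mask = List[List[bool]]  # H x W binary mask, True where the part is visible


def build_prompt(object_name: str, affordance: str, parts: List[str]) -> str:
    """Compose a text query for the LLM (hypothetical wording, not from the paper)."""
    return (
        f"Object: {object_name}. Parts: {', '.join(parts)}. "
        f"Which single part should a robot use to '{affordance}'? "
        "Answer with one part name, or 'none'."
    )


def localize_affordance(
    image: Any,
    object_name: str,
    affordance: str,
    segment_parts: Callable[[Any, str], Dict[str, Mask]],  # stand-in for the VLM
    ask_llm: Callable[[str], str],                          # stand-in for the LLM
) -> Optional[Mask]:
    """Return the mask of the part the LLM selects, or None if nothing matches."""
    # Stage 1: open-vocabulary part segmentation (assumed to return name -> mask).
    part_masks = segment_parts(image, object_name)
    if not part_masks:
        return None

    # Stage 2: ask the language model which part grounds the affordance.
    parts = sorted(part_masks)
    answer = ask_llm(build_prompt(object_name, affordance, parts)).strip().lower()

    # An answer of 'none' or an unrecognized part name yields None.
    return part_masks.get(answer)
```

With stub callables in place of real models, a call such as `localize_affordance(img, "knife", "grasp", seg, llm)` would return the mask stored under whichever part name the stub LLM answers with. Passing the models in as callables keeps the sketch independent of any particular VLM or LLM API.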

Updated: 2024-04-18