Continuous Object State Recognition for Cooking Robots Using Pre-Trained Vision-Language Models and Black-Box Optimization
IEEE Robotics and Automation Letters (IF 5.2), Pub Date: 2024-03-11, DOI: 10.1109/lra.2024.3375257
Kento Kawaharazuka, Naoaki Kanazawa, Yoshiki Obinata, Kei Okada, Masayuki Inaba

Robots generally recognize the state of the environment and of objects by treating the judgement of the current state as a classification problem. In cooking, however, the state of food changes continuously, so it needs to be captured not just at a single point in time but continuously over time. These state changes are also complex and cannot easily be described by manual programming. We therefore propose a method that lets cooking robots recognize the continuous state changes of food through spoken language, using pre-trained large-scale vision-language models. Because these models can compute the similarity between images and texts continuously over time, we can capture the state changes of food during cooking. We also show that adjusting the weighting of each text prompt, by fitting the similarity changes to a sigmoid function and then performing black-box optimization, yields more accurate and robust continuous state recognition. We demonstrate the effectiveness and limitations of this method on the recognition of water boiling, butter melting, egg cooking, and onion stir-frying.
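
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of weighting text prompts by how sigmoid-like the combined similarity curve is. Everything in it is an assumption made for illustration: the similarity series are synthetic rather than computed by a vision-language model, the objective (mean squared error to the best sigmoid on a coarse grid) is hypothetical, and plain random search stands in for whichever black-box optimizer the authors actually use. All function names are invented here.

```python
import math
import random


def sigmoid(t, center, slope):
    """Logistic curve rising from 0 to 1 around `center` (input clamped for safety)."""
    z = max(-60.0, min(60.0, slope * (t - center)))
    return 1.0 / (1.0 + math.exp(-z))


def normalize(series):
    """Rescale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [(v - lo) / (hi - lo) for v in series]


def combine(weights, per_prompt):
    """Weighted sum of the per-prompt similarity series, rescaled to [0, 1]."""
    n = len(per_prompt[0])
    mixed = [sum(w * s[k] for w, s in zip(weights, per_prompt)) for k in range(n)]
    return normalize(mixed)


def sigmoid_fit_error(series, times):
    """Smallest mean squared error to a sigmoid over a coarse (center, slope) grid."""
    best = float("inf")
    for center in times:
        for slope in (0.1, 0.2, 0.5, 1.0, 2.0):
            err = sum((sigmoid(t, center, slope) - v) ** 2
                      for t, v in zip(times, series)) / len(times)
            best = min(best, err)
    return best


def random_search(per_prompt, times, iters=100, seed=0):
    """Black-box search over prompt weights (assumed stand-in optimizer)."""
    rng = random.Random(seed)
    n = len(per_prompt)
    best_w = [1.0 / n] * n  # start from equal weights
    best_err = sigmoid_fit_error(combine(best_w, per_prompt), times)
    for _ in range(iters):
        w = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        err = sigmoid_fit_error(combine(w, per_prompt), times)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Synthetic stand-ins for image-text similarity over time (not real model output):
# one prompt tracks a state change around t = 15, the other is pure noise.
times = list(range(30))
data_rng = random.Random(1)
informative = [sigmoid(t, 15, 0.5) + data_rng.gauss(0, 0.05) for t in times]
distractor = [data_rng.gauss(0, 0.3) for _ in times]
per_prompt = [informative, distractor]

baseline = sigmoid_fit_error(combine([0.5, 0.5], per_prompt), times)
weights, err = random_search(per_prompt, times)
print("equal-weight error:", baseline)
print("searched weights:", weights, "error:", err)
```

Because the search starts from equal weights and only accepts improvements, the returned error can never exceed the equal-weight baseline. In a real system each series would come from comparing camera frames against a text prompt, and the fitted sigmoid's center would give an estimate of when the state change occurs.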

Updated: 2024-03-11