Cross-domain fusion and embedded refinement-based 6D object pose tracking on textureless objects
Journal of Intelligent Manufacturing (IF 8.3) Pub Date: 2024-02-08, DOI: 10.1007/s10845-023-02316-9
Jichun Wang, Guifang Duan, Yang Wang, Guodong Yi, Liangyu Dong, Zili Wang, Xuewei Zhang, Shuyou Zhang

In industrial production, the ability to accurately perceive the position and orientation of target objects enables certain production processes to generalize to unstructured scenarios, thereby facilitating intelligent manufacturing. 6D object pose tracking aims to achieve real-time, accurate, and long-term pose estimation over a video sequence. In this paper, we introduce a novel RGB-based 6D object pose tracking method that leverages temporal information. Our approach centers on a network that predicts the pose residual between two consecutive image frames. For industrial objects with weak textures and complex shapes, we incorporate a cross-domain attention fusion module during the feature fusion phase, enabling the capture of pixel-level correspondences between different feature representations. This module enhances robustness to illumination variations and occlusion. Additionally, we propose a simple yet effective pose regression module, referred to as the embedded refinement module, which takes the deviation of previous pose estimations into account and thereby mitigates, to some extent, the cumulative pose estimation error caused by large movements. We conduct comparative experiments on the YCB dataset, the Fast-YCB dataset, and a customized dataset designed for the manipulation of industrial parts by a robotic arm. The results demonstrate that our proposed method surpasses state-of-the-art techniques, achieving robust and long-term tracking.
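The abstract does not specify the network architecture, so the following is a minimal, hypothetical PyTorch sketch of the two ideas it names: a cross-attention fusion step that captures pixel-level correspondences between feature maps from two consecutive frames, and a head that regresses the inter-frame pose residual. All class names (`CrossDomainAttentionFusion`, `ResidualPoseNet`), layer choices, and the 6-DoF (translation + axis-angle) output parameterization are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class CrossDomainAttentionFusion(nn.Module):
    """Fuses two feature maps via pixel-level cross-attention (assumed design)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # feat_a, feat_b: (B, C, H, W) -> flatten spatial dims into token sequences
        b, c, h, w = feat_a.shape
        qa = feat_a.flatten(2).transpose(1, 2)   # (B, H*W, C) queries
        kb = feat_b.flatten(2).transpose(1, 2)   # (B, H*W, C) keys / values
        fused, _ = self.attn(qa, kb, kb)         # pixel-level correspondences
        fused = self.norm(fused + qa)            # residual connection + norm
        return fused.transpose(1, 2).reshape(b, c, h, w)


class ResidualPoseNet(nn.Module):
    """Predicts the pose residual between two consecutive RGB frames."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(           # toy shared CNN encoder
            nn.Conv2d(3, dim, 7, stride=4, padding=3), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.fusion = CrossDomainAttentionFusion(dim)
        self.head = nn.Sequential(               # regresses 3D translation +
            nn.AdaptiveAvgPool2d(1),             # axis-angle rotation residual
            nn.Flatten(),
            nn.Linear(dim, 6),
        )

    def forward(self, prev_frame: torch.Tensor, cur_frame: torch.Tensor) -> torch.Tensor:
        fused = self.fusion(self.backbone(cur_frame), self.backbone(prev_frame))
        return self.head(fused)                  # (B, 6) inter-frame pose residual


if __name__ == "__main__":
    net = ResidualPoseNet()
    prev = torch.randn(1, 3, 128, 128)
    cur = torch.randn(1, 3, 128, 128)
    print(net(prev, cur).shape)                  # torch.Size([1, 6])
```

At run time, the predicted residual would be composed with the previous frame's pose estimate to track the object through the sequence; the embedded refinement module described above would additionally feed the deviation of earlier estimates back into this regression, which is omitted from the sketch.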


