Feature enhancement and coarse-to-fine detection for RGB-D tracking


Pattern Recognition Letters (IF 5.1), Pub Date: 2024-02-10, DOI: 10.1016/j.patrec.2024.02.007
Xue-Feng Zhu, Tianyang Xu, Xiao-Jun Wu, Josef Kittler

Existing RGB-D tracking algorithms improve performance by building standard appearance models borrowed from RGB-only tracking frameworks, and make no attempt to exploit the complementary visual information in the multi-modal input. This paper addresses this deficit and presents a novel algorithm that boosts RGB-D tracking performance by taking advantage of collaborative clues. To guarantee input consistency, depth images are encoded into the three-channel HHA representation, giving them a structure similar to that of RGB images so that deep CNN features can be extracted from both modalities. To highlight the discriminative information in the multi-modal features, a feature enhancement module based on a cross-attention strategy is proposed. The attention map produced by this cross-attention method enhances the target area of the features and suppresses the negative influence of the background. In addition, a long-term mechanism is introduced to handle potential tracking failure. Experimental results on the well-known benchmark datasets PTB, STC, and CDTB demonstrate the superiority of the proposed RGB-D tracker. On PTB, the proposed method achieves the highest AUC scores among the compared trackers across scenarios with five distinct challenging attributes. On STC and CDTB, the proposed tracker, FECD, obtains an overall AUC of 0.630 and an F-score of 0.630, respectively.
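The abstract does not specify how the cross-attention feature enhancement module is built, so the following is only a generic illustration of the idea, not the authors' implementation. It assumes RGB and HHA feature maps of identical shape `(C, H, W)`, uses plain scaled dot-product attention with RGB features as queries and HHA features as keys and values, and adds the attended result back to the RGB features. The function name `cross_attention_enhance` and all shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_enhance(rgb_feat, hha_feat):
    """Hypothetical sketch: enhance RGB features with HHA (depth) features.

    rgb_feat, hha_feat: arrays of shape (C, H, W), assumed to come from
    the same kind of CNN backbone so that their shapes match.
    Returns the enhanced features (C, H, W) and the attention map (HW, HW).
    """
    c, h, w = rgb_feat.shape
    q = rgb_feat.reshape(c, h * w).T  # (HW, C) queries from RGB
    k = hha_feat.reshape(c, h * w).T  # (HW, C) keys from HHA
    v = k                             # values also taken from HHA

    # Each RGB location attends over all HHA locations.
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # (HW, HW)

    # Residual connection: original RGB features plus attended HHA features.
    enhanced = q + attn @ v                        # (HW, C)
    return enhanced.T.reshape(c, h, w), attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.standard_normal((8, 4, 5))
    hha = rng.standard_normal((8, 4, 5))
    enhanced, attn = cross_attention_enhance(rgb, hha)
    print(enhanced.shape, attn.shape)
```

In this sketch each row of `attn` is a probability distribution over HHA locations. A tracker of the kind described would presumably learn projection weights for the queries, keys, and values, which are omitted here for brevity.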


Updated: 2024-02-10
