FSODv2: A Deep Calibrated Few-Shot Object Detection Network, International Journal of Computer Vision



FSODv2: A Deep Calibrated Few-Shot Object Detection Network
International Journal of Computer Vision (IF 19.5), Pub Date: 2024-04-04, DOI: 10.1007/s11263-024-02049-z
Qi Fan, Wei Zhuo, Chi-Keung Tang, Yu-Wing Tai

Traditional methods for object detection typically require a substantial amount of training data, and creating high-quality training data is time-consuming. In this paper we propose a novel Few-Shot Object Detection network (FSODv2) that aims to detect objects from previously unseen categories using only a few annotated examples. An Attention RPN, a Multi-Relation Detector, and a Contrastive Training strategy are central to our method (Fan et al., in: CVPR, 2020); they exploit the similarity between the few-shot support set and the query set to detect novel objects while suppressing false detections in the background. We also contribute a new dataset, FSOD-1k, which contains 1000 categories of various objects with high-quality annotations, to train our network. To the best of our knowledge, this is one of the first datasets designed for few-shot object detection. This paper improves our FSOD model through well-designed model calibration in three areas: (1) we propose an improved FPN with multi-scale support inputs, which calibrates multi-scale support-query feature matching by exploiting multi-scale features from the same support image at different input scales; (2) we introduce a support classification supervision branch to calibrate the support feature supervision, aligning it with the query feature training supervision; (3) we propose backbone calibration, which preserves prior knowledge while alleviating backbone bias toward base classes by using a classification dataset in the calibration procedure, whereas related works have used such datasets only for pre-training. In addition, we propose a Fast Attention RPN to improve evaluation speed and reduce memory use during inference. Once trained, our few-shot network can detect objects from previously unseen categories without further training or fine-tuning, achieving new state-of-the-art performance on different datasets in the few-shot setting. Our method is general in scope and has numerous potential applications. The dataset link is https://github.com/fanq15/Few-Shot-Object-Detection-Dataset.
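
The abstract does not spell out how support and query features are compared. As a rough illustration only, the sketch below assumes one common way to do it: the support feature map is averaged into a single vector per channel and then used to reweight or score every location of the query feature map. The function names, array shapes, and the choice of cosine similarity are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def support_attention(query_feat, support_feat):
    """Reweight a query feature map, channel by channel, with a pooled support feature.

    query_feat:   array of shape (C, H, W), features of the query image
    support_feat: array of shape (C, h, w), features of one support crop
    Returns an array of shape (C, H, W).
    """
    # Assumption: the support feature is collapsed to one value per channel.
    support_vec = support_feat.mean(axis=(1, 2))  # shape (C,)
    # A depth-wise correlation with a 1x1 kernel reduces to per-channel scaling.
    return query_feat * support_vec[:, None, None]

def similarity_map(query_feat, support_feat):
    """Cosine similarity between each query location and the pooled support vector."""
    support_vec = support_feat.mean(axis=(1, 2))  # shape (C,)
    # Dot product over channels at every spatial location.
    num = np.einsum("chw,c->hw", query_feat, support_vec)
    # Small epsilon avoids division by zero at all-zero locations.
    denom = np.linalg.norm(query_feat, axis=0) * np.linalg.norm(support_vec) + 1e-8
    return num / denom  # shape (H, W)
```

In a sketch like this, locations of the query image whose features resemble the support example receive high scores, which is the kind of signal a region proposal stage could use to favor candidate boxes for the novel category and down-weight background.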


Updated: 2024-04-05
