Class-Relation Reasoning with Knowledge-Transfer for Few-Shot Object Detection
IEEJ Transactions on Electrical and Electronic Engineering (IF 1) Pub Date: 2024-03-04, DOI: 10.1002/tee.24037
Xin Feng, Zhixian Zhang, Junjie Wang, Siping Wang, Xiaoning Jiao

The Few-Shot Object Detection (FSOD) task involves accurately identifying target object classes using only a small set of labeled samples. Most current FSOD methods predict class prototype features independently, without considering class relationships, and rely only on visual information. To address these challenges, we propose a novel Class-relational Reasoning Method with Knowledge-transfer (CRK-Net), built on a meta-learning-based framework. Although data may be scarce, the semantic relationships between classes are invariant. A Joint-feature Fusion Module (JFM) is hence proposed to transfer the semantic information of different categories from the natural language domain and integrate it with visual information to produce multi-modality embeddings. Some base classes and novel classes have similar features, which can be borrowed by modeling the relationships between class features. Building upon this observation, we propose a Class-relational Reasoning Module (CRM) to establish correlations between categories and enhance the prototype representation of each category. After passing through the JFM and CRM, a high-quality class prototype is finally produced for subsequent regression and classification. Extensive experiments on PASCAL VOC demonstrate the effectiveness of our proposed method and provide a new scheme for fusing semantic and visual information. © 2024 Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.
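
The abstract does not give the internals of the JFM or CRM, so the following is only a rough illustration of the two ideas it names: fusing a semantic (word) embedding with a visual prototype, then refining each class prototype using its similarity to the other classes. Everything here is an assumption made for illustration and is not the authors' implementation: the function names `fuse` and `relate`, the mixing weight `alpha`, the `temperature`, the use of cosine similarity with a softmax, and the simplification that visual and semantic vectors already share one dimension.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(visual, semantic, alpha=0.5):
    """Hypothetical fusion step: convex mix of normalized visual and semantic vectors.

    Assumes both vectors already have the same dimension; a real model would
    typically learn a projection between the two spaces.
    """
    v = l2_normalize(visual)
    s = l2_normalize(semantic)
    return [alpha * a + (1 - alpha) * b for a, b in zip(v, s)]


def relate(prototypes, temperature=1.0):
    """Hypothetical relation step: similarity-weighted averaging of prototypes.

    Each class prototype is replaced by a weighted average of all prototypes,
    with weights given by a softmax over cosine similarities.
    """
    names = list(prototypes)
    normed = {n: l2_normalize(prototypes[n]) for n in names}
    refined = {}
    for n in names:
        sims = [
            sum(a * b for a, b in zip(normed[n], normed[m])) / temperature
            for m in names
        ]
        weights = softmax(sims)
        dim = len(prototypes[n])
        refined[n] = [
            sum(w * prototypes[m][i] for w, m in zip(weights, names))
            for i in range(dim)
        ]
    return refined


# Toy example with made-up 2-D vectors for two classes.
fused = {
    "cat": fuse([3.0, 4.0], [0.0, 5.0]),
    "dog": fuse([4.0, 3.0], [5.0, 0.0]),
}
refined = relate(fused)
```

In this toy setup, a class with few samples ends up with a prototype pulled toward the classes it most resembles, which is the intuition the abstract describes for borrowing features between base and novel classes.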

Updated: 2024-03-04