CoBjeason: Reasoning Covered Object in Image by Multi-Agent Collaboration Based on Informed Knowledge Graph
ACM Transactions on Knowledge Discovery from Data (IF 3.6), Pub Date: 2024-02-28, DOI: 10.1145/3643565
Huan Rong, Minfeng Qian, Tinghuai Ma, Di Jin, Victor S. Sheng

Object detection has been widely studied. In this paper, we turn to a more challenging problem, “Covered Object Reasoning”: inferring the category label of a target object in a given image when that object is totally covered (or invisible). To address this problem, we propose CoBjeason, which combines visual reasoning with a knowledge graph: “empirical cognition” of common visual contexts is encoded as a knowledge graph, over which two collaborative agents conduct reinforced multi-hop reasoning. On the one hand, the two agents stand at the covered object (the unknown entity), observe the surrounding visual cues in the given image, and gradually select entities and relations from the global gallery-level knowledge graph, which contains entity pairs that occur frequently across the entire image collection, so as to infer the main structure of the image-level knowledge graph expanded forward from the unknown entity. On the other hand, based on the reasoned image-level knowledge graph, the semantic context among entities is aggregated backward into the unknown entity in order to select an appropriate entity from the global gallery-level knowledge graph as the reasoning result. Moreover, the two agents collaborate with each other, ensuring that Forward and Backward Reasoning move toward the same goal of higher performance on covered object reasoning. To the best of our knowledge, this is the first work on Covered Object Reasoning with knowledge graphs and reinforced multi-agent collaboration.
In particular, our study of Covered Object Reasoning and the proposed CoBjeason model could offer novel insights into more basic Computer Vision (CV) tasks: Semantic Segmentation, through a better understanding of the current scene when some objects are blurred or covered; Visual Question Answering, through enhanced inference in more complicated visual contexts when some objects are covered or invisible; and Image Caption Generation, through richer visual context for images containing partially visible objects. Improvements on these basic CV tasks can in turn benefit more complicated tasks that involve nuanced visual interpretation, such as Autonomous Driving, where recognizing and reasoning about partially visible or covered objects is critical. According to the experimental results, CoBjeason achieves the best overall ranking performance on covered object reasoning among the compared models, while also offering a lower “exploration cost”, insensitivity to long-tail covered objects, and acceptable time complexity.
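
To make the forward and backward idea in the abstract more concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a gallery-level graph can be approximated by label co-occurrence counts, performs only a single hop, and uses no agents or reinforcement learning. All names (`build_gallery_graph`, `rank_covered_object`, the `GALLERY` data) are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy image collection: each inner list holds the object
# labels annotated in one image. This data is invented for illustration.
GALLERY = [
    ["table", "chair", "cup"],
    ["table", "cup", "laptop"],
    ["table", "chair", "plate"],
    ["road", "car", "traffic light"],
    ["road", "car", "pedestrian"],
]


def build_gallery_graph(image_labels):
    """Count how often each unordered label pair co-occurs in one image.

    Assumption: these counts stand in for the gallery-level knowledge graph.
    """
    pair_counts = Counter()
    for labels in image_labels:
        for a, b in combinations(sorted(set(labels)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def cooccurrence(pair_counts, a, b):
    """Look up the co-occurrence count of two labels, in either order."""
    return pair_counts.get(tuple(sorted((a, b))), 0)


def rank_covered_object(pair_counts, visible, candidates):
    """Rank candidate labels for the covered object.

    Forward step (simplified): link each candidate to the visible cues
    through gallery-level edges. Backward step (simplified): sum those
    edge weights into one score per candidate. Ties break alphabetically.
    """
    scores = {
        c: sum(cooccurrence(pair_counts, c, v) for v in visible)
        for c in candidates
        if c not in visible
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


graph = build_gallery_graph(GALLERY)
all_labels = sorted({label for labels in GALLERY for label in labels})
ranking = rank_covered_object(graph, ["table", "chair"], all_labels)
print(ranking[:3])
```

With a table and a chair visible, this toy ranking places indoor tabletop objects ahead of street objects, which mirrors the intuition that frequently co-occurring entity pairs carry evidence about an unseen entity. The paper's actual method replaces this fixed scoring with learned, multi-hop selection by two collaborating agents.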




Updated: 2024-03-01