Learning to Detect Attended Objects in Cultural Sites with Gaze Signals and Weak Object Supervision, ACM Journal on Computing and Cultural Heritage



ACM Journal on Computing and Cultural Heritage (IF 2.4), Pub Date: 2024-04-23, DOI: 10.1145/3647999
Michele Mazzamuto, Francesco Ragusa, Antonino Furnari, Giovanni Maria Farinella


Cultural sites such as museums and monuments are popular tourist destinations worldwide. Visitors come to these places to learn about the cultures, histories, and arts of a particular region or country. However, for many cultural sites, traditional visiting approaches are limited and may fail to engage visitors. To enhance visitors’ experiences, previous works have explored how wearable devices can be exploited in this context. Among the many functions that these devices can offer, understanding which artwork or detail the user is attending to is fundamental to provide additional information on the observed artworks, understand the visitor’s tastes, and provide recommendations. This motivates the development of algorithms for understanding visitor attention from egocentric images. We consider the attended object detection task, which involves detecting and recognizing the object observed by the camera wearer from an input RGB image and gaze signals. To study the problem, we collect a dataset of egocentric images acquired by subjects visiting a museum. Since collecting and labeling data in cultural sites for real applications is time-consuming, we present a study comparing unsupervised, weakly supervised, and fully supervised approaches for attended object detection. We evaluate the considered approaches on the collected dataset, also assessing the impact of training models on external datasets such as COCO and EGO-CH. The experiments show that weakly supervised approaches requiring only a 2D point label related to the gaze can be an effective alternative to fully supervised approaches for attended object detection. To encourage research on the topic, we publicly release the code and the dataset at the following URL: https://iplab.dmi.unict.it/EGO-CH-Gaze/.
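
To make the task definition concrete, the sketch below shows one simple way to combine object detections with a 2D gaze point: pick the detected box that contains the gaze. This is an illustrative baseline only, not the method evaluated in the paper. The `Detection` type, the `attended_object` function, the box layout (x_min, y_min, x_max, y_max in pixels), and the tie-breaking rule (smallest enclosing box, then highest score) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: x_min, y_min, x_max, y_max


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output: class label, bounding box, confidence."""
    label: str
    box: Box
    score: float


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; zero if the box is degenerate."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def contains(box: Box, point: Tuple[float, float]) -> bool:
    """True if the 2D point lies inside the box (borders included)."""
    x_min, y_min, x_max, y_max = box
    x, y = point
    return x_min <= x <= x_max and y_min <= y <= y_max


def attended_object(
    detections: Sequence[Detection], gaze: Tuple[float, float]
) -> Optional[Detection]:
    """Return the detection assumed to be attended, or None.

    Assumption: among the boxes containing the gaze point, the smallest
    one is the most specific (e.g. a detail inside a painting); ties are
    broken by the higher detector score.
    """
    hits = [d for d in detections if contains(d.box, gaze)]
    if not hits:
        return None
    return min(hits, key=lambda d: (box_area(d.box), -d.score))
```

Preferring the smallest enclosing box is a design choice that favors details over the artworks that contain them; a real system would also need to handle gaze noise, for example by accepting boxes within some distance of the gaze point.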


Updated: 2024-04-23
