ORKG-Leaderboards: a systematic workflow for mining leaderboards as a knowledge graph
International Journal on Digital Libraries, Pub Date: 2023-06-15, DOI: 10.1007/s00799-023-00366-1
Salomon Kabongo, Jennifer D’Souza, Sören Auer

This work describes the orkg-Leaderboards software, designed to automatically extract leaderboards, defined as task–dataset–metric tuples, from large collections of empirical research papers in artificial intelligence (AI). The software supports both main workflows of scholarly publishing, viz. LaTeX files and PDF files. Furthermore, the system is integrated with the Open Research Knowledge Graph (ORKG) platform, which fosters the machine-actionable publishing of scholarly findings. Thus, the system’s output, when integrated within the ORKG’s supported Semantic Web infrastructure for representing machine-actionable ‘resources’ on the Web, enables: (1) broadly, the integration of empirical results of researchers across the world, thus enabling transparency in empirical research, with the potential to also be complete, contingent on the underlying data source(s) of publications; and (2) specifically, researchers to track the progress in AI with an overview of the state of the art across the most common AI tasks and their corresponding datasets via dynamic ORKG frontend views leveraging tables and visualization charts over the machine-actionable data. Our best model achieves above 90% F1 on the leaderboard extraction task, indicating that orkg-Leaderboards is a practically viable tool for real-world usage. Going forward, orkg-Leaderboards in a sense transforms the leaderboard extraction task, which has long been a crowdsourced endeavor in the community, into an automated digitalization task.
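To make the task–dataset–metric formulation concrete, here is a minimal Python sketch of how a leaderboard tuple and an F1 score over extracted tuples could be represented. It is an illustration only: the abstract does not specify the evaluation protocol, so the set-based F1 below is an assumption, and the names `TDM` and `f1_score` as well as the example tuples are hypothetical rather than taken from the orkg-Leaderboards code or data.

```python
from typing import NamedTuple, Set


class TDM(NamedTuple):
    """One leaderboard entry: a (task, dataset, metric) tuple (hypothetical type)."""
    task: str
    dataset: str
    metric: str


def f1_score(predicted: Set[TDM], gold: Set[TDM]) -> float:
    """Set-based F1 between predicted and gold tuples (assumed protocol)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative tuples only; not drawn from the paper's dataset.
gold = {
    TDM("Question Answering", "SQuAD 1.1", "F1"),
    TDM("Question Answering", "SQuAD 1.1", "Exact Match"),
}
predicted = {
    TDM("Question Answering", "SQuAD 1.1", "F1"),
    TDM("Machine Translation", "WMT14 En-De", "BLEU"),
}

score = f1_score(predicted, gold)
print(f"F1 = {score:.2f}")
```

In this toy case one of two predicted tuples matches one of two gold tuples, so precision and recall are both 0.5. Exact tuple matching is the simplest choice; a real evaluation might normalize task or dataset names first.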




Updated: 2023-06-19