metaGraphos: a Web-based system for transcribing, proofreading and publishing scanned documents
Collection and Curation, Pub Date: 2023-04-10, DOI: 10.1108/cc-01-2023-0002
Evagelos Varthis, Marios Poulos

Purpose

This study aims to present metaGraphos, a crowdsourcing system that aids in the transcription and semantic enhancement of scanned documents by using a pool of volunteers or people willing to participate in exchange for a financial reward.

Design/methodology/approach

metaGraphos can be used in circumstances where optical character recognition fails to produce satisfactory results, where semantic tagging or the assignment of thematic headings to texts is considered necessary, or where ground-truth data has to be collected in raw form.

Findings

The system automatically provides a Web-based interface comprising a static HTML page and JavaScript code that displays the scanned images of the document, coupled with the corresponding incomplete texts side by side, allowing users to correct or complete the texts in parallel.
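
As an illustration only, the side-by-side layout described above can be sketched as a small generator that emits a static HTML page pairing each scanned image with an editable draft text. This is not the authors' implementation: the function name `build_page`, the element ids and the inline styles are all hypothetical, and the sketch leaves out the JavaScript that would save the corrections.

```python
from html import escape


def build_page(title, pages):
    """Return a static HTML page that shows each scan next to its draft text.

    `pages` is a list of (image_url, draft_text) pairs. All names and the
    layout here are illustrative assumptions, not the metaGraphos code.
    """
    rows = []
    for number, (image_url, draft_text) in enumerate(pages, start=1):
        rows.append(
            '<div class="row" style="display:flex;gap:1em">\n'
            # Left half: the scanned page image.
            f'  <img src="{escape(image_url, quote=True)}" '
            f'alt="Scan {number}" style="width:50%">\n'
            # Right half: the incomplete text, editable by the volunteer.
            f'  <textarea id="text-{number}" style="width:50%">'
            f'{escape(draft_text)}</textarea>\n'
            '</div>'
        )
    body = "\n".join(rows)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )


if __name__ == "__main__":
    demo = [("scans/page-001.png", "Incomplete OCR tex"), ("scans/page-002.png", "")]
    print(build_page("Demo transcription", demo))
```

Escaping the draft text matters here because OCR output may contain characters such as `<` or `&` that would otherwise break the page.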

Social implications

By supporting the parallel transcription and semantic enhancement of difficult scanned documents, the system helps reveal hidden cultural wealth and aids knowledge dissemination, which contributes significantly to academic and scientific dialog and feedback.

Originality/value

Individual researchers, libraries and organizations in general may benefit from the system because it is a cost-effective, practical and simple-to-set-up client–server architecture that provides a reliable way to transcribe texts or revise transcriptions on a large scale.



Updated: 2023-04-10