A holistic approach for image-to-graph: application to optical music recognition
International Journal on Document Analysis and Recognition (IF 2.3), Pub Date: 2022-09-16, DOI: 10.1007/s10032-022-00417-4
Carlos Garrido-Munoz, Antonio Rios-Vila, Jorge Calvo-Zaragoza

A number of applications would benefit from neural approaches that can generate graphs from images in an end-to-end fashion. One such field is optical music recognition (OMR), which focuses on the computational reading of music notation from document images. Given that music notation can be expressed as a graph, an image-to-graph approach is a promising solution for OMR. In this work, we propose a new neural architecture that retrieves a particular representation of a graph, identified by a specific order of its vertices, in an end-to-end manner. The architecture produces a double output: it sequentially predicts the categories of the vertices, along with the edges between each pair of vertices. The experiments demonstrate the effectiveness of our proposal in retrieving graph structures from excerpts of handwritten music notation. Our results also show that certain design decisions, such as the choice of graph representation, play a fundamental role in the performance of this approach.
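To make the double output concrete, the following minimal sketch shows how a per-vertex category prediction and a pairwise edge prediction could be combined into a graph. It is an illustration only, not the authors' architecture: the function name `decode_graph`, the category names, the probability values, the 0.5 threshold, and the treatment of edges as undirected are all assumptions introduced here.

```python
from typing import List, Set, Tuple


def decode_graph(
    vertex_probs: List[List[float]],
    edge_probs: List[List[float]],
    categories: List[str],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Tuple[List[str], Set[Tuple[int, int]]]:
    """Combine the two outputs into (ordered vertex labels, edge set).

    vertex_probs[i][c] is the score of category c for the i-th vertex,
    in the fixed vertex order that identifies the graph representation.
    edge_probs[i][j] is the score of an edge between vertices i and j.
    """
    # First output: pick the most likely category for each vertex, in order.
    labels = [
        categories[max(range(len(row)), key=row.__getitem__)]
        for row in vertex_probs
    ]

    # Second output: keep every vertex pair whose edge score passes the
    # threshold. Edges are treated as undirected here (an assumption),
    # so only pairs with i < j are inspected.
    n = len(labels)
    edges = {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if edge_probs[i][j] >= threshold
    }
    return labels, edges


# Hypothetical scores for three music-notation primitives.
categories = ["notehead", "stem", "beam"]
vertex_probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.20, 0.10, 0.70],
]
edge_probs = [
    [0.0, 0.9, 0.2],
    [0.9, 0.0, 0.7],
    [0.2, 0.7, 0.0],
]

labels, edges = decode_graph(vertex_probs, edge_probs, categories)
print(labels)
print(sorted(edges))
```

Because edges are stored as index pairs, the same graph yields a different label sequence and edge set under a different vertex ordering, which is why the choice of graph representation matters for a model trained to emit one specific ordering.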



Updated: 2022-09-16