SoftCTC—semi-supervised learning for text recognition using soft pseudo-labels,International Journal on Document Analysis and Recognition


International Journal on Document Analysis and Recognition (IF 2.3), Pub Date: 2023-10-06, DOI: 10.1007/s10032-023-00452-9
Martin Kišš, Michal Hradiš, Karel Beneš, Petr Buchal, Michal Kula

This paper explores semi-supervised training for sequence tasks such as optical character recognition and automatic speech recognition. We propose a novel loss function, SoftCTC, an extension of CTC that allows multiple transcription variants to be considered at the same time. This makes it possible to omit the confidence-based filtering step, which is otherwise a crucial component of pseudo-labeling approaches to semi-supervised learning. We demonstrate the effectiveness of our method on a challenging handwriting recognition task and conclude that SoftCTC matches the performance of a finely tuned filtering-based pipeline. We also evaluate SoftCTC in terms of computational efficiency, concluding that it is significantly more efficient than a naïve CTC-based approach to training on multiple transcription variants, and we make our GPU implementation public.
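
To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of what a naïve CTC-based treatment of multiple transcription variants could look like: each variant is scored separately with the standard CTC forward algorithm and the weighted probabilities are summed before taking the negative log. This is an illustration only, not the authors' SoftCTC implementation; the function names `ctc_prob` and `naive_multi_variant_loss`, the per-variant weights, and the choice to marginalize over variants are all assumptions made for the example.

```python
import math


def ctc_prob(probs, target, blank=0):
    """Probability of `target` under CTC (standard forward algorithm).

    probs[t][c] is the per-frame probability of symbol c at time step t.
    Hypothetical helper written for illustration.
    """
    # Interleave blanks around the labels: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for c in target:
        ext += [c, blank]
    size = len(ext)

    # Initialization: a path may start in the first blank or the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A valid path ends in the last label or the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


def naive_multi_variant_loss(probs, variants):
    """Negative log of the weighted sum of per-variant CTC probabilities.

    `variants` is a list of (target, weight) pairs (an assumed format).
    The cost grows linearly with the number of variants, because every
    variant needs its own forward pass.
    """
    total = sum(weight * ctc_prob(probs, target) for target, weight in variants)
    return -math.log(total)


# Two frames, three symbols (0 = blank, 1 and 2 are labels), uniform outputs.
uniform = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]
loss = naive_multi_variant_loss(uniform, [([1], 0.5), ([2], 0.5)])
```

Because each variant requires a separate forward pass, the cost of this sketch scales with the number of variants, which is the kind of overhead the efficiency comparison in the abstract refers to.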


Updated: 2023-10-06
