Optimizing Two-Pass Cross-Lingual Transfer Learning: Phoneme Recognition and Phoneme to Grapheme Translation
arXiv - CS - Sound, Pub Date: 2023-12-06, DOI: arxiv-2312.03312, Authors: Wonjun Lee, Gary Geunbae Lee, Yunsu Kim
This research optimizes two-pass cross-lingual transfer learning for
low-resource languages by improving both stages of the pipeline: the phoneme
recognition model and the phoneme-to-grapheme translation model. To improve
phoneme vocabulary coverage, we merge phonemes that share articulatory
characteristics, which improves recognition accuracy. We also introduce a
global phoneme noise generator that produces realistic ASR noise during
phoneme-to-grapheme training in order to reduce error propagation.
Experiments on the CommonVoice 12.0 dataset show significant reductions in
Word Error Rate (WER) for low-resource languages, highlighting the
effectiveness of the approach. This research contributes to the advancement
of two-pass ASR systems for low-resource languages and offers the potential
for improved cross-lingual transfer learning.
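The abstract names two techniques without giving their details, so the following is only a minimal sketch of the general idea, not the authors' implementation. The merge table `MERGE_MAP`, the functions `merge_phonemes` and `add_phoneme_noise`, the uniform choice between substitution, insertion, and deletion, and the `rate` parameter are all hypothetical choices made for illustration.

```python
import random

# Hypothetical merge table (an assumption, not taken from the paper):
# each rarer phoneme maps to a more common one with similar articulation.
MERGE_MAP = {
    "ɾ": "r",  # alveolar tap -> alveolar trill
    "ɕ": "ʃ",  # alveolo-palatal fricative -> postalveolar fricative
    "ɐ": "a",  # near-open central vowel -> open vowel
}


def merge_phonemes(seq, merge_map=MERGE_MAP):
    """Replace each phoneme by its merge target, if it has one."""
    return [merge_map.get(p, p) for p in seq]


def add_phoneme_noise(seq, vocab, rate=0.1, rng=None):
    """Corrupt a phoneme sequence with ASR-like errors.

    Each phoneme is hit with probability `rate`; a hit phoneme is
    substituted, deleted, or followed by an inserted phoneme. The
    uniform choice among the three operations is an assumption.
    """
    rng = rng or random.Random()
    noisy = []
    for p in seq:
        if rng.random() >= rate:
            noisy.append(p)  # keep the phoneme unchanged
            continue
        op = rng.choice(("substitute", "delete", "insert"))
        if op == "substitute":
            noisy.append(rng.choice(vocab))
        elif op == "insert":
            noisy.append(p)
            noisy.append(rng.choice(vocab))
        # "delete": append nothing
    return noisy


if __name__ == "__main__":
    clean = merge_phonemes(["ɕ", "a", "ɾ", "a"])
    vocab = sorted(set(clean))
    print(clean)
    print(add_phoneme_noise(clean, vocab, rate=0.3, rng=random.Random(0)))
```

In a two-pass setup, the noisy sequences would serve as inputs to the phoneme-to-grapheme model during training, paired with the clean target text, so that the second pass sees errors resembling those the first pass makes at inference time.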
Updated: 2023-12-07