CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition
arXiv - CS - Sound. Pub Date: 2024-01-04. DOI: arxiv-2401.02046
Junfeng Hou, Peiyao Wang, Jincheng Zhang, Meng Yang, Minwei Feng, Jingcheng Yin
Deploying end-to-end speech recognition models with limited computing
resources remains challenging, despite their impressive performance. Given the
gradual increase in model size and the wide range of model applications,
selectively executing model components for different inputs to improve the
inference efficiency is of great interest. In this paper, we propose a dynamic
layer-skipping method that leverages the CTC blank output from intermediate
layers to trigger the skipping of the last few encoder layers for frames with
high blank probabilities. Furthermore, we factorize the CTC output distribution
and perform knowledge distillation on intermediate layers to reduce computation
and improve recognition accuracy. Experimental results show that by utilizing
the CTC blank, the encoder layer depth can be adjusted dynamically, resulting
in a 29% acceleration of CTC model inference with minor performance
degradation.
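
The skipping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy frame-wise layers, the sigmoid blank head (`blank_w`, `blank_b`), the `trigger_layer` index, and the `threshold` value are all assumptions made for illustration. A real encoder would use Transformer or Conformer layers, where frames interact through attention and skipping needs more care than shown here.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def make_layer(rng, dim):
    """Toy stand-in for an encoder layer: a frame-wise residual projection.

    Assumption: each frame is processed independently, unlike real
    attention-based encoder layers.
    """
    w = rng.normal(scale=0.1, size=(dim, dim))
    return lambda x: x + np.tanh(x @ w)


def encode_with_blank_skipping(frames, layers, blank_w, blank_b,
                               trigger_layer, threshold):
    """Run every frame through layers[:trigger_layer], then run only the
    frames with a low intermediate blank probability through the rest.

    Returns (hidden, skipped), where `skipped` is a boolean mask over frames.
    """
    hidden = frames
    for layer in layers[:trigger_layer]:
        hidden = layer(hidden)

    # Hypothetical binary blank head on the intermediate representation.
    blank_prob = sigmoid(hidden @ blank_w + blank_b)  # shape: (num_frames,)
    skipped = blank_prob > threshold
    keep = ~skipped

    # Only the non-skipped frames pay for the remaining layers.
    kept = hidden[keep]
    for layer in layers[trigger_layer:]:
        kept = layer(kept)

    out = hidden.copy()
    out[keep] = kept
    return out, skipped


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, num_frames = 8, 20
    layers = [make_layer(rng, dim) for _ in range(6)]
    frames = rng.normal(size=(num_frames, dim))
    blank_w = rng.normal(size=dim)

    out, skipped = encode_with_blank_skipping(
        frames, layers, blank_w, 0.0, trigger_layer=4, threshold=0.5)
    print(f"skipped {skipped.sum()} of {num_frames} frames in the last 2 layers")
```

In this sketch the saving comes from the shrinking batch of frames after the trigger layer: the more frames the blank head flags, the fewer rows the last layers process. Raising `threshold` skips fewer frames and trades speed for closeness to the full-depth output.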
Updated: 2024-01-06