An Approach for Automated Kannada Subtitle Generation from Kannada Video, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems (IF 1.5). Pub Date: 2023-05-19. DOI: 10.1142/s0218488523400068
Santosh¹,², L. M. Jenila Livingston¹

This paper presents an automated Kannada subtitle generator for Kannada video, implemented to assist people with hearing impairments in watching videos. Subtitle generation has become an important task for supporting such viewers, and the system integrates an audio extraction module and a speech recognition module. The proposed technique is implemented in three phases: extracting audio from video, speech recognition, and subtitle generation. The adaptive speech recognition module uses AMFCC for feature extraction, as an alternative to the most commonly used FFT. The Hankel transform is similar to the FFT but, unlike the FFT, includes no elementary particles. In addition, the acoustic model in the decoder is an Adaptive Hidden Markov Model that uses the Baum-Welch algorithm instead of the Viterbi algorithm, to reduce computational time and memory usage. The text file produced by the speech recognition module is rendered and synchronized with the video, correcting the missing offset, using parallel processing and by defining the start time, the end time, and the delay time. In the experiments, the proposed technique achieves 98.4% accuracy, the best result among the techniques compared, and 3.8% better accuracy than the existing techniques, i.e. MFCC, DNN, and CNN.
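The synchronization step, which defines a start time, an end time, and a delay time for the recognized text, can be illustrated with a minimal sketch. The abstract does not name a subtitle file format or any implementation details, so everything below is an assumption made for illustration: the SubRip (SRT) output format, the `Cue` class, the `format_timestamp` and `to_srt` helpers, and the choice to apply one constant delay to every cue.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One recognized text segment (hypothetical structure, not from the paper)."""
    start: float  # start time in seconds, relative to the extracted audio
    end: float    # end time in seconds
    text: str     # recognized Kannada text


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = max(0, round(seconds * 1000))  # clamp negative times to zero
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues, delay=0.0):
    """Render cues as SRT text, shifting every cue by `delay` seconds.

    Assumption: the delay is a single constant offset between the
    recognized text and the video.
    """
    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_timestamp(cue.start + delay)
        end = format_timestamp(cue.end + delay)
        blocks.append(f"{index}\n{start} --> {end}\n{cue.text}\n")
    return "\n".join(blocks)
```

For example, `to_srt([Cue(0.0, 1.5, "ನಮಸ್ಕಾರ")], delay=0.25)` yields a single cue running from `00:00:00,250` to `00:00:01,750`. A constant delay is the simplest possible model; a real system might estimate a separate offset per segment.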

Updated: 2023-05-23