Sign language detection using convolutional neural network
Journal of Ambient Intelligence and Humanized Computing (IF 3.662), Pub Date: 2024-03-26, DOI: 10.1007/s12652-024-04761-7
Pranati Rakshit, Sarbajeet Paul, Shruti Dey

Sign language recognition is an important social problem; addressing it can benefit the deaf and hard of hearing community by making communication easier and faster. Some previous studies on sign language recognition have used complex input modalities and feature extraction methods, limiting their practical applicability. This research compares two custom-made convolutional neural network (CNN) models for recognizing the American Sign Language (ASL) letters from A to Z, that is, 26 distinct hand signs representing the 26 letters of the English alphabet, and determines which model performs better. The proposed models combine a CNN with a Softmax activation function, both of which are powerful and widely used for classification in computer vision. The study found that Model_2 had better overall performance than Model_1, reaching an accuracy of 98.44% and an F1 score of 98.41%. However, the performance of each model varied by label, suggesting that the choice of model may depend on the specific use case and the labels of interest. This research contributes to the growing field of sign language recognition using deep learning techniques and highlights the importance of designing custom models.
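
To make the evaluation terms in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of how a Softmax output over 26 letter classes can be turned into a prediction and how accuracy and per-label F1 can be computed. All function names are hypothetical, and macro averaging of F1 is an assumption: the abstract does not say which averaging the reported F1 score uses.

```python
import math
import string

# 26 classes, one per letter A to Z (label order is an assumption).
LABELS = list(string.ascii_uppercase)


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits, labels=LABELS):
    """Return the label with the highest Softmax probability."""
    probs = softmax(logits)
    return labels[probs.index(max(probs))]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_label_f1(y_true, y_pred, labels=LABELS):
    """F1 score for each label, computed one-vs-rest."""
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-label F1 scores (assumed averaging)."""
    scores = per_label_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(scores)
```

Comparing the output of `per_label_f1` for two models label by label is one way to see the effect the abstract describes, where the model with the better overall score is not necessarily the better one for every letter.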




Updated: 2024-03-26