Real-time attention-based embedded LSTM for dynamic sign language recognition on edge devices

  • Research
  • Published in: Journal of Real-Time Image Processing

Abstract

Sign language recognition aims to identify meaningful hand-gesture movements and offers an important avenue for intelligent communication with communities affected by speech and hearing impairments. Understanding dynamic sign language from video data, however, remains a challenging task in hand gesture recognition, and real-time gesture recognition on low-power edge devices with limited resources has become a topic of active research. This work therefore presents a memory-efficient deep-learning pipeline for identifying dynamic sign language on embedded devices. Specifically, we recover hand-posture information to obtain a more discriminative 3D key-point representation, and these features are fed into the proposed attention-based embedded long short-term memory (LSTM) network. In addition, an Indian Sign Language dataset for the calendar months is introduced. Post-training quantization is performed to reduce the model's size and improve resource consumption at the edge. Experimental results show that the developed system achieves a recognition rate of 99.7% with an inference time of 500 ms on a Raspberry Pi 4 in a real-time environment. Lastly, memory profiling is performed to evaluate the model's performance on the hardware.
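The pipeline summarized above has three stages that map naturally onto code: per-frame 3D hand key-point extraction, an attention-based LSTM over the resulting sequences, and post-training quantization for the edge device. The two sketches below are illustrative only. The first assumes MediaPipe Hands as the landmark extractor (21 landmarks per frame, so 63 features once x, y, z are flattened); the abstract does not name the tool, so treat that choice as an assumption.

```python
# Illustrative sketch (assumed tooling, not confirmed by the abstract):
# MediaPipe Hands recovers 21 3D landmarks per frame; flattening
# (x, y, z) yields a 63-dimensional feature vector per frame.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=1)

def frame_to_keypoints(bgr_frame):
    """Return 63 floats for the first detected hand, or None."""
    result = hands.process(cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB))
    if not result.multi_hand_landmarks:
        return None  # no hand detected in this frame
    landmarks = result.multi_hand_landmarks[0].landmark
    return [c for p in landmarks for c in (p.x, p.y, p.z)]
```

The second sketch shows a minimal attention-based LSTM plus the standard TensorFlow Lite post-training quantization step. The 30-frame window, 64-unit LSTM, additive attention form, and 12 output classes (one per calendar month) are assumptions for illustration, not the authors' exact architecture.

```python
# Minimal attention-based LSTM and post-training quantization sketch.
# Shapes are assumptions: 30-frame clips, 63 features per frame
# (21 landmarks x 3 coordinates), 12 month classes.
import tensorflow as tf
from tensorflow.keras import layers

TIMESTEPS, FEATURES, NUM_CLASSES = 30, 63, 12

def build_attention_lstm():
    frames = layers.Input(shape=(TIMESTEPS, FEATURES))
    states = layers.LSTM(64, return_sequences=True)(frames)  # hidden state per frame
    scores = layers.Dense(1, activation="tanh")(states)      # score each timestep
    weights = layers.Softmax(axis=1)(scores)                 # attention over time
    context = layers.Lambda(
        lambda z: tf.reduce_sum(z[0] * z[1], axis=1)         # weighted sum of states
    )([states, weights])
    probs = layers.Dense(NUM_CLASSES, activation="softmax")(context)
    return tf.keras.Model(frames, probs)

model = build_attention_lstm()

# Standard TFLite post-training (dynamic-range) quantization:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
with open("sign_lstm.tflite", "wb") as f:
    f.write(converter.convert())
```

Note that `Optimize.DEFAULT` applies dynamic-range quantization, which shrinks weights to 8 bits without a calibration dataset; full-integer quantization would additionally require a representative dataset.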

Data availability

The data supporting the findings of this study are not publicly accessible due to sensitivity concerns; they are, however, available from the corresponding author upon reasonable request.

Acknowledgements

The project Sign Language to Regional Language Converter (SLRLC), project number SEED/TIDE/063/2016, is supported by the Department of Science and Technology (DST), Government of India.

Author information

Contributions

Methodology, data collection, analysis, and manuscript writing were performed by VS. Technical advice and investigation were provided by AS. The manuscript was reviewed by SS. All authors discussed the results and commented on the manuscript.

Corresponding author

Correspondence to Vaidehi Sharma.

Ethics declarations

Conflict of interest

The authors are required to disclose financial interests that are directly or indirectly related to the work submitted for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sharma, V., Sharma, A. & Saini, S. Real-time attention-based embedded LSTM for dynamic sign language recognition on edge devices. J Real-Time Image Proc 21, 53 (2024). https://doi.org/10.1007/s11554-024-01435-7
