Deep Upscale U-Net for automatic tongue segmentation

  • Original Article
Medical & Biological Engineering & Computing

Abstract

Analysis of tongue movement is a major part of the treatment and diagnosis of oral health conditions such as oral cancer and oropharyngeal cancer. Automatic measurement of such movement must begin with tongue segmentation. This paper addresses tongue segmentation with U-Net, an encoder-decoder CNN architecture. A standard U-Net, however, can suffer from feature loss in its deep layers. To address this, the paper proposes a Deep Upscale U-Net (DU-UNET), which extends the original U-Net structure: an additionally up-sampled feature map from the contracting path is concatenated to an upper layer of the expansive path. The segmentation model is trained on two publicly available datasets and then transferred to a self-collected dataset of tongue images covering five tongue postures, recorded at a far distance from the camera under real-world conditions. The proposed DU-UNET outperforms the existing methods covered in our literature review, with an accuracy of 99.2%, a mean IoU of 97.8%, a Dice score of 96.8%, and a Jaccard score of 96.8%.
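For readers unfamiliar with the reported metrics, the following is a minimal sketch of how pixel accuracy, Jaccard score (IoU), mean IoU, and Dice score are commonly computed for a binary tongue mask. It is not the authors' evaluation code. The function names and the toy masks are hypothetical, and it assumes that "Jaccard" refers to the IoU of the tongue class while "mean IoU" averages the IoU of the tongue and background classes; the abstract does not state these definitions.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2-D binary mask: 1 = tongue, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) counted pixel by pixel over two same-sized masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Compute accuracy, Jaccard, mean IoU and Dice for one pair of masks."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    fg_union = tp + fp + fn  # union of predicted and true tongue pixels
    bg_union = tn + fp + fn  # union of predicted and true background pixels
    # Assumption: an empty union counts as a perfect match for that class.
    iou_fg = tp / fg_union if fg_union else 1.0
    iou_bg = tn / bg_union if bg_union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if fg_union else 1.0
    return {
        "accuracy": (tp + tn) / total,
        "jaccard": iou_fg,                  # assumed: IoU of the tongue class
        "mean_iou": (iou_fg + iou_bg) / 2,  # assumed: mean over both classes
        "dice": dice,
    }


# Hypothetical 4x4 toy masks, for illustration only.
EXAMPLE_TRUTH = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
EXAMPLE_PRED = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    for name, value in segmentation_metrics(EXAMPLE_PRED, EXAMPLE_TRUTH).items():
        print(f"{name}: {value:.3f}")
```

For a single pair of masks, the Dice score D and the Jaccard score J are related by D = 2J / (1 + J), so the two rank predictions identically and differ only in scale.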



Acknowledgements

This research project is supported by Mahidol University (Basic Research Fund: fiscal year 2022) (FRB650007/0185) (Contract No BRF1-056/2565).

Author information

Corresponding authors

Correspondence to Worapan Kusakunniran or Panrasee Ritthipravat.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kusakunniran, W., Imaromkul, T., Mongkolluksamee, S. et al. Deep Upscale U-Net for automatic tongue segmentation. Med Biol Eng Comput 62, 1751–1762 (2024). https://doi.org/10.1007/s11517-024-03051-w

  • DOI: https://doi.org/10.1007/s11517-024-03051-w
