Deep Upscale U-Net for automatic tongue segmentation

Kusakunniran, Worapan; Imaromkul, Thanandon; Mongkolluksamee, Sophon; Thongkanchorn, Kittikhun; Ritthipravat, Panrasee; Tuakta, Pimchanok; Benjapornlert, Paitoon

doi:10.1007/s11517-024-03051-w

Original Article
Published: 19 February 2024

Volume 62, pages 1751–1762, (2024)

Medical & Biological Engineering & Computing

Worapan Kusakunniran¹ (ORCID: orcid.org/0000-0002-2896-611X),
Thanandon Imaromkul¹,
Sophon Mongkolluksamee²,
Kittikhun Thongkanchorn¹,
Panrasee Ritthipravat³,
Pimchanok Tuakta⁴ &
Paitoon Benjapornlert⁴

Abstract

In the diagnosis and treatment of oral health conditions such as oral cancer and oropharyngeal cancer, examining the tongue's movements plays a major part. Automatic measurement of such movement must begin with tongue segmentation. This paper addresses tongue segmentation with an encoder-decoder CNN structure, i.e., U-Net. However, the original U-Net can suffer from feature loss in its deep layers. To address this, the paper proposes a Deep Upscale U-Net (DU-UNET): building on the original U-Net structure, an additional up-sampling of the feature map from the contracting path is concatenated to an upper layer of the expansive path. The segmentation model is trained on two publicly available datasets and then transferred to a self-collected dataset of tongue images covering five tongue postures, recorded at a far distance from the camera in a real-world scenario. The proposed DU-UNET outperforms the existing methods covered in our literature review, with an accuracy of 99.2%, a mean IoU of 97.8%, a Dice score of 96.8%, and a Jaccard score of 96.8%.
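The four reported metrics can be illustrated with a minimal sketch on binary masks. It assumes the standard textbook definitions: Jaccard and Dice are computed for the tongue class, and mean IoU averages the IoU of the tongue and background classes. The abstract does not state the exact definitions or averaging scheme used in the paper, so the function names and the toy masks below are hypothetical and serve only as an illustration.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for flat binary masks (1 = tongue, 0 = background)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(num, den):
    # Assumption: an empty class that is also predicted empty counts as a perfect match.
    return num / den if den else 1.0


def segmentation_metrics(pred, truth):
    """Pixel accuracy, Jaccard, Dice, and two-class mean IoU (standard definitions)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    jaccard = _ratio(tp, tp + fp + fn)            # IoU of the tongue class
    background_iou = _ratio(tn, tn + fp + fn)     # IoU of the background class
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "jaccard": jaccard,
        "dice": _ratio(2 * tp, 2 * tp + fp + fn),
        "mean_iou": (jaccard + background_iou) / 2,
    }


# Toy 8-pixel masks, flattened; not data from the paper.
pred = [1, 1, 1, 0, 0, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = segmentation_metrics(pred, truth)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, pixel accuracy and mean IoU both reward correctly labelled background pixels, whereas Jaccard and Dice depend only on the tongue region and its errors.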

Acknowledgements

This research project is supported by Mahidol University (Basic Research Fund: fiscal year 2022) (FRB650007/0185) (Contract No BRF1-056/2565).

Author information

Thanandon Imaromkul, Sophon Mongkolluksamee, Kittikhun Thongkanchorn, Panrasee Ritthipravat, Pimchanok Tuakta, and Paitoon Benjapornlert contributed equally to this work.

Authors and Affiliations

Faculty of Information and Communication Technology, Mahidol University, 999 Phuttamonthon 4 Road, Salaya, 73170, Nakhon Pathom, Thailand
Worapan Kusakunniran, Thanandon Imaromkul & Kittikhun Thongkanchorn
Department of Computer Science, Faculty of Science, Srinakharinwirot University, 114 Sukhumvit 23, 10110, Bangkok, Thailand
Sophon Mongkolluksamee
Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, 999 Phuttamonthon 4 Road, Salaya, 73170, Nakhon Pathom, Thailand
Panrasee Ritthipravat
Department of Rehabilitation Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, 270 Rama 6 Road, 10400, Bangkok, Thailand
Pimchanok Tuakta & Paitoon Benjapornlert

Corresponding authors

Correspondence to Worapan Kusakunniran or Panrasee Ritthipravat.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kusakunniran, W., Imaromkul, T., Mongkolluksamee, S. et al. Deep Upscale U-Net for automatic tongue segmentation. Med Biol Eng Comput 62, 1751–1762 (2024). https://doi.org/10.1007/s11517-024-03051-w

Received: 11 July 2023
Accepted: 13 February 2024
Published: 19 February 2024
Issue Date: June 2024
DOI: https://doi.org/10.1007/s11517-024-03051-w
