Abstract
Assessing tongue movement is an important part of the diagnosis and treatment of oral health conditions such as oral cancer and oropharyngeal cancer. Automatic measurement of such movement must start with tongue segmentation. This paper addresses tongue segmentation with an encoder-decoder CNN structure, i.e., U-Net. However, U-Net can suffer from feature loss in its deep layers. To address this problem, this paper proposes a Deep Upscale U-Net (DU-UNET). Building on the original U-Net structure, an additional up-sampling of the feature map from the contracting path is concatenated to an upper layer of the expansive path. The segmentation model is constructed by training DU-UNET on two publicly available datasets and is then transferred to a self-collected dataset of tongue images covering five tongue postures, recorded at a far distance from the camera in a real-world scenario. The proposed DU-UNET outperforms the existing methods covered in our literature review, with accuracy of 99.2%, mean IoU of 97.8%, Dice score of 96.8%, and Jaccard score of 96.8%.
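The evaluation measures named in the abstract can be illustrated with a minimal sketch for a single pair of binary masks. The function names and the toy masks below are hypothetical. The definition of mean IoU as the average of the foreground and background IoU is an assumption, since the abstract does not state how each score is computed or averaged.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, target):
    """Pixel accuracy, Jaccard (foreground IoU), mean IoU, and Dice for one mask pair."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    total = tp + fp + fn + tn
    fg_union = tp + fp + fn  # pixels marked as tongue in either mask
    bg_union = tn + fp + fn  # pixels marked as background in either mask
    # Convention used here: an empty class counts as a perfect match.
    jaccard = tp / fg_union if fg_union else 1.0
    iou_background = tn / bg_union if bg_union else 1.0
    return {
        "accuracy": (tp + tn) / total,
        "jaccard": jaccard,
        # Assumption: mean IoU averages the foreground and background classes.
        "mean_iou": (jaccard + iou_background) / 2,
        "dice": 2 * tp / (2 * tp + fp + fn) if fg_union else 1.0,
    }


# Toy 4x4 example: the prediction has one false-positive pixel.
predicted_mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
reference_mask = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
metrics = segmentation_metrics(predicted_mask, reference_mask)
print(metrics)
```

For a single mask pair the Dice score D and the Jaccard score J are related by D = 2J / (1 + J), so the two rank predictions identically even though their values differ.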
Acknowledgements
This research project is supported by Mahidol University (Basic Research Fund: fiscal year 2022) (FRB650007/0185) (Contract No BRF1-056/2565).
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Kusakunniran, W., Imaromkul, T., Mongkolluksamee, S. et al. Deep Upscale U-Net for automatic tongue segmentation. Med Biol Eng Comput 62, 1751–1762 (2024). https://doi.org/10.1007/s11517-024-03051-w
DOI: https://doi.org/10.1007/s11517-024-03051-w