TCNet: tensor and covariance attention network for semantic segmentation

Xu, Haixia; Liu, Yanbang; Wang, Wei; Zhou, Wei; Ding, Fanxun; Han, Feng; Peng, Wei

doi:10.1007/s00500-024-09638-7

Neural Networks
Published: 06 February 2024

Soft Computing (2024)

Haixia Xu ORCID: orcid.org/0000-0001-8587-7044^1,2,
Yanbang Liu^1,2,
Wei Wang^1,2,
Wei Zhou^1,2,
Fanxun Ding^1,2,
Feng Han^1,2 &
Wei Peng^1,2

Abstract

The non-local network pioneered the capture of long-range dependencies by aggregating query-specific global context into each query location; however, it applies identical weights to every channel of the feature maps and thus ignores the differences between channels. We design a novel tensor attention module (TAM), which integrates context information along the spatial and channel dimensions by introducing a learnable bias parameter tensor, so that the feature at each location of each channel can aggregate the features from all other locations. Motivated by SE-Net, we propose a novel second-order covariance attention module (SCAM) that enhances the feature correlation between different channel maps through second-order statistics and a local cross-channel interaction strategy. We take the encoder–decoder segmentation network DeepLabv3+ as the baseline and add the attention modules TAM and SCAM to its encoder, yielding TCNet for semantic segmentation. Experimental results on the PASCAL VOC 2012 and Cityscapes datasets show that the proposed network outperforms other state-of-the-art segmentation networks.
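
The abstract does not give the formulas for TAM or SCAM, so the following is only a rough NumPy sketch of the two general ideas it names, not the authors' modules. The first function shows plain non-local aggregation, where all channels at a query location share one set of attention weights. The second shows one possible channel gate built from second-order (covariance) statistics followed by a local 1D interaction across neighbouring channels. The function names, the row-mean pooling of the covariance matrix, and the fixed averaging kernel (a stand-in for learned weights) are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def non_local_aggregate(feat):
    """Plain non-local aggregation on a (C, H, W) feature map.

    The (N, N) attention matrix is shared by all channels, which is the
    limitation the abstract points out.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)              # (C, N) with N = H * W
    attn = softmax(x.T @ x, axis=-1)        # (N, N), each row sums to 1
    context = x @ attn.T                    # (C, N): weighted sum over all locations
    return (x + context).reshape(c, h, w)   # residual connection


def covariance_channel_attention(feat, kernel_size=3):
    """Hypothetical second-order channel gate (not the paper's SCAM).

    1. Compute the C x C channel covariance matrix.
    2. Pool each row to one descriptor per channel (assumed: row mean).
    3. Mix neighbouring channel descriptors with a 1D kernel
       (assumed: fixed averaging kernel in place of learned weights).
    4. Squash with a sigmoid and rescale the channels.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)
    centered = x - x.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / (h * w)               # (C, C)
    descriptor = cov.mean(axis=1)                       # (C,)
    kernel = np.full(kernel_size, 1.0 / kernel_size)
    mixed = np.convolve(descriptor, kernel, mode="same")
    gate = 1.0 / (1.0 + np.exp(-mixed))                 # values in (0, 1)
    return feat * gate[:, None, None], gate
```

In a trained network the averaging kernel would be replaced by learned 1D convolution weights, and the projections that a full non-local block applies to queries, keys, and values are omitted here for brevity.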

Data availability

The data underlying this article are available in VOC2012 Benchmark http://host.robots.ox.ac.uk/pascal/VOC, and in Cityscapes Benchmark https://www.cityscapes-dataset.com.

References

Cao Y, Xu J, Lin S et al (2019) Gcnet: non-local networks meet squeeze-excitation networks and beyond. In: Proceedings of the IEEE/CVF international conference on computer vision workshops
Chen LC, Papandreou G, Kokkinos I et al (2014) Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062
Chen LC, Papandreou G, Kokkinos I et al (2017a) Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848
Chen LC, Papandreou G, Schroff F et al (2017b) Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587
Chen LC, Zhu Y, Papandreou G et al (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 801–818
Cheng J, Dong L, Lapata M (2016) Long short-term memory-networks for machine reading. arXiv preprint arXiv:1601.06733
Cordts M, Omran M, Ramos S et al (2016) The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213–3223
Deng J, Dong W, Socher R et al (2009) Imagenet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 248–255
Ding H, Jiang X, Shuai B et al (2018) Context contrasted feature and gated multi-scale aggregation for scene segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2393–2402
Everingham M, Van Gool L, Williams CKI et al (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
Fan M, Lai S, Huang J et al (2021) Rethinking bisenet for real-time semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9716–9725
Fu J, Liu J, Tian H et al (2019) Dual attention network for scene segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3146–3154
Gao Y, Beijbom O, Zhang N et al (2016) Compact bilinear pooling. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 317–326
Ghiasi G, Fowlkes CC (2016) Laplacian pyramid reconstruction and refinement for semantic segmentation. In: European conference on computer vision. Springer, Cham, pp 519–534
He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13713–13722
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141
Huang G, Liu Z, Van Der Maaten L et al (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708
Huang Z, Wang X, Huang L et al (2019) Ccnet: criss-cross attention for semantic segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 603–612
Krešo I, Čaučević D, Krapac J et al (2016) Convolutional scale invariance for semantic segmentation. In: German conference on pattern recognition. Springer, Cham, pp 64–75
Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Li P, Xie J, Wang Q et al (2017) Is second-order information helpful for large-scale visual recognition? In: Proceedings of the IEEE international conference on computer vision, pp 2070–2078
Li P, Xie J, Wang Q et al (2018) Towards faster training of global covariance pooling networks by iterative matrix square root normalization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 947–955
Lin M, Chen Q, Yan S (2013) Network in network. arXiv preprint arXiv:1312.4400
Lin G, Shen C, Van Den Hengel A et al (2016) Efficient piecewise training of deep structured models for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3194–3203
Lin G, Milan A, Shen C et al (2017) Refinenet: multi-path refinement networks for high-resolution semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1925–1934
Liu Z, Li X, Luo P et al (2015) Semantic image segmentation via deep parsing network. In: Proceedings of the IEEE international conference on computer vision, pp 1377–1385
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440
Noh H, Hong S, Han B (2015) Learning deconvolution network for semantic segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 1520–1528
Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, Cham, pp 234–241
Siam M, Elkerdawy S, Jagersand M et al (2017) Deep semantic segmentation for automated driving: taxonomy, roadmap and challenges. In: 2017 IEEE 20th international conference on intelligent transportation systems (ITSC). IEEE, pp 1–8
Siam M, Gamal M, Abdel-Razek M et al (2018) A comparative study of real-time semantic segmentation for autonomous driving. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 587–597
Szegedy C, Liu W, Jia Y et al (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. Adv Neural Inf Process Syst 30:1–11
Vemulapalli R, Tuzel O, Liu MY et al (2016) Gaussian conditional random field network for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3224–3233
Wang X, Girshick R, Gupta A et al (2018) Non-local neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7794–7803
Wang Q, Wu B, Zhu P et al (2019) ECA-Net: efficient channel attention for deep convolutional neural networks. arXiv:1910.03151
Yu F, Koltun V (2015) Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122
Yuan Y, Huang L, Guo J et al (2018) Ocnet: object context network for scene parsing. arXiv preprint arXiv:1809.00916
Zhang H, Dana K, Shi J et al (2018) Context encoding for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7151–7160
Zhao H, Shi J, Qi X et al (2017a) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881–2890
Zhao H, Shi J, Qi X et al (2017b) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881–2890
Zhao H, Zhang Y, Liu S et al (2018) Psanet: point-wise spatial attention network for scene parsing. In: Proceedings of the European conference on computer vision (ECCV), pp 267–283
Zheng S, Jayasumana S, Romera-Paredes B et al (2015) Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE international conference on computer vision, pp 1529–1537
Zhu Z, Xu M, Bai S et al (2019) Asymmetric non-local neural networks for semantic segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 593–602

Acknowledgements

This work was supported in part by the Key Program Scientific Research Fund of Hunan Provincial Education Department (No. 22A0127, No. 23A0155), by the Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University (No. 2023ICIP07, No. 2023ICIP03, No. 2022ICIP03), and in part by the Natural Science Foundation of China (No. 62003288).

Funding

The authors have not disclosed any funding.

Author information

Authors and Affiliations

Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, China
Haixia Xu, Yanbang Liu, Wei Wang, Wei Zhou, Fanxun Ding, Feng Han & Wei Peng
School of Automation and Electronic Information, Xiangtan University, Xiangtan, China
Haixia Xu, Yanbang Liu, Wei Wang, Wei Zhou, Fanxun Ding, Feng Han & Wei Peng

Corresponding author

Correspondence to Haixia Xu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xu, H., Liu, Y., Wang, W. et al. TCNet: tensor and covariance attention network for semantic segmentation. Soft Comput (2024). https://doi.org/10.1007/s00500-024-09638-7

Accepted: 02 January 2024
Published: 06 February 2024
DOI: https://doi.org/10.1007/s00500-024-09638-7
