Stereo matching of binocular laparoscopic images with improved densely connected neural architecture search

Jin, Ziyi; Hu, Chunyong; Fu, Zuoming; Zhang, Chongan; Wang, Peng; Zhang, Hong; Ye, Xuesong

doi:10.1007/s11548-023-03035-5

Original Article
Published: 30 January 2024

Volume 19, pages 677–686, (2024)

International Journal of Computer Assisted Radiology and Surgery

Ziyi Jin¹,
Chunyong Hu¹,
Zuoming Fu¹,
Chongan Zhang¹,
Peng Wang²,
Hong Zhang¹ &
…
Xuesong Ye ORCID: orcid.org/0000-0002-3439-3733¹

Abstract

Purpose

Stereo matching is a crucial technology in binocular laparoscope-based surgical navigation systems. In recent years, neural networks have been widely applied to stereo matching and have demonstrated outstanding performance. However, this approach relies heavily on manual feature engineering, meaning that professionals must be involved in feature extraction and matching. This process is time-consuming and demands specific expertise.

Methods

This paper introduces a novel stereo matching framework, DCStereo, that realizes fully automatic neural architecture design for the stereo matching of binocular laparoscopic images. The proposed framework uses a densely connected search space, which enables more flexible and diverse architecture composition. Furthermore, the proposed algorithm leverages channel and path sampling strategies to reduce memory consumption during the search.
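
The abstract does not spell out how channel and path sampling work, so the following is only a toy sketch of the general idea, in the spirit of partial-channel connections and sampled connections in a densely connected search space. It is not the authors' implementation. All names (`sample_channels`, `sample_paths`, `mixed_op`), the sampling ratio `k`, and the scalar "features" are hypothetical and chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_channels(num_channels, k, rng):
    """Pick num_channels // k channel indices to route through candidate ops."""
    n_active = max(1, num_channels // k)
    return sorted(rng.sample(range(num_channels), n_active))


def sample_paths(path_logits, n_paths, rng):
    """Sample n_paths distinct incoming connections, weighted by softmax(logits)."""
    probs = softmax(path_logits)
    candidates = list(range(len(path_logits)))
    chosen = []
    for _ in range(min(n_paths, len(candidates))):
        weights = [probs[i] for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return sorted(chosen)


def mixed_op(features, ops, op_logits, active_channels):
    """Apply the softmax-weighted op mixture to active channels; others bypass."""
    weights = softmax(op_logits)
    out = list(features)
    for c in active_channels:
        out[c] = sum(w * op(features[c]) for w, op in zip(weights, ops))
    return out


# Toy usage: 8 scalar "channels", 3 candidate ops, 4 candidate input paths.
rng = random.Random(0)
ops = [lambda x: x, lambda x: 2.0 * x, lambda x: 3.0 * x]
features = [1.0] * 8
active = sample_channels(len(features), k=4, rng=rng)   # 1/4 of the channels
paths = sample_paths([0.5, 0.1, -0.3, 0.9], n_paths=2, rng=rng)
out = mixed_op(features, ops, [0.0, 0.0, 0.0], active)
```

In this sketch only the sampled channels and sampled paths pass through the candidate operations in a given search step, which is the intuition behind the memory saving; the remaining channels are copied through unchanged.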

Results

Empirically, DCStereo, searched on the SCARED training dataset, achieves a mean absolute error of 3.589 mm on the test dataset, outperforming hand-crafted stereo matching methods and other approaches. Furthermore, when tested directly on the SERV-CT dataset, DCStereo demonstrates better generalization ability than other methods.
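
For readers unfamiliar with the metric, a mean absolute error in millimetres can be computed from a predicted disparity map roughly as sketched below. This is a minimal illustration assuming a rectified stereo pair, where depth equals focal length times baseline divided by disparity. The focal length, baseline, and pixel values are hypothetical, and the exact evaluation protocol used in the paper is not described in the abstract.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_mm):
    """Depth in mm for a rectified pair; None where disparity is not positive."""
    return [focal_px * baseline_mm / d if d > 0 else None for d in disparity_px]


def mean_absolute_error(pred_mm, gt_mm):
    """Mean |pred - gt| over pixels where both values are valid (not None)."""
    errors = [
        abs(p - g)
        for p, g in zip(pred_mm, gt_mm)
        if p is not None and g is not None
    ]
    if not errors:
        raise ValueError("no valid pixels to evaluate")
    return sum(errors) / len(errors)


# Hypothetical camera parameters and a flattened four-pixel "image".
focal_px, baseline_mm = 1000.0, 4.0
pred_disp = [40.0, 50.0, 0.0, 80.0]      # predicted disparities in pixels
gt_depth = [102.0, 78.0, 60.0, None]     # ground-truth depths in mm, None = missing

pred_depth = disparity_to_depth(pred_disp, focal_px, baseline_mm)
mae = mean_absolute_error(pred_depth, gt_depth)
```

Pixels with an invalid prediction or missing ground truth are skipped here; whether a given benchmark excludes or penalizes such pixels is a protocol choice that this sketch does not settle.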

Conclusion

Our proposed approach leverages the neural architecture search technique and a densely connected search space for automatic neural architecture design in stereo matching of binocular laparoscopic images. Our method delivers advanced performance on the SCARED dataset and promising results on the SERV-CT dataset. These findings demonstrate the potential of our approach for improving clinical surgical navigation systems.

Acknowledgements

This research was supported by the National Major Scientific Research Instrument Development Project (Grant No. 81827804) and the Key Research and Development Plan of Zhejiang Province (Grant No. 2022C03086).

Author information

Authors and Affiliations

Biosensor National Special Laboratory, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, 310027, China
Ziyi Jin, Chunyong Hu, Zuoming Fu, Chongan Zhang, Hong Zhang & Xuesong Ye
Hangzhou XianAo Technology Inc, Hangzhou, 311121, China
Peng Wang

Corresponding author

Correspondence to Xuesong Ye.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent of publication

The data used in this paper come from a public, anonymized dataset and are used for research purposes only, not for commercial purposes.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jin, Z., Hu, C., Fu, Z. et al. Stereo matching of binocular laparoscopic images with improved densely connected neural architecture search. Int J CARS 19, 677–686 (2024). https://doi.org/10.1007/s11548-023-03035-5

Received: 24 February 2023
Accepted: 02 November 2023
Published: 30 January 2024
Issue Date: April 2024
DOI: https://doi.org/10.1007/s11548-023-03035-5
