Stereo matching of binocular laparoscopic images with improved densely connected neural architecture search

  • Original Article
International Journal of Computer Assisted Radiology and Surgery

Abstract

Purpose

Stereo matching is a crucial technology in binocular laparoscope-based surgical navigation systems. In recent years, neural networks have been widely applied to stereo matching and have demonstrated outstanding performance. However, this approach relies heavily on manual feature engineering, meaning that professionals must be involved in feature extraction and matching. This process is time-consuming and demands specific expertise.

Methods

This paper introduces DCStereo, a novel stereo matching framework that realizes fully automatic neural architecture design for stereo matching of binocular laparoscopic images. The proposed framework uses a densely connected search space, which enables more flexible and diverse architecture composition. Furthermore, the proposed algorithm uses channel and path sampling strategies to reduce memory consumption during the search.
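The abstract does not spell out how channel and path sampling work, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each layer in a densely connected search space has several candidate incoming connections with learnable architecture weights, that a few connections are sampled in proportion to the softmax of those weights, and that only a fraction of the feature channels is routed through the candidate operations. The names `sample_paths`, `sample_channels`, `num_paths`, and `ratio`, and all numeric values, are hypothetical.

```python
import math
import random


def softmax(weights):
    """Numerically stable softmax over a list of architecture weights."""
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def sample_paths(arch_weights, num_paths, rng):
    """Sample distinct incoming connections without replacement.

    Each draw picks one remaining connection with probability proportional
    to its softmax weight (an assumed sampling rule, for illustration).
    """
    probs = softmax(arch_weights)
    candidates = list(range(len(probs)))
    chosen = []
    for _ in range(min(num_paths, len(candidates))):
        picked = rng.choices(
            candidates, weights=[probs[i] for i in candidates], k=1
        )[0]
        chosen.append(picked)
        candidates.remove(picked)
    return sorted(chosen)


def sample_channels(num_channels, ratio, rng):
    """Pick a random 1/ratio subset of channel indices.

    Only these channels would pass through the candidate operations;
    the remaining channels would bypass them unchanged.
    """
    keep = max(1, num_channels // ratio)
    return sorted(rng.sample(range(num_channels), keep))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
active_paths = sample_paths([0.2, 1.5, -0.3, 0.9], num_paths=2, rng=rng)
active_channels = sample_channels(num_channels=32, ratio=4, rng=rng)
print(active_paths, active_channels)
```

Under these assumptions, only the sampled connections and channels need to be held in memory during a search step, which is why such sampling can lower memory consumption compared with evaluating every connection on every channel.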

Results

Empirically, DCStereo, searched on the SCARED training dataset, achieves a mean absolute error of 3.589 mm on the test dataset, outperforming hand-crafted stereo matching methods and other approaches. Furthermore, when tested directly on the SERV-CT dataset, DCStereo demonstrates better generalization ability than other methods.
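To make the reported metric concrete, the sketch below shows one common way a mean absolute error in millimetres can be computed for stereo depth. It assumes rectified images, the standard relation depth = focal length × baseline / disparity, and that pixels without valid ground truth or prediction are skipped. The evaluation protocol actually used in the paper is not described in the abstract, and the focal length, baseline, and disparity values here are hypothetical.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_mm):
    """Depth in mm for a rectified stereo pixel; None if disparity is not positive."""
    if disparity_px <= 0:
        return None
    return focal_px * baseline_mm / disparity_px


def mean_absolute_error(pred_depths, true_depths):
    """MAE over pixels where both predicted and ground-truth depth are valid."""
    errors = [
        abs(p - t)
        for p, t in zip(pred_depths, true_depths)
        if p is not None and t is not None
    ]
    if not errors:
        raise ValueError("no valid pixels to evaluate")
    return sum(errors) / len(errors)


# Hypothetical camera parameters and disparities, for illustration only.
focal_px, baseline_mm = 1000.0, 4.0
pred = [disparity_to_depth(d, focal_px, baseline_mm) for d in [40.0, 50.0, 0.0, 80.0]]
true = [100.0, 82.0, 60.0, None]  # None marks a pixel without ground truth

mae_mm = mean_absolute_error(pred, true)
print(mae_mm)
```

In this toy case only the first two pixels are valid in both lists, so the error is averaged over those two pixels alone.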

Conclusion

Our proposed approach leverages the neural architecture search technique and a densely connected search space for automatic neural architecture design in stereo matching of binocular laparoscopic images. Our method delivers advanced performance on the SCARED dataset and promising results on the SERV-CT dataset. These findings demonstrate the potential of our approach for improving clinical surgical navigation systems.

Acknowledgements

This research was supported by the National Major Scientific Research Instrument Development Project (Grant No. 81827804) and the Key Research and Development Plan of Zhejiang Province (Grant No. 2022C03086).

Author information

Corresponding author

Correspondence to Xuesong Ye.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent of publication

The data used in this paper come from public, anonymized datasets and are used for research purposes only, not for commercial purposes.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jin, Z., Hu, C., Fu, Z. et al. Stereo matching of binocular laparoscopic images with improved densely connected neural architecture search. Int J CARS 19, 677–686 (2024). https://doi.org/10.1007/s11548-023-03035-5

  • DOI: https://doi.org/10.1007/s11548-023-03035-5

Keywords

Navigation