Abstract
Purpose
Stereo matching is a crucial technology in surgical navigation systems based on binocular laparoscopy. In recent years, neural networks have been widely applied to stereo matching and have demonstrated outstanding performance. However, this approach relies heavily on manual feature engineering, meaning that professionals must be involved in feature extraction and matching. This process is both time-consuming and demands specific expertise.
Methods
This paper introduces a novel stereo matching framework, DCStereo, that realizes fully automatic neural architecture design for stereo matching of binocular laparoscopic images. The proposed framework uses a densely connected search space, which enables more flexible and diverse architecture composition. Furthermore, the proposed algorithm uses channel and path sampling strategies to reduce memory consumption during the search.
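The abstract does not spell out how channel and path sampling are carried out, so the following is only a minimal sketch of the general idea, in the style of partial-channel methods such as PC-DARTS. All function names, the candidate operations, and the sampling ratio `k` are assumptions made for illustration and are not taken from DCStereo itself.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of architecture logits."""
    shifted = np.exp(logits - np.max(logits))
    return shifted / shifted.sum()

def partial_channel_mixed_op(x, ops, alpha, k=4, rng=None):
    """Channel sampling (hypothetical helper).

    x has shape (C, H, W). Only a random 1/k of the channels go through the
    softmax-weighted mix of candidate ops; the rest bypass them unchanged,
    so the candidate ops process roughly 1/k of the feature map.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_channels = x.shape[0]
    perm = rng.permutation(num_channels)
    sampled = perm[: num_channels // k]
    weights = softmax(np.asarray(alpha, dtype=float))
    mixed = sum(w * op(x[sampled]) for w, op in zip(weights, ops))
    out = x.astype(float)  # astype returns a copy, so x is left untouched
    out[sampled] = mixed
    return out, sampled

def sample_paths(beta, n_paths, rng=None):
    """Path sampling (hypothetical helper).

    Keeps n_paths of the incoming connections, drawn without replacement
    with probability proportional to softmax(beta).
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = softmax(np.asarray(beta, dtype=float))
    return rng.choice(len(probs), size=n_paths, replace=False, p=probs)
```

In both cases the memory saving comes from evaluating only a subset of the search space in each step: fewer channels per candidate operation, and fewer active connections per node in a densely connected space.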
Results
Empirically, DCStereo, with its architecture searched on the SCARED training dataset, achieves a mean absolute error of 3.589 mm on the test dataset, outperforming hand-crafted stereo matching methods and other approaches. Furthermore, when tested directly on the SERV-CT dataset, DCStereo demonstrates better generalization ability than other methods.
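As a rough illustration of how an error in millimetres can be obtained from a disparity map, the sketch below converts disparity to depth with the standard rectified-stereo relation depth = focal length × baseline / disparity and then averages the absolute depth error over valid pixels. The function names and the camera parameters are hypothetical, and the exact evaluation protocol used in the paper may differ.

```python
import numpy as np

def disparity_to_depth(disp, focal_px, baseline_mm):
    """Convert a disparity map (pixels) to depth (mm); non-positive disparities become NaN."""
    depth = np.full(disp.shape, np.nan)
    valid = disp > 0
    depth[valid] = focal_px * baseline_mm / disp[valid]
    return depth

def mean_absolute_error(pred_depth, gt_depth):
    """Mean absolute depth error over pixels where both maps are finite."""
    valid = np.isfinite(pred_depth) & np.isfinite(gt_depth)
    return float(np.mean(np.abs(pred_depth[valid] - gt_depth[valid])))

# Toy example with assumed camera parameters (focal length 1000 px, baseline 4 mm).
disp = np.array([[10.0, 20.0], [0.0, 40.0]])
gt = np.array([[402.0, 199.0], [50.0, np.nan]])
pred = disparity_to_depth(disp, focal_px=1000.0, baseline_mm=4.0)
mae_mm = mean_absolute_error(pred, gt)  # errors of 2 mm and 1 mm on the two valid pixels
```

Pixels with no predicted disparity or no ground-truth depth are excluded from the average, which is a common convention when the ground truth is sparse.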
Conclusion
Our proposed approach leverages the neural architecture search technique and a densely connected search space for automatic neural architecture design in stereo matching of binocular laparoscopic images. Our method delivers advanced performance on the SCARED dataset and promising results on the SERV-CT dataset. These findings demonstrate the potential of our approach for improving clinical surgical navigation systems.
Acknowledgements
This research was supported by the National Major Scientific Research Instrument Development Project (Grant No. 81827804) and the Key Research and Development Plan of Zhejiang Province (Grant No. 2022C03086).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent of publication
The data used in this paper come from a public, anonymized dataset and are used for research purposes only, not for commercial purposes.
About this article
Cite this article
Jin, Z., Hu, C., Fu, Z. et al. Stereo matching of binocular laparoscopic images with improved densely connected neural architecture search. Int J CARS 19, 677–686 (2024). https://doi.org/10.1007/s11548-023-03035-5
DOI: https://doi.org/10.1007/s11548-023-03035-5