TransVFS: A spatio-temporal local–global transformer for vision-based force sensing during ultrasound-guided prostate biopsy
Medical Image Analysis (IF 10.9), Pub Date: 2024-03-02, DOI: 10.1016/j.media.2024.103130
Yibo Wang , Zhichao Ye , Mingwei Wen , Huageng Liang , Xuming Zhang

Robot-assisted prostate biopsy is a new technology for diagnosing prostate cancer, but its safety is limited by the robot's inability to accurately sense the tool–tissue interaction force during biopsy. Recently, vision-based force sensing (VFS) has offered a potential solution by inferring the interaction force from image sequences. However, existing mainstream VFS methods cannot achieve accurate force sensing because they rely on convolutional or recurrent neural networks to learn deformation from optical images, and some of these methods are inefficient, especially when recurrent convolutional operations are involved. This paper presents a Transformer-based VFS (TransVFS) method that leverages ultrasound volume sequences acquired during prostate biopsy. TransVFS uses a spatio-temporal local–global Transformer to capture local image details and global dependencies simultaneously, learning prostate deformations for force estimation. Distinctively, our method exploits both spatial and temporal attention mechanisms for image feature learning, thereby mitigating the effects of low ultrasound image resolution and unclear prostate boundaries on force estimation accuracy. Meanwhile, two efficient local–global attention modules are introduced to reduce the 4D spatio-temporal computation burden through a factorized spatio-temporal processing strategy, enabling fast force estimation. Experiments on a prostate phantom and on beagle dogs show that our method significantly outperforms existing VFS methods and other spatio-temporal Transformer models. On the transabdominal ultrasound dataset of dogs, TransVFS surpasses the most competitive compared method, ResNet3dGRU, with a mean absolute force-estimation error of 70.4 ± 60.0 millinewtons (mN) versus 123.7 ± 95.6 mN.
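The abstract's key efficiency idea is factorized spatio-temporal attention: rather than one joint attention over all T×N volume tokens, attention is applied spatially within each frame and then temporally across frames, reducing the quadratic cost. The following is a minimal numpy sketch of that general factorization idea; the shapes, function names, and single-head form are illustrative assumptions, not the paper's actual TransVFS implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention over the last two axes
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def factorized_st_attention(x):
    """x: (T, N, D) -- T frames, N spatial tokens per volume, D channels.

    Joint attention over T*N tokens costs O((T*N)^2 * D); factorizing into
    spatial-then-temporal attention costs O(T*N^2*D + N*T^2*D) instead.
    """
    # spatial attention: attend across N tokens, independently per frame
    xs = attention(x, x, x)            # (T, N, D)
    # temporal attention: attend across T frames, independently per token
    xt = xs.transpose(1, 0, 2)         # (N, T, D)
    xt = attention(xt, xt, xt)
    return xt.transpose(1, 0, 2)       # back to (T, N, D)
```

For example, with T = 4 volumes of N = 512 tokens each, the joint scheme forms a 2048×2048 attention matrix, while the factorized scheme forms four 512×512 matrices followed by 512 matrices of size 4×4.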

Updated: 2024-03-02