MSDESIS: Multitask Stereo Disparity Estimation and Surgical Instrument Segmentation
IEEE Transactions on Medical Imaging (IF 10.6), Pub Date: 2022-06-08, DOI: 10.1109/tmi.2022.3181229
Dimitrios Psychogyios, Evangelos Mazomenos, Francisco Vasconcelos, Danail Stoyanov

Reconstructing the 3D geometry of the surgical site and detecting instruments within it are important tasks for surgical navigation systems and robotic surgery automation. Traditional approaches treat each problem in isolation and do not account for the intrinsic relationship between segmentation and stereo matching. In this paper, we present a learning-based framework that jointly estimates disparity and binary tool segmentation masks. The core component of our architecture is a shared feature encoder which allows strong interaction between the aforementioned tasks. Experimentally, we train two variants of our network with different capacities and explore different training schemes including both multi-task and single-task learning. Our results show that supervising the segmentation task improves our network's disparity estimation accuracy. We demonstrate a domain adaptation scheme where we supervise the segmentation task with monocular data and achieve domain adaptation of the adjacent disparity task, reducing disparity End-Point-Error and depth mean absolute error by 77.73% and 61.73% respectively compared to the pre-trained baseline model. Our best overall multi-task model, trained with both disparity and segmentation data in subsequent phases, achieves 89.15% mean Intersection-over-Union on the RIS test set and 3.18 millimetre depth mean absolute error on the SCARED test set. Our proposed multi-task architecture runs in real time, able to process $1280\times 1024$ stereo input and simultaneously estimate disparity maps and segmentation masks at 22 frames per second. The model code and pre-trained models are made available at: https://github.com/dimitrisPs/msdesis
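The abstract reports error both in disparity (End-Point-Error, pixels) and in depth (mean absolute error, millimetres). The two are linked by standard rectified-stereo geometry: depth Z = f·B/d, where f is the focal length in pixels, B the stereo baseline, and d the disparity. The sketch below illustrates this conversion and both metrics; the focal length and baseline values are illustrative placeholders, not the SCARED calibration.

```python
# Illustrative sketch of disparity-to-depth conversion and the two
# error metrics named in the abstract. Camera parameters here are
# placeholders, not the actual SCARED endoscope calibration.

def disparity_to_depth(disparity, focal_px, baseline_mm):
    """Depth Z = f * B / d for a rectified stereo pair (requires d > 0)."""
    return [focal_px * baseline_mm / d for d in disparity]

def end_point_error(pred_disp, gt_disp):
    """Disparity End-Point-Error: mean absolute disparity error in pixels."""
    return sum(abs(p - g) for p, g in zip(pred_disp, gt_disp)) / len(gt_disp)

def depth_mae(pred_depth, gt_depth):
    """Depth mean absolute error, in the same units as the input depths."""
    return sum(abs(p - g) for p, g in zip(pred_depth, gt_depth)) / len(gt_depth)

if __name__ == "__main__":
    focal_px, baseline_mm = 1000.0, 5.0   # placeholder intrinsics
    gt_disp = [40.0, 50.0, 60.0]          # ground-truth disparities (px)
    pred_disp = [41.0, 49.0, 61.0]        # predicted disparities (px)
    epe = end_point_error(pred_disp, gt_disp)
    mae = depth_mae(disparity_to_depth(pred_disp, focal_px, baseline_mm),
                    disparity_to_depth(gt_disp, focal_px, baseline_mm))
    print(f"EPE = {epe:.2f} px, depth MAE = {mae:.2f} mm")
```

Note that because depth is inversely proportional to disparity, the same disparity error translates into a larger depth error for distant (small-disparity) points, which is why the paper reports both quantities.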

Updated: 2022-06-08