Unsupervised binocular depth prediction network for laparoscopic surgery


Computer Assisted Surgery (IF 2.1), Pub Date: 2019-05-31, DOI: 10.1080/24699322.2018.1560082
Ke Xu (1, 2), Zhiyong Chen (1), Fucang Jia (2, 3)


Minimally invasive surgery (MIS) is characterized by less trauma, shorter recovery time, and a lower postoperative infection rate. Two-dimensional (2D) laparoscopic imaging lacks depth perception and provides no quantitative depth information, which limits precise and complex surgical operations. Three-dimensional (3D) laparoscopic imaging gives surgeons depth perception. This study aims to reconstruct the surgical scene in 3D from the disparity map generated by a depth estimation algorithm. An unsupervised autoencoder method built on a 101-layer residual convolutional network is proposed to compute accurate disparity. The loss function has three parts: left-right consistency loss, structural similarity loss, and reconstruction error loss; combining them improves reconstruction accuracy and robustness. The method was validated on the Hamlyn Center Laparoscopic/Endoscopic Video Dataset. The structural similarity index (SSIM) is 0.8349 ± 0.0523 and the peak signal-to-noise ratio (PSNR) is 14.4957 ± 1.9676. The depth prediction network has high accuracy and robustness. The average time to produce each disparity map is about 16 ms. The experimental results show that the proposed depth estimation method produces dense disparity maps and meets the real-time requirements of surgery. Future work will focus on network structure optimization, loss function design, and transfer learning to further improve robustness and accuracy.
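The three loss terms and the PSNR metric named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes rectified grayscale images scaled to [0, 1], a left pixel at column x matching the right pixel at column x - d, a 3 x 3 SSIM window, and illustrative weights (`w_ssim`, `w_lr`) that the abstract does not specify. Only the left-view half of the loss is shown; a symmetric right-view term would normally be added. All function names, and the focal length and baseline parameters of `depth_from_disparity`, are hypothetical.

```python
import numpy as np


def warp_horizontal(src, disp):
    """Sample src at column x - disp(x) on each row (linear interpolation)."""
    h, w = src.shape
    xs = np.clip(np.arange(w, dtype=np.float64)[None, :] - disp, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * src[rows, x0] + frac * src[rows, x1]


def box_mean(img, k=3):
    """Mean over every k x k window (valid region only)."""
    win = np.lib.stride_tricks.sliding_window_view(img, (k, k))
    return win.mean(axis=(-2, -1))


def ssim_map(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """Local SSIM with a box window; constants assume a peak value of 1."""
    mu_a, mu_b = box_mean(a), box_mean(b)
    var_a = box_mean(a * a) - mu_a ** 2
    var_b = box_mean(b * b) - mu_b ** 2
    cov = box_mean(a * b) - mu_a * mu_b
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return num / den


def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)


def total_loss(left, right, disp_l, disp_r, w_ssim=0.85, w_lr=1.0):
    """Left-view loss: SSIM term + reconstruction (L1) term + LR consistency."""
    left_hat = warp_horizontal(right, disp_l)  # left view rebuilt from right
    l1 = np.mean(np.abs(left - left_hat))
    ssim_term = np.mean(np.clip((1.0 - ssim_map(left, left_hat)) / 2.0, 0.0, 1.0))
    # The right-view disparity, sampled at the matching pixel, should agree.
    lr = np.mean(np.abs(disp_l - warp_horizontal(disp_r, disp_l)))
    return w_ssim * ssim_term + (1.0 - w_ssim) * l1 + w_lr * lr


def depth_from_disparity(disp, focal_px, baseline_mm):
    """Standard rectified-stereo relation: depth = focal * baseline / disparity."""
    return focal_px * baseline_mm / np.maximum(disp, 1e-6)
```

With a disparity map that explains the image pair exactly, the reconstructed left view matches the real one and all three terms vanish; any mismatch between the views or between the two disparity maps raises the loss. In a real network the same terms would be written in a differentiable framework so that gradients can flow back to the disparity prediction.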


Updated: 2019-11-01
