Language-based translation and prediction of surgical navigation steps for endoscopic wayfinding assistance in minimally invasive surgery
International Journal of Computer Assisted Radiology and Surgery (IF 2.3), Pub Date: 2020-10-10, DOI: 10.1007/s11548-020-02264-2
Richard Bieck, Katharina Heuermann, Markus Pirlich, Juliane Neumann, Thomas Neumuth

Purpose

In aviation and automotive navigation technology, assistance functions are associated with predictive planning and wayfinding tasks. In endoscopic minimally invasive surgery, however, assistance has so far relied primarily on image-based localization and classification. We show that navigation workflows can be described in natural language and used to predict navigation steps.

Methods

A natural description vocabulary for observable anatomical landmarks in endoscopic images was defined to create 3850 navigation workflow sentences from 22 annotated functional endoscopic sinus surgery (FESS) recordings. The resulting FESS navigation workflows showed an imbalanced data distribution, with landmarks in the ethmoidal sinus over-represented. A transformer model was trained to predict navigation sentences in sequence-to-sequence tasks. Training was performed with the Adam optimizer and label smoothing in a leave-one-out cross-validation study. Sentences were generated using an adapted beam search algorithm with exponential decay beam rescoring. The transformer model was compared to a standard encoder-decoder model, as well as to HMM and LSTM baseline models.
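The evaluation setup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that one recording is held out per fold (the abstract says only "leave-one-out") and that label smoothing uses the common uniform scheme. The function names and the smoothing value `epsilon=0.1` are hypothetical choices made for illustration.

```python
import math
from typing import Dict, Iterator, List, Tuple


def leave_one_out_splits(
    recordings: Dict[str, List[str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_id, train_sentences, test_sentences), one fold per recording.

    Assumption: the held-out unit is a whole recording, not a single sentence.
    """
    for held_out in recordings:
        train = [
            sentence
            for rec_id, sentences in recordings.items()
            if rec_id != held_out
            for sentence in sentences
        ]
        yield held_out, train, list(recordings[held_out])


def smoothed_targets(true_index: int, num_classes: int, epsilon: float = 0.1) -> List[float]:
    """Uniform label smoothing: 1 - epsilon on the true class, epsilon shared by the rest."""
    off_value = epsilon / (num_classes - 1)
    return [1.0 - epsilon if i == true_index else off_value for i in range(num_classes)]


def cross_entropy(target_dist: List[float], predicted_probs: List[float]) -> float:
    """Cross-entropy between a (smoothed) target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target_dist, predicted_probs) if t > 0.0)
```

With 22 recordings, `leave_one_out_splits` would produce 22 folds under this assumption. The smoothed target keeps most of the probability mass on the annotated token while penalizing overconfident predictions, which is the usual motivation for label smoothing in sequence-to-sequence training.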

Results

The transformer model reached the highest prediction accuracy for navigation steps at 0.53, followed by the LSTM at 0.35 and the standard encoder-decoder network at 0.32. With a sentence generation accuracy of 0.83, sentence-level prediction of navigation steps benefits from the additional semantic information. While predictions based on standard class representations suffer from the imbalanced data distribution, the attention mechanism also handled underrepresented classes reasonably well.
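To see why an imbalanced label distribution can distort a single accuracy figure, the following toy Python sketch compares overall accuracy with per-class recall. The labels and predictions are invented for illustration only and are not data from the study. The class names `ethmoid_landmark` and `rare_landmark` are hypothetical placeholders.

```python
from collections import Counter
from typing import Dict, List


def overall_accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Recall for each reference class: correct predictions / occurrences of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy, made-up example: a predictor that always outputs the majority class.
y_true = ["ethmoid_landmark"] * 8 + ["rare_landmark"] * 2
y_pred = ["ethmoid_landmark"] * 10

print(overall_accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
```

In this toy case the majority-class predictor scores well overall yet never recovers the rare class, which is the kind of effect that per-class reporting makes visible.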

Conclusion

We implemented a natural language-based prediction method for sentence-level navigation steps in endoscopic surgery. The sentence-level prediction method showed that word relations to navigation tasks can potentially be learned and used to predict future steps. Further studies are needed to investigate the functionality of path prediction. The prediction approach is a first step in the field of visuo-linguistic navigation assistance for endoscopic minimally invasive surgery.




Updated: 2020-10-11