Hard frame detection for the automated clipping of surgical nasal endoscopic video, International Journal of Computer Assisted Radiology and Surgery



International Journal of Computer Assisted Radiology and Surgery (IF 2.3), Pub Date: 2021-01-18, DOI: 10.1007/s11548-021-02311-6
Hongyu Wang, Xiaoying Pan, Hao Zhao, Cong Gao, Ni Liu

Purpose

The automated clipping of surgical nasal endoscopic video is a challenging task because many hard frames have non-discriminative visual features, which leads to misclassification. Prior works mainly classify these hard frames together with all other frames, which seriously degrades classification performance.

Methods

We propose a hard frame detection method based on a convolutional LSTM network, called HFD-ConvLSTM, to remove invalid video frames automatically. First, a new separator based on a coarse-grained classifier is defined to remove the invalid frames. Meanwhile, the hard frames are detected by measuring the blurring score of each video frame. Then, squeeze-and-excitation is used to select the informative spatiotemporal features of the endoscopic videos, and the video frames are further classified by a fine-grained ConvLSTM that learns from the reconstructed training set with hard frames.
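The abstract does not say how the blurring score is computed, so the following is only a minimal sketch of one common stand-in: the variance of the Laplacian response, where a low variance suggests a blurry frame. The function names `laplacian_variance` and `is_hard_frame`, the list-of-rows grayscale representation, and the threshold value are all assumptions made for illustration, not the authors' method.

```python
from statistics import pvariance
from typing import Sequence

# Assumption: a frame is a grayscale image stored as rows of pixel intensities.
Gray = Sequence[Sequence[float]]


def laplacian_variance(frame: Gray) -> float:
    """Return the variance of the 4-neighbour Laplacian over interior pixels.

    Sharp frames have strong edges, so the response varies widely;
    blurred frames give a response close to constant (low variance).
    """
    height, width = len(frame), len(frame[0])
    if height < 3 or width < 3:
        raise ValueError("frame must be at least 3x3 pixels")
    responses = [
        frame[y - 1][x] + frame[y + 1][x] + frame[y][x - 1] + frame[y][x + 1]
        - 4 * frame[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def is_hard_frame(frame: Gray, threshold: float = 100.0) -> bool:
    """Flag a frame as hard (blurry) when its sharpness falls below a threshold.

    The threshold is a hypothetical value; in practice it would be tuned
    on labelled frames.
    """
    return laplacian_variance(frame) < threshold


if __name__ == "__main__":
    flat = [[128] * 5 for _ in range(5)]
    checker = [[255 if (x + y) % 2 else 0 for x in range(5)] for y in range(5)]
    print(is_hard_frame(flat), is_hard_frame(checker))
```

A uniform frame has no edges and is flagged as hard, while a high-contrast checkerboard is not. Real video frames would normally be scored with an image library rather than nested lists.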

Results

We validate the proposed solution through extensive experiments on 12 surgical videos (duration: 8501 s). The experiments cover both hard frame detection and video frame classification. Nearly 88.3% of fuzzy frames are detected, and the classification accuracy is boosted to 95.2%. HFD-ConvLSTM achieves superior performance compared with other methods.

Conclusion

HFD-ConvLSTM provides a new paradigm for video clipping by breaking the complex clipping problem into smaller, more easily managed binary classification problems. Our investigation reveals that hard frame detection based on blurring score calculation is effective for nasal endoscopic video clipping.
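The idea of reducing clipping to a chain of two-class decisions can be sketched as follows. This is an illustrative assumption about how such a cascade might be wired, not the published pipeline: `clip_segments`, `is_invalid`, and `is_keeper` are hypothetical names, and the two predicates stand in for the coarse-grained separator and the fine-grained classifier.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def clip_segments(
    frames: Sequence[Frame],
    is_invalid: Callable[[Frame], bool],
    is_keeper: Callable[[Frame], bool],
) -> List[Tuple[int, int]]:
    """Group consecutive kept frames into (start, end) index pairs, inclusive.

    Stage 1 (is_invalid) discards invalid frames; stage 2 (is_keeper) is
    only consulted for frames that survive stage 1. Both are binary
    decisions supplied by the caller.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for index, frame in enumerate(frames):
        keep = (not is_invalid(frame)) and is_keeper(frame)
        if keep and start is None:
            start = index  # a new clip begins here
        elif not keep and start is not None:
            segments.append((start, index - 1))  # the clip ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(frames) - 1))  # close a clip that runs to the end
    return segments


if __name__ == "__main__":
    # Toy labels: "x" = invalid, "k" = keep, "d" = valid but dropped by stage 2.
    labels = ["x", "k", "k", "d", "k", "x", "k", "k"]
    print(clip_segments(labels, lambda f: f == "x", lambda f: f == "k"))
```

Because each stage answers only one yes/no question, either predicate can be swapped out or retrained without touching the grouping logic.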
