Generalized framework for summarization of fixed-camera lecture videos by detecting and binarizing handwritten content
International Journal on Document Analysis and Recognition (IF 2.3), Pub Date: 2019-06-15, DOI: 10.1007/s10032-019-00327-y
Bhargava Urala Kota, Kenny Davila, Alexander Stone, Srirangaraj Setlur, Venu Govindaraju

We propose a framework to extract and binarize handwritten content in lecture videos. The extracted content could be used to index video collections, powering content-based search and navigation within lecture videos and helping students and educators across the world. A deep learning pipeline detects handwritten text, formulae and sketches, and then binarizes the extracted content. We exploit the spatio-temporal structure of our binarized detections to compute associativity information of content across all video frames; this information is later used to segment the video. Experiments compare the performance of key components of our framework in isolation, as well as their impact on overall performance, against existing methods. We evaluate our framework on the publicly available AccessMath lecture video dataset, obtaining an f-measure of \(94.32\%\) for binary connected components. Code for the framework (including trained weights) and summarization will be released.
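
For readers unfamiliar with the reported metric, the following is a minimal sketch of how an f-measure can be computed from match counts, assuming the usual balanced definition (the harmonic mean of precision and recall). The function name `f_measure` and the example counts are hypothetical; the abstract does not state how connected components are matched against the ground truth, so this is not the authors' evaluation code.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced f-measure (F1): harmonic mean of precision and recall.

    Assumption: the counts come from some matching of predicted binary
    connected components against ground-truth components. The matching
    rule itself is not specified in the abstract.
    """
    predicted = true_positives + false_positives  # components the system produced
    actual = true_positives + false_negatives     # components in the ground truth
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the formula.
    print(f"{f_measure(90, 10, 10):.4f}")
```

With these illustrative counts, precision and recall are both 0.9, so the f-measure is also 0.9.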

Updated: 2019-06-15