SlideImages: A Dataset for Educational Image Classification
arXiv - CS - Computer Vision and Pattern Recognition Pub Date : 2020-01-19 , DOI: arxiv-2001.06823
David Morris, Eric Müller-Budack, Ralph Ewerth

In the past few years, convolutional neural networks (CNNs) have achieved impressive results in computer vision tasks, but this work has mainly focused on photos with natural scene content. Meanwhile, non-sensor-derived images such as illustrations, data visualizations, and figures are typically used to convey complex information or to explore large datasets, yet this kind of image has received little attention in computer vision. CNNs and similar techniques require large volumes of training data. Currently, many document analysis systems are trained in part on scene images because large datasets of educational image data are lacking. In this paper, we address this issue and present SlideImages, a dataset for the task of classifying educational illustrations. SlideImages contains training data collected from various sources, e.g., Wikimedia Commons and the AI2D dataset, and test data collected from educational slides. We have reserved all the actual educational images as a test dataset in order to ensure that approaches using this dataset generalize well to new educational images, and potentially to other domains. Furthermore, we present a baseline system using a standard deep neural architecture and discuss how to deal with the challenge of limited training data.

Updated: 2020-01-22