Building a Manga Dataset “Manga109” With Annotations for Multimedia Applications
IEEE Multimedia (IF 3.2), Pub Date: 2020-04-16, DOI: 10.1109/mmul.2020.2987895
Kiyoharu Aizawa, Azuma Fujimoto, Atsushi Otsubo, Toru Ogawa, Yusuke Matsui, Koki Tsubota, Hikaru Ikuta

Manga, or comics, which are a type of multimodal artwork, have been left behind in the recent trend of deep learning applications because of the lack of a proper dataset. Hence, we built Manga109, a dataset of 109 diverse Japanese comic books (94 authors, 21,142 pages), and made it publicly available for academic use with the authors' permission. We carefully annotated the frames, speech texts, character faces, and character bodies; the total number of annotations exceeds 500,000. The dataset provides numerous manga images and annotations that can be used to train and evaluate machine learning algorithms. In addition to academic use, we obtained further permission for industrial use of a subset of the dataset. In this article, we describe the details of the dataset and present a few examples of multimedia processing applications (detection, retrieval, and generation) that apply existing deep learning methods and are made possible by the dataset.

Updated: 2020-04-16