Dense feature pyramid network for cartoon dog parsing, The Visual Computer


The Visual Computer (IF 3.0), Pub Date: 2020-07-09, DOI: 10.1007/s00371-020-01887-5
Jerome Wan, Guillaume Mougeot, Xubo Yang

While traditional cartoon character drawings are simple for humans to create, they remain highly challenging for machines to interpret. Parsing addresses this problem through fine-grained semantic segmentation of images. Although parsing is well studied on naturalistic images, research on cartoon parsing is very sparse. Because of the lack of available datasets and the diversity of artwork styles, cartoon character parsing is more difficult than the well-known human parsing task. In this paper, we study one type of cartoon instance: cartoon dogs. We introduce a novel dataset for cartoon dog parsing and create a new deep convolutional neural network (DCNN) to tackle the problem. Our dataset contains 965 precisely annotated cartoon dog images with seven semantic part labels. Our new model, called dense feature pyramid network (DFPnet), makes use of recent popular semantic segmentation techniques to handle cartoon dog parsing efficiently. We achieve an mIoU of 68.39%, a Mean Accuracy of 79.4% and a Pixel Accuracy of 93.5% on our cartoon dog validation set. Our method outperforms state-of-the-art models for similar tasks when trained on our dataset: CE2P for single human parsing and Mask R-CNN for instance segmentation. We hope this work can serve as a starting point for future research on digital artwork understanding with DCNNs. Our DFPnet and dataset will be publicly available.
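
For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how Pixel Accuracy, Mean Accuracy and mIoU are commonly computed from a confusion matrix. It assumes the standard definitions; the abstract does not describe the exact evaluation protocol, so the handling of background pixels and of classes absent from an image may differ from what the authors used. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Rows index the ground-truth label, columns the predicted label."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def parsing_metrics(matrix: List[List[int]]) -> Tuple[float, float, float]:
    """Return (pixel accuracy, mean accuracy, mIoU) from a confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[c][c] for c in range(n))
    accuracies, ious = [], []
    for c in range(n):
        truth_count = sum(matrix[c])                      # pixels labelled c
        pred_count = sum(matrix[r][c] for r in range(n))  # pixels predicted c
        union = truth_count + pred_count - matrix[c][c]
        # Assumption: classes that never occur are skipped, not counted as 0.
        if truth_count > 0:
            accuracies.append(matrix[c][c] / truth_count)
        if union > 0:
            ious.append(matrix[c][c] / union)
    return (correct / total,
            sum(accuracies) / len(accuracies),
            sum(ious) / len(ious))


# Toy example: 8 pixels flattened to a list, 3 hypothetical part labels.
truth = [0, 0, 0, 0, 1, 1, 2, 2]
pred = [0, 0, 0, 1, 1, 1, 2, 0]
pixel_acc, mean_acc, miou = parsing_metrics(confusion_matrix(truth, pred, 3))
print(f"pixel acc={pixel_acc:.3f} mean acc={mean_acc:.3f} mIoU={miou:.3f}")
```

Pixel Accuracy is dominated by large regions, which is one reason a model can score well above 90% on it while its mIoU, which weights every part label equally, is considerably lower.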


Updated: 2020-07-09
