Learning Compositional Neural Information Fusion for Human Parsing
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2020-01-19, DOI: arxiv-2001.06804
Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, and Ling Shao

This work proposes to combine neural networks with the compositional hierarchy of human bodies for efficient and complete human parsing. We formulate the approach as a neural information fusion framework. Our model assembles the information from three inference processes over the hierarchy: direct inference (directly predicting each part of a human body using image information), bottom-up inference (assembling knowledge from constituent parts), and top-down inference (leveraging context from parent nodes). The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively. In addition, the fusion of multi-source information is conditioned on the inputs, i.e., it estimates the confidence of each source and weights the source accordingly. The whole model is end-to-end differentiable and explicitly models information flows and structures. Our approach is extensively evaluated on four popular datasets, outperforming the state of the art in all cases, with a fast processing speed of 23 fps. Our code and results have been released to help ease future research in this direction.
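
As a rough illustration of the fusion idea described in the abstract, the toy sketch below combines a direct, a bottom-up, and a top-down estimate for one node of a body hierarchy using confidence weights. It is not the authors' released code. The hierarchy, the names `HIERARCHY`, `parent_of`, `fuse`, and `fused_score`, the max-over-children rule, and the plain weighted average are all assumptions made for illustration; the paper's model learns these components with neural networks over feature maps.

```python
# Toy, single-pixel sketch of confidence-weighted fusion over a part hierarchy.
# Everything here is hypothetical and hand-written; nothing is learned.

# Assumed hierarchy: each parent maps to its constituent parts.
HIERARCHY = {
    "full_body": ["upper_body", "lower_body"],
    "upper_body": ["head", "torso", "arm"],
    "lower_body": ["leg"],
}


def parent_of(node):
    """Return the parent of `node` in HIERARCHY, or None for the root."""
    for parent, children in HIERARCHY.items():
        if node in children:
            return parent
    return None


def fuse(sources):
    """Confidence-weighted average of (score, confidence) pairs."""
    total = sum(conf for _, conf in sources)
    if total == 0:
        return 0.0
    return sum(score * conf for score, conf in sources) / total


def fused_score(node, direct):
    """Fuse up to three estimates for `node`.

    `direct` maps every node to a (score, confidence) pair, standing in for
    the output of a direct, image-based predictor.
    """
    sources = [direct[node]]  # direct inference

    # Bottom-up (assumed rule): take the strongest constituent part.
    children = HIERARCHY.get(node, [])
    if children:
        best_child = max(children, key=lambda c: direct[c][0])
        sources.append(direct[best_child])

    # Top-down (assumed rule): use the parent's estimate as context.
    parent = parent_of(node)
    if parent is not None:
        sources.append(direct[parent])

    return fuse(sources)


direct = {
    "full_body": (0.9, 1.0),
    "upper_body": (0.4, 0.5),  # weak, low-confidence direct prediction
    "lower_body": (0.1, 1.0),
    "head": (0.8, 1.0),
    "torso": (0.2, 1.0),
    "arm": (0.1, 1.0),
    "leg": (0.1, 1.0),
}
upper = fused_score("upper_body", direct)
```

In this example the low-confidence direct score for `upper_body` is pulled upward by the confident `head` and `full_body` estimates, which is the qualitative behavior that confidence-conditioned fusion is meant to capture.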

Updated: 2020-01-22