Generating visual story graphs with application to photo album summarization
Signal Processing: Image Communication (IF 3.4), Pub Date: 2020-10-16, DOI: 10.1016/j.image.2020.116033
Bora Celikkale, Goksu Erdogan, Aykut Erdem, Erkut Erdem

Making sense of the ever-growing amount of visual data available on the web is difficult, especially in an unsupervised setting. As a step towards this goal, this study tackles the relatively unexplored problem of generating structured summaries of large photo collections. Our framework relies on the notion of a story graph, which captures the main narratives in the data and the relationships among them based on visual, textual and spatio-temporal features. Its output is a directed graph with a set of possibly intersecting paths. Our proposed approach identifies coherent visual storylines and exploits submodularity to select a subset of these storylines that maximally covers the overall narrative. Our experimental analysis shows that the extracted story graphs yield better results when used as priors for photo album summarization. Moreover, our user studies show that our approach performs better than the state of the art on next-image prediction and coverage tasks.
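
The abstract does not spell out the objective used for storyline selection, so the following is only a minimal sketch of the general idea of submodular subset selection, not the authors' method. It assumes a plain set-coverage objective (the number of distinct photos covered by the chosen storylines), which is monotone and submodular, and maximizes it with the standard greedy heuristic. The function name `greedy_select`, the `budget` parameter, and the toy storylines are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_select(storylines: Dict[str, Set[int]], budget: int) -> List[str]:
    """Greedily pick at most `budget` storylines to maximize covered photos.

    `storylines` maps a storyline name to the set of photo ids on its path.
    Coverage (size of the union) is an assumed objective for illustration.
    """
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(budget):
        best_name, best_gain = None, 0
        # Sorted iteration keeps tie-breaking deterministic.
        for name in sorted(storylines):
            if name in chosen:
                continue
            # Marginal gain: photos this storyline adds beyond what is covered.
            gain = len(storylines[name] - covered)
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            # No remaining storyline adds new photos, so stop early.
            break
        chosen.append(best_name)
        covered |= storylines[best_name]
    return chosen


# Toy example: four intersecting storylines over six photos.
lines = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 2},
}
selected = greedy_select(lines, budget=2)
print(selected)
```

In this toy case the greedy pass first takes the storyline that covers the most photos and then the one that adds the most photos not yet covered, which is why overlapping storylines are penalized. For monotone submodular objectives under a cardinality constraint, this greedy strategy is known to achieve at least a (1 - 1/e) fraction of the optimal value.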




Updated: 2020-10-30