Upgrading the Newsroom
ACM Transactions on Multimedia Computing, Communications, and Applications (IF 5.2) Pub Date: 2020-07-06, DOI: 10.1145/3396520
Fangyu Liu 1 , Rémi Lebret 2 , Didier Orel 3 , Philippe Sordet 3 , Karl Aberer 2

We propose an automated image selection system to assist photo editors in selecting suitable images for news articles. The system fuses multiple textual sources extracted from news articles and accepts multilingual inputs. It uses character-level word embeddings, which help both to model morphologically rich languages such as German and to transfer knowledge across closely related languages. The text encoder adopts a hierarchical self-attention mechanism that attends to key words within a piece of text and to the informative components of a news article. We evaluate the system extensively on a large-scale text-image database of multimodal, multilingual news articles collected from Swiss local news media websites. The system is compared against multiple baselines, with ablation studies, and is shown to outperform existing text-image retrieval methods in a weakly supervised learning setting. We also offer insights into the advantages of using multiple textual sources and multilingual data.
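
The two-level attention idea in the abstract (attend over words within each piece of text, then over the textual components of an article) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dot-product scoring, the fixed query vectors standing in for learned parameters, and the toy two-dimensional word vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Pool vectors into one vector, weighting each by softmax(dot(vector, query)).

    `query` is a hypothetical stand-in for a learned attention parameter.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_article(sources, word_query, source_query):
    """Hierarchical pooling: words -> one vector per source -> one article vector.

    `sources` maps a source name (e.g. "title") to a list of word vectors.
    Returns the article vector and the attention weight given to each source.
    """
    names = list(sources)
    # Level 1: attend over the words inside each textual source.
    source_vectors = [attention_pool(sources[n], word_query)[0] for n in names]
    # Level 2: attend over the textual sources of the article.
    article_vector, source_weights = attention_pool(source_vectors, source_query)
    return article_vector, dict(zip(names, source_weights))


# Toy input: two textual sources with made-up 2-d word vectors.
toy_article = {
    "title": [[1.0, 0.0], [0.0, 1.0]],
    "caption": [[0.5, 0.5]],
}
article_vector, source_weights = encode_article(
    toy_article, word_query=[1.0, 0.0], source_query=[0.0, 1.0]
)
print(article_vector, source_weights)
```

In a trained system the word vectors would come from the character-level embeddings and the queries would be learned; the sketch only shows how the two attention levels compose.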

Updated: 2020-07-06