MMED: A Multi-domain and Multi-modality Event Dataset
Information Processing & Management (IF 7.4), Pub Date: 2020-06-18, DOI: 10.1016/j.ipm.2020.102315
Zhenguo Yang, Zehang Lin, Lingni Guo, Qing Li, Wenyin Liu

In this work, we release a multi-domain and multi-modality event dataset (MMED), containing 25,052 textual news articles collected from hundreds of news media sites (e.g., Yahoo News and BBC News) and 75,884 image posts shared on Flickr by thousands of social media users. The articles contributed by professional journalists and the images shared by amateur users are annotated according to 410 real-world events, covering emergencies, natural disasters, sports, ceremonies, elections, protests, military intervention, economic crises, etc. The MMED dataset is collected following the principles of high relevance in supporting application needs, a wide range of event types, non-ambiguity of the event labels, imbalanced event clusters, and difficulty in discriminating the event labels. The dataset can stimulate innovative research on related challenging problems, such as (weakly aligned) cross-modal retrieval and cross-domain event discovery, and inspire work on visual relation mining and reasoning. For comparison, 15 baselines for two scenarios have been quantitatively and qualitatively evaluated using the dataset.




Updated: 2020-06-18