EveTAR: building a large-scale multi-task test collection over Arabic tweets, Information Retrieval Journal



Information Retrieval Journal (IF 1.7), Pub Date: 2017-12-21, DOI: 10.1007/s10791-017-9325-7
Maram Hasanain, Reem Suwaileh, Tamer Elsayed, Mucahid Kutlu, Hind Almerekhi

This article introduces a new language-independent approach for creating a large-scale, high-quality test collection of tweets that supports multiple information retrieval (IR) tasks without running a shared-task campaign. The adopted approach (demonstrated over Arabic tweets) designs the collection around significant (i.e., popular) events, which enables the development of topics that represent frequent information needs of Twitter users and for which rich content exists. That inherently facilitates the support of multiple tasks that generally revolve around events, namely event detection, ad-hoc search, timeline generation, and real-time summarization. The key highlights of the approach include diversifying the judgment pool via interactive search and multiple manually crafted queries per topic, collecting high-quality annotations via crowd-workers for relevance and in-house annotators for novelty, filtering out low-agreement topics and inaccessible tweets, and providing multiple subsets of the collection for better availability. Applying our methodology to Arabic tweets resulted in EveTAR, the first freely available tweet test collection for multiple IR tasks. EveTAR includes a crawl of 355M Arabic tweets and covers 50 significant events for which about 62K tweets were judged with substantial average inter-annotator agreement (Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating existing algorithms in the respective tasks. Results indicate that the new collection can support reliable ranking of IR systems that is comparable to similar TREC collections, while providing strong baseline results for future studies over Arabic tweets.
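The agreement-based topic filtering mentioned in the abstract can be illustrated with a minimal sketch. It assumes two annotators per topic and Cohen's kappa; the abstract only reports "Kappa" and does not say which variant or threshold was used, so the function names, the data layout, and the `min_kappa` cutoff below are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def filter_topics(judgments, min_kappa):
    """Return {topic: kappa} for topics whose agreement reaches min_kappa.

    `judgments` maps a topic id to a pair (labels_a, labels_b).
    `min_kappa` is a hypothetical cutoff; the source does not state one.
    """
    kept = {}
    for topic, (labels_a, labels_b) in judgments.items():
        kappa = cohen_kappa(labels_a, labels_b)
        if kappa >= min_kappa:
            kept[topic] = kappa
    return kept
```

For example, calling `filter_topics` with a cutoff of 0.6 on two toy topics, one with identical labels and one where the annotators disagree on a single tweet out of four, keeps only the first topic.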


Updated: 2017-12-21
