Information Retrieval on the Blogosphere
Foundations and Trends in Information Retrieval (IF 10.4), Pub Date: 2012-07-29, DOI: 10.1561/1500000026
Rodrygo L. T. Santos

Blogs have recently emerged as a new open, rapidly evolving, and reactive publishing medium on the Web. Rather than being managed by a central entity, the content on the blogosphere (the collection of all blogs on the Web) is produced by millions of independent bloggers, who can write about virtually anything. This open publishing paradigm has led to a growing mass of user-generated content on the Web, which can vary tremendously in both format and quality when looked at in isolation, but which can also reveal interesting patterns when observed in aggregate. One field particularly interested in studying how information is produced, consumed, and searched in the blogosphere is information retrieval. In this survey, we review the published literature on searching the blogosphere. In particular, we describe the phenomenon of blogging and the motivations for searching for information on blogs. We cover both the search tasks underlying blog searchers' information needs and the most successful approaches to these tasks. These include blog post and full blog search tasks, as well as blog-aided search tasks, such as trend and market analysis. Finally, we describe the publicly available resources that support research on searching the blogosphere.

Disclaimer: Certain companies and/or products are identified in this paper in order to describe concepts and to specify experimental procedures adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the companies or products identified are necessarily the best available for the purpose.



Updated: 2012-07-29