Foundations and Trends in Information Retrieval (IF 10.4), Pub Date: 2009-12-16, DOI: 10.1561/1500000019
Stephen Robertson, Hugo Zaragoza
The Probabilistic Relevance Framework (PRF) is a formal framework for document retrieval, grounded in work done in the 1970s and 1980s, which led to the development of one of the most successful text-retrieval algorithms, BM25. In recent years, research in the PRF has yielded new retrieval models capable of taking into account document meta-data (especially structure and link-graph information). This in turn has led to BM25F, one of the most successful Web-search and corporate-search algorithms. This work presents the PRF from a conceptual point of view, describing the probabilistic modelling assumptions behind the framework and the different ranking algorithms that result from its application: the binary independence model, relevance feedback models, BM25 and BM25F. It also discusses the relation between the PRF and other statistical models for IR, and covers some related topics, such as the use of non-textual features and parameter optimisation for models with free parameters.
Title: The Probabilistic Relevance Framework: BM25 and Beyond
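The abstract names BM25 without stating its formula. As a rough illustration only, the following Python sketch implements a commonly cited form of the BM25 ranking function. It is not taken from the monograph itself: the function name `bm25_scores`, the defaults `k1=1.2` and `b=0.75`, and the particular IDF variant (with 1 added inside the logarithm so that weights stay positive) are all assumptions chosen for the example. The constant factor `(k1 + 1)` that appears in some presentations is omitted, since it does not change the ranking.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25-style score per document.

    `query` is a list of terms; `docs` is a list of tokenised documents
    (each a list of terms). k1 and b are free parameters; the defaults
    here are illustrative assumptions, not values from the source.
    """
    if not docs:
        return []
    n_docs = len(docs)
    # Average document length; fall back to 1.0 if every document is empty.
    avgdl = sum(len(d) for d in docs) / n_docs or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalisation: b=0 disables it, b=1 applies it fully.
        norm = k1 * (1 - b + b * len(d) / avgdl)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            # IDF-like term weight (assumed variant, always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating term-frequency component.
            score += idf * tf[term] / (tf[term] + norm)
        scores.append(score)
    return scores


docs = [
    ["bm25", "ranking", "function"],
    ["probabilistic", "relevance", "framework"],
    ["bm25", "bm25", "scores", "documents"],
]
print(bm25_scores(["bm25"], docs))
```

In this toy collection the second document does not contain the query term and so scores zero, while the third document, which repeats the term, scores higher than the first despite being slightly longer. BM25F, which the abstract describes as taking document structure into account, is not sketched here.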