On the Replicability of Combining Word Embeddings and Retrieval Models,arXiv - CS - Information Retrieval


On the Replicability of Combining Word Embeddings and Retrieval Models
arXiv - CS - Information Retrieval. Pub Date: 2020-01-13, DOI: arxiv-2001.04484
Luca Papariello, Alexandros Bampoulidis, Mihai Lupu

We replicate recent experiments attempting to demonstrate an attractive hypothesis about the use of the Fisher kernel framework and mixture models for aggregating word embeddings towards document representations and the use of these representations in document classification, clustering, and retrieval. Specifically, the hypothesis was that the use of a mixture model of von Mises-Fisher (VMF) distributions instead of Gaussian distributions would be beneficial because of the focus on cosine distances of both VMF and the vector space model traditionally used in information retrieval. Previous experiments had validated this hypothesis. Our replication was not able to validate it, despite a large parameter scan space.
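The link between VMF distributions and cosine distance mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not cover the Fisher kernel step. It assumes unit-normalized embeddings and a single concentration `kappa` shared by all components, in which case the VMF normalizing constant cancels and the soft assignment of a word to a component depends only on its cosine similarity to the component mean. The function names, the toy vectors, and the value of `kappa` are hypothetical.

```python
import numpy as np

def l2_normalize(X):
    """Project each row onto the unit sphere."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.clip(norms, 1e-12, None)

def vmf_responsibilities(X, mus, kappa, weights):
    """Posterior component probabilities under a VMF mixture.

    Assumption: every component shares the same concentration kappa,
    so the normalizing constant is identical across components and
    drops out of the posterior.
    """
    X = l2_normalize(np.asarray(X, dtype=float))
    mus = l2_normalize(np.asarray(mus, dtype=float))
    # For unit vectors, X @ mus.T is the cosine similarity matrix.
    logits = np.log(weights) + kappa * (X @ mus.T)
    # Subtract the row maximum for a numerically stable softmax.
    logits -= logits.max(axis=1, keepdims=True)
    resp = np.exp(logits)
    return resp / resp.sum(axis=1, keepdims=True)

# Toy 2-d "word embeddings" and two component means (made-up values).
words = np.array([[1.0, 0.1], [0.0, 2.0], [3.0, 3.0]])
means = np.array([[1.0, 0.0], [0.0, 1.0]])
resp = vmf_responsibilities(words, means, kappa=10.0,
                            weights=np.array([0.5, 0.5]))
```

Because the inputs are normalized first, the length of an embedding has no effect on its assignment; only its direction matters. A Gaussian mixture fitted to unnormalized vectors would instead be sensitive to Euclidean distance, which is the contrast the hypothesis rests on.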


Updated: 2020-01-15
