ARQMath Lab: An Incubator for Semantic Formula Search in zbMATH Open?
arXiv - CS - Digital Libraries Pub Date : 2020-12-04 , DOI: arxiv-2012.02413
Philipp Scharpf, Moritz Schubotz, Andre Greiner-Petter, Malte Ostendorff, Olaf Teschke, Bela Gipp

The zbMATH database contains more than 4 million bibliographic entries. We aim to provide easy access to these entries. Therefore, we maintain different index structures, including a formula index. To optimize the findability of the entries in our database, we continuously investigate new approaches to satisfy the information needs of our users. We believe that the findings from the ARQMath evaluation will generate new insights into which index structures are most suitable to satisfy mathematical information needs. Search engines, recommender systems, plagiarism checking software, and many other added-value services acting on databases such as the arXiv and zbMATH need to combine natural and formula language. One initial approach to address this challenge is to enrich the mostly unstructured document data via Entity Linking. The ARQMath Task at CLEF 2020 aims to tackle the problem of linking newly posted questions from Math Stack Exchange (MSE) to existing ones that were already answered by the community. To gain a deeper understanding of MSE information needs, answer types, and formula types, we performed manual runs for Tasks 1 and 2. Furthermore, for Task 2 we explored several formula retrieval methods, such as fuzzy string search, k-nearest neighbors, and our recently introduced approach to retrieve Mathematical Objects of Interest (MOI) with textual search queries. The task results show that neither our automated methods nor our manual runs achieved good scores in the competition. However, the perceived quality of the hits returned by the MOI search particularly motivates us to conduct further research about MOI.

Updated: 2020-12-07