Multistage BiCross Encoder: Team GATE Entry for MLIA Multilingual Semantic Search Task 2
arXiv - CS - Information Retrieval, Pub Date: 2021-01-08, DOI: arxiv-2101.03013
Iknoor Singh, Carolina Scarton, Kalina Bontcheva

The Coronavirus (COVID-19) pandemic has led to a rapidly growing "infodemic" online. Accurate retrieval of reliable, relevant data from millions of documents about COVID-19 has therefore become urgently needed by the general public as well as by other stakeholders. The COVID-19 Multilingual Information Access (MLIA) initiative is a joint effort to improve the exchange of COVID-19-related information by developing applications and services through research and community participation. In this work, we present a search system called Multistage BiCross Encoder, developed by team GATE for MLIA Task 2, Multilingual Semantic Search. Multistage BiCross Encoder is a sequential three-stage pipeline that uses the Okapi BM25 algorithm and a transformer-based bi-encoder and cross-encoder to rank documents with respect to the query. The results of round 1 show that our models achieve state-of-the-art performance on all ranking metrics for both monolingual and bilingual runs.
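
To make the three-stage idea concrete, the following is a minimal sketch of a BM25, then bi-encoder, then cross-encoder cascade. It is not the authors' implementation. The BM25 stage is a plain textbook Okapi BM25 written from scratch. The functions `toy_bi_score` and `toy_cross_score` are hypothetical stand-ins for the transformer models (bag-of-words cosine and query-term coverage), and the names `multistage_rank`, `first_k`, and `second_k`, along with the cut-off values, are assumptions made purely for illustration; the abstract does not state the candidate counts or the models used.

```python
import math
from collections import Counter


def tokenize(text):
    """Whitespace tokenizer; a real system would use a proper analyzer."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return the Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    if n == 0:
        return []
    avgdl = sum(len(t) for t in tokenized) / n or 1.0
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f == 0:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(score)
    return scores


def toy_bi_score(query, doc):
    """Stand-in for a bi-encoder: encode each text separately, then compare."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0


def toy_cross_score(query, doc):
    """Stand-in for a cross-encoder: score the (query, document) pair jointly."""
    q_terms = set(tokenize(query))
    d_terms = set(tokenize(doc))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def multistage_rank(query, docs, bi_score, cross_score, first_k=100, second_k=10):
    """Rank document indices with a three-stage cascade (assumed cut-offs)."""
    # Stage 1: cheap lexical retrieval over the whole collection.
    s1 = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: s1[i], reverse=True)[:first_k]
    # Stage 2: re-score the survivors with the (stand-in) bi-encoder.
    s2 = {i: bi_score(query, docs[i]) for i in candidates}
    candidates = sorted(candidates, key=lambda i: s2[i], reverse=True)[:second_k]
    # Stage 3: re-score the short list with the (stand-in) cross-encoder.
    s3 = {i: cross_score(query, docs[i]) for i in candidates}
    return sorted(candidates, key=lambda i: s3[i], reverse=True)


docs = [
    "covid vaccine side effects reported in adults",
    "football league results and fixtures",
    "vaccine rollout schedule for covid",
    "weather forecast for the weekend",
]
ranking = multistage_rank("covid vaccine side effects", docs,
                          toy_bi_score, toy_cross_score, first_k=3, second_k=2)
print(ranking)  # [0, 2]
```

The point of the cascade is cost: each stage is more expensive per document than the one before it, so each runs on a smaller candidate set. In a real system the two stand-in scorers would be replaced by trained transformer models.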

Updated: 2021-01-11