ExpFinder: An Ensemble Expert Finding Model Integrating $N$-gram Vector Space Model and $\mu$CO-HITS

arXiv - CS - Information Retrieval, Pub Date: 2021-01-18, DOI: arxiv-2101.06821
Yong-Bin Kang, Hung Du, Abdur Rahim Mohammad Forkan, Prem Prakash Jayaraman, Amir Aryani, Timos Sellis

Finding an expert plays a crucial role in driving successful collaborations and speeding up high-quality research development and innovation. However, the rapid growth of scientific publications and digital expertise data makes identifying the right experts a challenging problem. Existing approaches for finding experts on a given topic can be categorised into information retrieval techniques based on vector space models, document language models, and graph-based models. In this paper, we propose $\textit{ExpFinder}$, a new ensemble model for expert finding that integrates a novel $N$-gram vector space model, denoted as $n$VSM, and a graph-based model, denoted as $\textit{$\mu$CO-HITS}$, which is a proposed variation of the CO-HITS algorithm. The key idea of $n$VSM is to exploit a recent inverse document frequency weighting method for $N$-gram words, and $\textit{ExpFinder}$ incorporates $n$VSM into $\textit{$\mu$CO-HITS}$ to achieve expert finding. We comprehensively evaluate $\textit{ExpFinder}$ on four different datasets from academic domains, in comparison with six different expert finding models. The evaluation results show that $\textit{ExpFinder}$ is a highly effective model for expert finding, substantially outperforming all the compared models by 19% to 160.2%.
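To make the general idea concrete, the following is a minimal sketch of how an $N$-gram vector space score could seed a CO-HITS-style propagation over an expert-document graph. It is not the authors' implementation: the abstract does not give the $n$VSM weighting formula or the $\mu$CO-HITS update rule, so this sketch substitutes a plain smoothed TF-IDF over word $N$-grams and a simplified CO-HITS-style iteration. The function names (`ngrams`, `tfidf_scores`, `co_hits`), the mixing weight `lam`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all contiguous word n-grams of length 1..n_max."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf_scores(docs, topic, n_max=2):
    """Score each document for a topic phrase with plain n-gram TF-IDF.

    Assumption: a standard smoothed IDF stands in for the weighting
    method used by nVSM, which the abstract does not specify.
    """
    grams = [Counter(ngrams(d.lower().split(), n_max)) for d in docs]
    df = sum(1 for g in grams if topic in g)
    idf = math.log((1 + len(docs)) / (1 + df)) + 1
    return [g[topic] * idf for g in grams]


def co_hits(edges, doc_prior, lam=0.8, iters=30):
    """Simplified CO-HITS-style propagation on a bipartite graph.

    edges: (expert, doc_index) authorship pairs.
    doc_prior: per-document topic scores used as initial evidence.
    Assumption: this is a generic update, not the muCO-HITS variant.
    """
    experts = sorted({e for e, _ in edges})
    docs_of = {e: [d for x, d in edges if x == e] for e in experts}
    experts_of = {d: [e for e, x in edges if x == d]
                  for d in range(len(doc_prior))}

    # Normalise the priors so each side sums to 1 (when non-zero).
    total = sum(doc_prior) or 1.0
    d0 = [s / total for s in doc_prior]
    e0 = {e: sum(d0[d] for d in docs_of[e]) for e in experts}
    norm = sum(e0.values()) or 1.0
    e0 = {e: v / norm for e, v in e0.items()}

    d_score, e_score = list(d0), dict(e0)
    for _ in range(iters):
        # Each document splits its score evenly among its authors.
        e_score = {e: (1 - lam) * e0[e] + lam * sum(
            d_score[d] / len(experts_of[d]) for d in docs_of[e])
            for e in experts}
        # Each expert splits their score evenly among their documents.
        d_score = [(1 - lam) * d0[d] + lam * sum(
            e_score[e] / len(docs_of[e]) for e in experts_of[d])
            for d in range(len(d0))]
    return e_score


# Toy data, invented for illustration only.
docs = ["deep learning for image retrieval",
        "graph ranking for expert finding",
        "expert finding with deep learning"]
edges = [("alice", 0), ("bob", 1), ("bob", 2), ("carol", 2)]

scores = co_hits(edges, tfidf_scores(docs, "expert finding"))
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this sketch the vector space scores act as the prior evidence on documents, and the propagation step lets an expert's score depend on all the documents they authored rather than on any single one. Whether ExpFinder combines the two components in this way is not stated in the abstract.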

Updated: 2021-01-19
