Incorporating citation impact into analysis of research trends
Scientometrics (IF 3.5), Pub Date: 2020-05-18, DOI: 10.1007/s11192-020-03508-3
Minchul Lee, Min Song

In the past decades, there have been a number of proposals to apply topic modeling to research trend analysis. However, most previous studies have relied primarily on document publication year and have not incorporated the impact of articles into trend analysis. Unlike previous trend analyses using topic modeling, we incorporate citation count, which can be viewed as the impact of an article, into trend analysis to shed new light on research trends. To this end, we propose the Generalized Dirichlet multinomial regression (g-DMR) topic model, which improves the DMR topic model by replacing the linear inner product in the topic priors, $$\exp\left(\boldsymbol{x}_{d}\cdot \boldsymbol{\lambda}_{t}\right),$$ with a more general form based on a topic distribution function (TDF), $$\exp\left(f\left(\boldsymbol{x}_{d}\right)\right)+\varepsilon.$$ We use a multidimensional Legendre polynomial as the TDF to capture publication year and the number of citations per publication simultaneously. In the DMR model, metadata can affect the document-topic distribution only monotonically, and continuous values such as publication year and citation count need to be discretized, so it is difficult to view the dynamic change of each topic. The g-DMR model, by contrast, can handle multiple orthogonal continuous variables with polynomials of arbitrary order, so it can show more dynamic topic trends. Two major experiments show that the proposed model is better suited than DMR to generating topics that take citation impact into account, in trend analyses of Library and Information Science in general and Text Mining in particular.
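
The difference between the two priors can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the rescaling of metadata to [-1, 1], the coefficient layout, and the value of ε are all assumptions made for illustration, since the abstract does not specify them. The TDF is written here as a two-dimensional Legendre series over (publication year, citation count), evaluated with NumPy's `legval2d`.

```python
import numpy as np
from numpy.polynomial.legendre import legval2d


def scale_to_unit_interval(values, lo, hi):
    """Map values from [lo, hi] to [-1, 1], the domain of Legendre polynomials.

    Assumption: the abstract does not say how metadata is rescaled.
    """
    values = np.asarray(values, dtype=float)
    return 2.0 * (values - lo) / (hi - lo) - 1.0


def dmr_prior(x, lam):
    """DMR prior: alpha[d, t] = exp(x_d . lambda_t).

    x has shape (D, F) and lam has shape (T, F).
    """
    return np.exp(x @ lam.T)


def gdmr_prior(year, cites, coef, eps=1e-10):
    """g-DMR-style prior: alpha[d, t] = exp(f_t(x_d)) + eps.

    f_t is a 2-D Legendre series with coefficients coef[t] of shape (I, J):
    f_t(y, c) = sum_ij coef[t, i, j] * P_i(y) * P_j(c).
    The eps default is a hypothetical choice.
    """
    return np.stack(
        [np.exp(legval2d(year, cites, c_t)) + eps for c_t in coef], axis=1
    )


# Three documents published in 2000, 2010 and 2020 (toy data).
year = scale_to_unit_interval([2000, 2010, 2020], 2000, 2020)
cites = scale_to_unit_interval([0, 50, 100], 0, 100)

# One topic whose TDF is P_2(year) = (3 * year**2 - 1) / 2, constant in citations.
coef = np.zeros((1, 3, 1))
coef[0, 2, 0] = 1.0

alpha_gdmr = gdmr_prior(year, cites, coef)
alpha_dmr = dmr_prior(year[:, None], np.array([[1.0]]))
print(alpha_gdmr[:, 0])  # dips in the middle year, then recovers
print(alpha_dmr[:, 0])   # can only rise or fall monotonically with year
```

With a second-order term the g-DMR-style prior for a topic can fall and then rise again over the years, which a single inner product with the raw year value cannot express.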

Updated: 2020-05-18