Incorporating citation impact into analysis of research trends

Scientometrics

Abstract

In the past decades, there have been a number of proposals to apply topic modeling to research trend analysis. However, most previous studies have relied primarily on document publication year and have not incorporated the impact of articles into trend analysis. Unlike previous trend analyses based on topic modeling, we incorporate citation count, which can be viewed as a measure of an article's impact, to shed new light on research trends. To this end, we propose the generalized Dirichlet multinomial regression (g-DMR) topic model, which improves the DMR topic model by replacing the linear inner product in the topic priors, \(\exp\left(\boldsymbol{x}_{d}\cdot \boldsymbol{\lambda}_{t}\right)\), with a more general form based on a topic distribution function (TDF), \(\exp\left(\mathrm{f}\left(\boldsymbol{x}_{d}\right)\right)+\varepsilon\). We use multidimensional Legendre polynomials as the TDF to capture publication year and the number of citations per publication simultaneously. In the DMR model, metadata can affect the document-topic distribution only monotonically, and continuous values such as publication year and citation count need to be discretized, so it is difficult to observe the dynamic change of each topic. The g-DMR model, by contrast, can handle various orthogonal continuous variables with polynomials of arbitrary order, so it can show more dynamic topic trends. Two major experiments show that the proposed model is better suited than DMR to topic generation that takes citation impact into account, for trend analysis in the field of Library and Information Science in general and Text Mining in particular.
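
The difference between the two priors described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-topic coefficient array `coef`, the scaling of metadata onto [-1, 1], and the value of `eps` are all assumptions made for illustration. It only shows that a DMR-style prior is monotone in each feature, while a Legendre-based TDF of order 2 or higher can rise and fall over the range of publication year or citation count.

```python
import numpy as np
from numpy.polynomial import legendre


def scale_to_unit_interval(values, lo, hi):
    """Map raw metadata from [lo, hi] onto [-1, 1], the interval on which
    Legendre polynomials are orthogonal (the range [lo, hi] is assumed)."""
    values = np.asarray(values, dtype=float)
    return 2.0 * (values - lo) / (hi - lo) - 1.0


def dmr_prior(x_d, lam):
    """DMR-style prior: alpha[t] = exp(x_d . lambda_t).

    x_d has shape (F,), lam has shape (K, F); the result has shape (K,).
    """
    return np.exp(lam @ x_d)


def gdmr_prior(year, cites, coef, eps=1e-10):
    """g-DMR-style prior: alpha[t] = exp(f_t(year, cites)) + eps.

    f_t is taken here to be a two-dimensional Legendre series whose
    coefficients coef[t, i, j] multiply P_i(year) * P_j(cites).
    Both inputs are expected to be scaled to [-1, 1] already.
    The per-topic layout of coef and the default eps are hypothetical.
    """
    return np.array(
        [np.exp(legendre.legval2d(year, cites, c)) for c in coef]
    ) + eps


# Hypothetical example: one topic whose TDF is P_2(year), so the prior is
# high at both ends of the time range and lower in the middle.
coef = np.zeros((1, 3, 1))
coef[0, 2, 0] = 1.0
years = scale_to_unit_interval([2000, 2010, 2020], lo=2000, hi=2020)
alphas = [gdmr_prior(y, 0.0, coef)[0] for y in years]
```

In this sketch the list `alphas` is not monotone in the year, which a single linear term inside the exponential cannot reproduce without first discretizing the variable.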

Notes

  1. https://www.chokkan.org/software/liblbfgs/.

Acknowledgements

This work was supported by the National Research Foundation of Korea Grant funded by the Korean Government (NRF-2018S1A3A2075114).

Author information

Corresponding author

Correspondence to Min Song.

Appendices

Appendix 1

See Table 10.

Table 10 Topic–word results from text mining dataset at K = 30

Appendix 2

See Table 11.

Table 11 Topic–word results from LIS dataset at K = 40

Appendix 3

See Table 12.

Table 12 Welch’s t-test result between topic coherence of DMR and of g-DMR

About this article

Cite this article

Lee, M., Song, M. Incorporating citation impact into analysis of research trends. Scientometrics 124, 1191–1224 (2020). https://doi.org/10.1007/s11192-020-03508-3

  • DOI: https://doi.org/10.1007/s11192-020-03508-3
