Abstract
We argue that classic citation-based approaches to clustering scientific documents, such as co-citation and Bibliographic Coupling, fail to leverage the social usage of scientific literature that originates on online information dissemination platforms such as Twitter. In this paper, we present Tweet Coupling, a methodology that measures the similarity between two or more scientific documents when one or more Twitter users mention them in their tweets. We evaluate our proposal on an altmetric dataset consisting of 3081 scientific documents and 8299 unique Twitter users. By applying both Bibliographic Coupling and Tweet Coupling as clustering approaches, we examine the relationship between bibliographically coupled and tweet-coupled scientific documents. Further, using VOSviewer, we show empirically that Tweet Coupling appears to generate more cohesive clusters: it groups similar documents from the subfields of the selected field, whereas Bibliographic Coupling groups cross-disciplinary documents in the same cluster.
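The coupling idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact weighting, so this example assumes that the coupling strength of a document pair is the number of distinct Twitter users who mentioned both documents, by analogy with Bibliographic Coupling, which counts shared references. The function name `tweet_coupling`, the input layout, and the sample identifiers are hypothetical and are not taken from the authors' code.

```python
from collections import defaultdict
from itertools import combinations


def tweet_coupling(mentions):
    """Return {(doc_a, doc_b): number of distinct users who tweeted both}.

    `mentions` is an iterable of (user_id, doc_id) pairs. This counting rule
    is an assumption made for illustration, not the paper's stated formula.
    """
    # Collect the set of documents each user mentioned (duplicates collapse).
    docs_by_user = defaultdict(set)
    for user, doc in mentions:
        docs_by_user[user].add(doc)

    # Every pair of documents shared by one user gains one unit of strength.
    strength = defaultdict(int)
    for docs in docs_by_user.values():
        for doc_a, doc_b in combinations(sorted(docs), 2):
            strength[(doc_a, doc_b)] += 1
    return dict(strength)


# Hypothetical sample data: three users, three documents.
mentions = [
    ("u1", "docA"), ("u1", "docB"),
    ("u2", "docA"), ("u2", "docB"), ("u2", "docC"),
    ("u3", "docC"),
    ("u1", "docA"),  # repeated mention by the same user is counted once
]
coupling = tweet_coupling(mentions)
print(coupling)
```

The resulting pair weights form an edge list that a network clustering tool could take as input. Swapping users for cited references in the same function would give a simple Bibliographic Coupling count for comparison.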
Notes
The data and code to reproduce or extend this work are available at the following URL: https://github.com/slab-itu/tweet_coupling.
Acknowledgements
The authors (Saeed-Ul Hassan & Mudassir Shabbir) were funded by the CIPL (National Center in Big Data and Cloud Computing (NCBC)) grant, received from the Planning Commission of Pakistan through the Higher Education Commission (HEC) of Pakistan. This work was partially supported by the Spanish Ministry of Science and Technology under the projects TIN2017-89517-P and TIN2017-83445-P. Eugenio Martínez Cámara was supported by the Spanish Government Programme Juan de la Cierva Incorporación (IJC2018-036092-I).
Cite this article
Hassan, SU., Aljohani, N.R., Shabbir, M. et al. Tweet Coupling: a social media methodology for clustering scientific publications. Scientometrics 124, 973–991 (2020). https://doi.org/10.1007/s11192-020-03499-1
DOI: https://doi.org/10.1007/s11192-020-03499-1