Tweet Coupling: a social media methodology for clustering scientific publications

Scientometrics

Abstract

We argue that classic citation-based approaches to clustering scientific documents, such as co-citation and Bibliographic Coupling, fail to leverage the social usage of the scientific literature that originates on online information dissemination platforms such as Twitter. In this paper, we present Tweet Coupling, a methodology that measures the similarity between two or more scientific documents when one or more Twitter users mention them in their tweets. We evaluate our proposal on an altmetric dataset of 3081 scientific documents and 8299 unique Twitter users. By applying both Bibliographic Coupling and Tweet Coupling as clustering approaches, we identify the relationship between bibliographically coupled and tweet-coupled scientific documents. Further, using VOSviewer, we show empirically that Tweet Coupling appears to be the better methodology for generating cohesive clusters: it groups similar documents from the subfields of the selected field, whereas Bibliographic Coupling groups cross-disciplinary documents in the same cluster.
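To make the idea concrete, the following is a minimal sketch of how a tweet-coupling strength could be computed. It assumes that the strength between two documents is the number of distinct Twitter users who mentioned both; the abstract does not specify the exact weighting or normalisation, so this counting rule, the function name `tweet_coupling`, and the toy data are illustrative assumptions rather than the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def tweet_coupling(mentions):
    """Count, for every pair of documents, the distinct users who tweeted both.

    `mentions` is an iterable of (user_id, document_id) pairs.
    Assumption: coupling strength = number of shared users (not from the paper).
    """
    # Collect the set of documents each user has mentioned; using a set means
    # repeated mentions of one document by the same user count only once.
    docs_by_user = defaultdict(set)
    for user, doc in mentions:
        docs_by_user[user].add(doc)

    # Every user who mentioned two documents adds 1 to that pair's strength.
    strength = defaultdict(int)
    for docs in docs_by_user.values():
        for a, b in combinations(sorted(docs), 2):
            strength[(a, b)] += 1
    return dict(strength)


# Toy data, invented for illustration only.
toy_mentions = [
    ("u1", "docA"), ("u1", "docB"),
    ("u2", "docA"), ("u2", "docB"), ("u2", "docC"),
    ("u3", "docC"),
    ("u1", "docA"),  # repeated mention, ignored by the set
]
coupling = tweet_coupling(toy_mentions)
for pair, weight in sorted(coupling.items()):
    print(pair, weight)
```

The resulting weighted pairs form an edge list that a network clustering tool could take as input. Feeding the same function (cited reference, citing document) pairs instead of (user, document) pairs would yield a comparable shared-reference count in the spirit of Bibliographic Coupling.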

Notes

  1. https://www.altmetric.com/.

  2. The data and code to reproduce or extend this work are available at the following URL: https://github.com/slab-itu/tweet_coupling.

  3. http://www.vosviewer.com.

Acknowledgements

The authors (Saeed-Ul Hassan & Mudassir Shabbir) were funded by the CIPL (National Center in Big Data and Cloud Computing (NCBC)) grant, received from the Planning Commission of Pakistan through the Higher Education Commission (HEC) of Pakistan. This work was partially supported by the Spanish Ministry of Science and Technology under the projects TIN2017-89517-P and TIN2017-83445-P. Eugenio Martínez Cámara was supported by the Spanish Government Programme Juan de la Cierva Incorporación (IJC2018-036092-I).

Author information

Corresponding author

Correspondence to Eugenio Martínez-Cámara.

About this article

Cite this article

Hassan, SU., Aljohani, N.R., Shabbir, M. et al. Tweet Coupling: a social media methodology for clustering scientific publications. Scientometrics 124, 973–991 (2020). https://doi.org/10.1007/s11192-020-03499-1

  • DOI: https://doi.org/10.1007/s11192-020-03499-1