Bi-layer network analytics: A methodology for characterizing emerging general-purpose technologies
Introduction
Theoretical definitions of general-purpose technologies (GPTs) can be traced back to Paul David, who coined the idea from his observations of the widespread impact of electric dynamos on the productivity of the United States (US) during the 1920s (David, 1989). In the decade to follow, a well-recognized conception of what constitutes a GPT took hold, namely “pervasiveness, inherent potential for technical improvements, and innovational complementarities” (Bresnahan & Trajtenberg, 1995), and measuring the various aspects of GPTs became a topic of increasing interest for many economists. One indicator in particular, generality, has been widely applied (Hall & Trajtenberg, 2004), and its standard calculations consider patent citations and their technological classes. In the 2000s, however, the attention of the science, technology, and innovation (ST&I) community turned to the unique features of nanotechnologies, triggering discussions on the notion of emerging GPTs (EGPTs), i.e., emerging technologies equipped with the features of GPTs (Graham & Iacopetta, 2009; Youtie et al., 2008). Although much research has been undertaken to define what classifies a technology as emergent, characterizing the “general purpose” component of an EGPT has proven to be far more difficult. Of the few studies specific to EGPTs, all assume emergence before testing constructs like generality, e.g., Schultz and Joutz (2010). Conversely, of the studies that primarily explore emergence, Rotolo et al.’s (2015) systematic review defined five attributes of emerging technologies, theoretically guiding the development of further measurements. This sustained interest from the ST&I community underscores the potential significance of identifying EGPTs: foreseeing candidate EGPTs may help policymakers and technology managers take pre-emptive action in strategic planning and R&D management (e.g., funding/investment allocation) and prepare for future global competition.
This crucial role and value of EGPTs motivate our research. However, to the best of our knowledge, very few studies have attempted to directly measure and classify emergence and generality at the same time. This, coupled with the lack of a quantitative measure for generality that does not solely depend on patents and citations, inspired us to establish a cohesive system of quantitative measures for identifying EGPTs that detects both emergence and generality.
Sharing a close interest with ST&I studies, bibliometrics is well recognized as a tool for supporting technology analysis and assessment. For example, it has been used to profile various technological areas (Chakraborty et al., 2015; Guo et al., 2010), identify emerging topics in science and technology (Glänzel & Thijs, 2012; Small et al., 2014), and track the pathways of technological change (Hou et al., 2018; Zhang et al., 2016; Zhou et al., 2014). More recently, the use of advanced data analytic technologies, such as topic models, streaming data analytics, and machine learning, has massively increased the amount of data that traditional bibliometric methods can process (Ding & Chen, 2014; Klavans & Boyack, 2017). These technologies have also brought the ability to reveal hidden relationships (Zhang et al., 2018; Zhang et al., 2017b) and visualize complicated technological portfolios and innovation networks in highly interpretable ways (Börner et al., 2012; Suominen & Toivanen, 2016).
From a technical point of view, even though network analytics has long been a mainstay of social science (Borgatti et al., 2009), it was only introduced to bibliometric studies in the late 2000s. Originally used as a method for investigating research collaborations and the interactions between disciplines through bibliographic couplings (Yan et al., 2009; Yang et al., 2010), it has subsequently been combined with citation analysis to identify emerging topics and evaluate research impacts (Takeda & Kajikawa, 2009; Yan, 2015). Network analytics has also been explored for its ability to predict emerging technologies (Érdi et al., 2013) and to reveal hidden technological opportunities (Park & Yoon, 2018).
Yet, even with these techniques, developing a bibliometric model to identify EGPTs is still highly challenging. First, bibliometric models emphasize the use of historical data and have a natural connection with citation statistics, and are thus well suited to measuring generality. However, recent rapid developments in natural language processing are relaxing the field's dependence on patent archives and tempering this philosophy of past as prologue, which may allow insights to be drawn directly from the semantics of the subject matter. Second, despite keen interest and many pilot studies on measuring and forecasting technical emergence (Carley et al., 2018), current bibliometric models are still falling short of truly “characterizing the potential of what is detected to be emerging” (Rotolo et al., 2015). Balancing generality with emergence to comprehensively characterize EGPTs further increases this challenge. Third, the bibliometric community is recognizing the benefits of link prediction as a way of identifying the likely technologies of tomorrow, but applying those methods to bibliographical information is not yet seamless. For example, theoretically mapping the key attributes of emerging technologies to the topological indicators of a bibliometric network can be problematic. Similarly, integrating heterogeneous bibliographical information into a single network so as to discover social impacts in addition to technological transitions remains an open problem.
Aiming to address these concerns, we propose a methodology based on bi-layer network analytics to quantitatively identify EGPTs. The methodology begins with the construction of a co-term network (the first layer) and a co-authorship network (the second layer). The two layers are then integrated into a bi-layer network that reflects both the substance of the technologies (i.e., terms) and the social entities (i.e., authors) engaged in their associated R&D. Typically, a traditional bibliometric network only reflects one indicator, e.g., term co-occurrence or co-authorship. The proposed bi-layer network charts both, offering a bibliometric solution that reveals not only the impact of key technologies, but also the authors and collaborative networks that are advancing these technologies. Integrating all this information into one analysis provides a novel perspective from which to draw comprehensive new insights.
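To make the bi-layer structure concrete, the following minimal Python sketch builds the three kinds of links from a handful of toy records. The record format, the function name `build_bilayer`, and the use of raw co-occurrence counts as edge weights are illustrative assumptions, not the pre-processing or weighting actually used in the study.

```python
from collections import Counter
from itertools import combinations

# Toy records (hypothetical): each article lists its authors and extracted terms.
records = [
    {"authors": ["A", "B"], "terms": ["topic model", "citation"]},
    {"authors": ["B", "C"], "terms": ["citation", "link prediction"]},
    {"authors": ["A"], "terms": ["topic model", "link prediction"]},
]

def build_bilayer(records):
    """Count co-term, co-authorship, and author-term links (assumed weighting: raw counts)."""
    co_term = Counter()      # first layer: term-term co-occurrence
    co_author = Counter()    # second layer: author-author co-authorship
    author_term = Counter()  # inter-layer links: author-term co-occurrence
    for rec in records:
        authors = sorted(set(rec["authors"]))
        terms = sorted(set(rec["terms"]))
        # Sorted inputs give each undirected pair a single canonical key.
        for pair in combinations(terms, 2):
            co_term[pair] += 1
        for pair in combinations(authors, 2):
            co_author[pair] += 1
        for author in authors:
            for term in terms:
                author_term[(author, term)] += 1
    return co_term, co_author, author_term

co_term, co_author, author_term = build_bilayer(records)
```

In this sketch the two layers and the inter-layer links are kept in separate counters, so layer-specific statistics remain available after the layers are merged into one graph.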
To fully leverage this perspective, we adapted the five attributes of emerging technologies defined by Rotolo et al. (2015) into three new indicators capable of quantifying the topological structures in a bi-layer network, namely fundamentality, speciality, and sociality. Interestingly, among Rotolo et al.’s quintet of attributes, prominent impact is of particular interest to our endeavors. From their literature review, Rotolo et al. (2015) found that most scholars conceive of prominent impact as a force “exerted on the entire socio-economic system” – a concept, they add, that “comes very close to that of ‘general-purpose technologies’”. Discontented with the sweeping nature of this definition for the purposes of defining emergence, the authors proposed a more utilitarian version which acknowledges that an emerging technology's impact may be limited to one or a few domains. Thus, the intriguing argument was made that if we can measure prominent impact, we can measure generality as well. In part, this notion inspired the tripartite design of the above indicators.
With the network constructed and the topological structures measured, candidate future innovations are identified with a refined link prediction algorithm, using a weighted index of resource allocation. The algorithm considers the links both within each network layer, i.e., co-term and co-authorship links, and between layers, i.e., author-term links. Whether or not a link is predicted is based on the weighted index, which is an amalgamation of frequency statistics, including term co-occurrence, co-authorships, and author-term co-occurrence. Ultimately, the differences between the current network and the predicted network are the key to forecasting technological changes and, of course, which technologies are most likely to be EGPTs in the near future.
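As a rough illustration of link prediction by resource allocation, the sketch below scores each unlinked node pair with a commonly used weighted variant of the index: for every common neighbor z of x and y, it adds (w(x,z) + w(z,y)) / s(z), where s(z) is the total weight of links incident to z. This formula, the function name `weighted_ra_scores`, and the toy edge list are assumptions for illustration; the refined index used in the study may combine the frequency statistics differently.

```python
from collections import defaultdict

def weighted_ra_scores(edges):
    """Score unlinked pairs with an assumed weighted resource allocation index."""
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = w
        adj[v][u] = w
    # Node strength: total weight of links incident to the node.
    strength = {node: sum(nbrs.values()) for node, nbrs in adj.items()}
    nodes = sorted(adj)
    scores = {}
    for i, x in enumerate(nodes):
        for y in nodes[i + 1:]:
            if y in adj[x]:
                continue  # already linked, nothing to predict
            common = adj[x].keys() & adj[y].keys()
            if common:
                scores[(x, y)] = sum(
                    (adj[x][z] + adj[z][y]) / strength[z] for z in common
                )
    return scores

# Toy bi-layer edge list (hypothetical): "t:" marks terms, "a:" marks authors.
edges = {
    ("t:citation", "t:topic model"): 2,
    ("t:citation", "t:link prediction"): 1,
    ("a:B", "t:citation"): 2,
    ("a:B", "t:link prediction"): 1,
}
scores = weighted_ra_scores(edges)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Because intra-layer and inter-layer links sit in one edge list, a high-scoring pair may be a new term-term, author-author, or author-term link; comparing the network with and without the top-ranked pairs mirrors the current-versus-predicted comparison described above.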
A case study on 17,445 articles published in 15 journals and conference proceedings on information science between 1 Jan 1996 and 31 Dec 2018 demonstrates the feasibility and reliability of the method. Additionally, the empirical insights derived from the study should provide decision support to researchers and policymakers in information science disciplines.
The rest of this paper is organized as follows: Section 2 reviews previous studies on bibliometrics for analyzing emerging technologies, network analytics with bibliometric indicators, and theoretical discussion on characterizing EGPTs from a bibliometric perspective. In Section 3, we outline the research framework of the study and introduce the proposed methodology. Section 4 follows, presenting the data, results, validation measurements, and empirical insights derived from the case study. The article concludes in Section 5 with a discussion on the technical and practical implications of our findings, the limitations of the study, and possible future directions of research.
Literature review and theoretical background
As a tool for analyzing emerging technologies, bibliometrics has attracted common interest from the bibliometrics and ST&I communities. Further, the rising enthusiasm for social network analysis is solidifying the merit of bibliometrics in both breadth and depth. Therefore, what follows is a review of how bibliometrics has been used to analyze emerging technologies and an overview of network analytics and its bibliometric indicators. Furthermore, we discuss the theories and concepts of GPTs and
Methodology
An overview of the EGPT framework is given in Fig. 2. As illustrated, the methodology involves three key phases: data and pre-processing, bi-layer network analytics, and validation measurements.
Results
Our chosen discipline for this case study is information science. The choice to analyze one discipline may seem unusual given that we are, at least in part, testing the methodology's efficacy at predicting technologies that span a broad range of disciplines. Our reasoning here is that information science has, for some time, been a spearhead for cross-disciplinary research. Information science connects fundamental studies, such as mathematics, physics, and computer science, with the real-world
Discussion and conclusions
In this paper, we presented a methodology for characterizing EGPTs based on bi-layer network analytics. We defined three indicators to quantify the impact of EGPTs and applied a refined link prediction approach based on weighted resource allocation to reveal emergence. A comparison between the ranked terms in the current network and the predicted network reveals candidate EGPTs for further analysis by experts and/or an empirical review of the literature. We incorporated both types of analyses
Author contributions
Yi Zhang: Conceived and designed the analysis, Collected the data, Contributed data or analysis tools, Performed the analysis, Wrote the paper. Mengjia Wu: Conceived and designed the analysis, Collected the data, Contributed data or analysis tools, Performed the analysis, Wrote the paper. Wen Miao: Conceived and designed the analysis, Collected the data, Contributed data or analysis tools, Wrote the paper. Lu Huang: Conceived and designed the analysis, Wrote the paper. Jie Lu: Conceived and
Acknowledgments
An early version of this work was published in the Proceedings of the 2019 International Conference of the International Society for Scientometrics and Informetrics (Zhang et al., 2019).
This work was supported by the Australian Research Council under Discovery Early Career Researcher Award DE190100994 and the National Natural Science Foundation of China under Grant 71774013.
References (75)
- et al. Network position and tourism firms' co-branding practice. Journal of Business Research (2015).
- Patents and the measurement of technological change: A survey of the literature. Research Policy (1987).
- et al. General purpose technologies “Engines of growth”? Journal of Econometrics (1995).
- et al. Early detection of valuable patents using a deep learning model: Case of semiconductor industry. Technological Forecasting and Social Change (2020).
- Scientific collaboration and endorsement: Network analysis of coauthorship and citation networks. Journal of Informetrics (2011).
- An introduction to ROC analysis. Pattern Recognition Letters (2006).
- Centrality in social networks conceptual clarification. Social Networks (1978).
- et al. Identifying the evolutionary process of emerging technologies: A chronological network analysis of World Wide Web conference sessions. Technological Forecasting and Social Change (2015).
- et al. Subject–action–object-based morphology analysis for determining the direction of technological change. Technological Forecasting and Social Change (2016).
- et al. General purpose technologies (2005).