A temporal graph framework for intelligence extraction in social media networks

https://doi.org/10.1016/j.im.2023.103773

Abstract

Social media networks (SMNs) are increasingly used in professional management of knowledge workers and related assets. However, the factors affecting behavioral trends and activity levels in these networks are not well understood. Although social and cognitive theories can help to explain human behavior in traditional social networks, their application to SMNs has not been validated. Traditional social network modeling techniques may not accurately predict real-world SMN activities. This research developed a temporal graph framework for intelligence extraction in SMNs. Theory-based, data-driven models (Conformity Model (COM), Recency-Primacy Model (REM), Trend Interaction Model (TIM), Periodic Interaction Model (PIM)) were developed based on the framework to capture various aspects of user behavior: conformity effect, recency, primacy, periodicity, and dynamic trend. The models capture the activity history and dynamically combine pricing information to enhance predictive accuracy. Using data of 83,536 GitHub software repositories on cryptocurrency, this article reports the results of experiments that compare the models’ performance in predicting SMN activities over time. Experimental results show that the model (REM) that captures recency/primacy effects of human cognitive processing outperformed other models in 9 (out of 18) measures pertaining to engagement, contribution, influence, and popularity. Primacy plays a dominant role in predicting engagement, contribution, and popularity, whereas recency plays a key role in predicting influence. Short-term trend (modeled with TIM) was found to yield significantly better performance on predicting user contribution. The models also outperformed an integrated machine learning (IML) model by most measures. Overall, the effects modeled by REM and TIM were found to be more significant than the effects modeled by COM, PIM, and IML. 
The research contributes to enhancing understanding of SMN behavior, developing new models to simulate and predict SMN activities, and designing new artifacts for information systems practitioners to manage knowledge assets and to extract SMN intelligence.

Introduction

Flexibility and widespread adoption have made social media (SM) a powerful means of communication and collaboration in modern organizations [1]. Businesses increasingly use social media networks (SMNs) to support relationship development and network tie formation between knowledge workers and assets [2], such as software, patents, and expertise. An SMN is a digitized network of entities that represent corresponding entities connected socially in the physical world [2], [3]. For example, software developers discuss program code issues and vulnerabilities in SMNs (e.g., on GitHub) to facilitate timely resolution. Marketers promote their brands in SMNs (e.g., on Facebook) to increase customer recruitment and loyalty. Scientists participate in SMNs (e.g., on ResearchGate) to learn about innovations and patents. These SMNs produce tremendous benefits to organizations and economies [4] and are estimated to create more than $1 trillion in value because of enhanced communication and improved collaboration [1]. As more knowledge workers engage in SMNs, it is instructive to identify and predict activity trends from these networks for various applications such as enhancing social cybersecurity [5] and connecting heterogeneous expertise [6]. In the software development profession, SM has been found to enable much faster identification of software vulnerabilities (nearly 90 days earlier) than the National Vulnerability Database (NVD) maintained by the US government [7]. Despite this, accurately extracting intelligence from complex SMNs that evolve over time remains difficult, which challenges the management and protection of valuable knowledge assets.

SMNs allow knowledge workers to interact with each other by collaborating on related knowledge assets over time. The term “knowledge asset” encompasses companies’ core competencies, areas of expertise, and deep pools of talents [6]. Increased diffusion of knowledge assets due to SMN usage is challenging the management of these assets. For example, GitHub SMNs are widely used to support collaborative development of software for financial transactions using cryptocurrency, which relies on decentralized computer software programs to manage transactions and issue money. The software validates a public list of records (known as “blockchain”) to ensure security. In recent years, interest in cryptocurrency software has increased dramatically [8], [9] (e.g., Bitcoin was estimated to account for $76 billion of illegal transactions (or 46% of all its transactions) per year [10]). Prices of cryptocurrencies can vary dramatically because of illegal activities (e.g., hacking, theft) and shifting investor sentiment [11], [12]. As developers of cryptocurrency software interact in SMNs [9], identifying activity trends and extracting intelligence from the online community can be difficult, challenging the state-of-the-art technology [13]. These trends and intelligence could help managers to spot disparate areas of expertise and enhance cybersecurity. The GitHub SMNs consist of user-repository connections, in which the knowledge assets are the know-how. Other examples of using SMNs in managing knowledge assets (and the type of ties) include organizational expertise networks (expert-topic), talent networks (people-job), and online review networks (item-review).
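To make the idea of user-repository connections observed over time concrete, the following is a minimal sketch rather than the authors' implementation. It groups event records into daily snapshots of a two-mode (user, repository) network. The event tuples, the user and repository names, and the function name build_snapshots are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records: (day, user, repository, event type).
# None of these names come from the article's dataset.
events = [
    (date(2018, 1, 1), "alice", "repo_wallet", "push"),
    (date(2018, 1, 1), "bob", "repo_wallet", "fork"),
    (date(2018, 1, 2), "alice", "repo_miner", "issue"),
    (date(2018, 1, 2), "alice", "repo_wallet", "push"),
]


def build_snapshots(events):
    """Group events into daily snapshots of user-repository edge counts."""
    snapshots = defaultdict(lambda: defaultdict(int))
    for day, user, repo, _kind in events:
        # Each (user, repo) pair is one edge; the count is its weight that day.
        snapshots[day][(user, repo)] += 1
    # Return plain dicts, ordered chronologically by day.
    return {day: dict(edges) for day, edges in sorted(snapshots.items())}


snapshots = build_snapshots(events)
for day, edges in snapshots.items():
    print(day, edges)
```

Each snapshot is a weighted edge list for one day, so a sequence of snapshots is one simple way to represent a network whose nodes and ties change over time.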

Traditional research on SMNs focuses primarily on interpersonal interactions [4] in which each node is assumed to be a human user and each link is a tie between two users. While this assumption conforms with the dominant theoretical lens for information systems (IS) research [14], it excludes the possibility of non-human nodes and ties connecting a non-human agent delegated or created by human users. Recent advances in IS and artificial intelligence (AI) have greatly facilitated human capabilities to manage knowledge assets by overcoming this “human-centric” assumption, allowing managers and developers to delegate tasks to non-human agents in SMNs. Inspired by the agent interaction theories and the IS delegation theoretical framework [15], we consider an SMN to consist of rich and dynamic interactions of participating human and non-human entities. This view allows a more flexible modeling of SMNs and supports the development of a novel framework for managing knowledge assets, as explained in the following sections.

In addition, some prior studies in social network analysis assume the number and properties of network nodes to be static (i.e., unchanged) over time [16]. By contrast, real-world SMNs seldom follow this assumption. Human behavior is often affected by the timeliness of information, through effects such as recency, primacy, trends, and periodicity, whose behavioral effect on SMNs is not well understood. This lack of understanding makes it difficult to use knowledge assets strategically or identify cyber threats effectively (e.g., [17]). Moreover, existing models lack the capability to accurately represent evolving SMNs (e.g., data are not modeled as evolving SMNs over time [18], [19]), such as those of cryptocurrency software development communities. Extant theories, though useful in traditional social settings, are not widely used in modeling dynamic SMNs [20].

This research developed a theoretically grounded temporal graph framework for extracting cyber intelligence in dynamic SMNs. We define “cyber intelligence” as the result of “acquisition, interpretation, collation, assessment, and exploitation of information” [21], [22] in SM and SMNs. The framework provides theoretically-based constructs to model collective behavior and temporal effects in SMNs, and supports flexible specification of reference history to learn from past behavioral data. Based on the framework, four models were developed to represent human behavior of conforming to collective norms and temporal effects of information (recency and primacy, trends, and periodicity). The models were used to predict and simulate the activities of cryptocurrency software SMNs extracted from GitHub cryptocurrency software repositories. Activity history of the SMNs and pricing information of the related cryptocurrencies are combined dynamically to enhance the accuracy of prediction and fidelity of simulation. The research seeks to answer these questions: (1) What are the factors affecting SMN behavior? (2) How can these factors be modeled to explain and predict SMN behavior? (3) How do the models perform in predicting SMN behavior of a cryptocurrency software development community?
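As a rough illustration of how recency and primacy could enter a prediction from a reference history, the sketch below weights past activity counts with an exponential decay. This is an assumption-laden toy, not the REM described in the article: the weighting scheme, the decay parameter, and the function names history_weights and predict_next are all hypothetical, and pricing information is not included.

```python
def history_weights(n, effect="recency", decay=0.5):
    """Return normalized weights over n past periods, ordered oldest first.

    "recency" gives the largest weight to the newest period;
    "primacy" gives the largest weight to the oldest period.
    The exponential form is an illustrative assumption.
    """
    if n < 1:
        raise ValueError("history must contain at least one period")
    if effect == "recency":
        raw = [decay ** (n - 1 - i) for i in range(n)]
    elif effect == "primacy":
        raw = [decay ** i for i in range(n)]
    else:
        raise ValueError(f"unknown effect: {effect}")
    total = sum(raw)
    return [w / total for w in raw]


def predict_next(history, effect="recency", decay=0.5):
    """Predict next-period activity as a weighted average of the history."""
    weights = history_weights(len(history), effect, decay)
    return sum(w * x for w, x in zip(weights, history))


# Hypothetical activity counts for three past periods, oldest first.
history = [10, 20, 40]
print(predict_next(history, effect="recency"))
print(predict_next(history, effect="primacy"))
```

With a rising history such as this one, the recency-weighted estimate is pulled toward the latest count (about 30) and the primacy-weighted estimate toward the earliest (about 17.1), which shows how the choice of temporal effect changes the prediction.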

In our experiments, the models were used to predict and simulate the SMN activities of 83,536 repositories related to three cryptocurrencies: Bitcoin, Monero, and Ethereum. Each model’s predictions were compared against the ground truth (actual data) to calculate various performance measures and metrics pertaining to contribution, influence, popularity, and engagement. The measures were then compared across 18 configurations of the models to provide insights into their performance. We believe the research contributes to (1) developing a new framework and models of SMN behavior, (2) providing a new application and a three-step process of model deployment for diverse domains such as cybersecurity and knowledge management (KM), (3) increasing situation awareness on cryptocurrency software SMNs, and (4) providing a new example of SMN research in the IS field.
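The comparison of predictions against ground truth across model configurations can be sketched as follows. This excerpt does not list the article's actual performance measures, so root-mean-square error (RMSE) is used here purely as a stand-in; the configuration names and all numbers are made up for illustration.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# Hypothetical ground-truth activity counts for four periods.
ground_truth = [12, 15, 11, 18]

# Hypothetical predictions from two model configurations.
predictions = {
    "config_a": [10, 15, 12, 20],
    "config_b": [12, 14, 11, 17],
}

# Score every configuration against the same ground truth.
scores = {name: rmse(pred, ground_truth) for name, pred in predictions.items()}

# Lower error is better, so the best configuration minimizes the score.
best = min(scores, key=scores.get)
print(scores, best)
```

In a study with several measures, the same loop would be repeated per measure, and a configuration could then be counted by the number of measures on which it ranks first.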

Literature Review

Dynamic social networks have been studied in diverse disciplines, including mathematics [23], physics [24], computer science [25], [26], IS [4], [27], statistics [28], sociology [29], and economics [30], among others. Various theories and methods have been developed to facilitate understanding and prediction of collective behavior, network dynamics, nodal behavior, and link structure. The prevalence of SM promotes further interest in the modeling of human behavior in dynamic SMNs, prompting the

A Temporal Graph Framework

This research developed a temporal graph framework for modeling dynamic social networks and for predicting future SMN trends and activities, with a goal to enhance the management and security of knowledge assets in SMNs. Developed based on a design science paradigm [80], the framework incorporates all the core components (influence, collective, network, and time) identified in the literature review to build predictive models of dynamic SMN behavior. Emphasis is placed on modeling networks as

Experiments on SMN Prediction

This section describes the experiments conducted to study the effectiveness of the models on predicting and simulating cryptocurrency software SMN activities. The following provides the experimental goals and design, evaluation methods, performance measures, and experimental findings.

Experimental Findings

This section reports the results of evaluating the models’ performance in accurately predicting cryptocurrency software SMN activities and discusses the implication of the findings. Overall, REM model variants achieved the best performance in 9 (out of 18) measures. The following sections provide detailed results that are grouped into four social aspects explained above.

Conclusion

A powerful and popular tool for managing professional and virtual work, SMNs enable the formation of complex networks of knowledge workers and related assets, thereby facilitating understanding of human behavior in virtual organizations. For example, modeling the dynamic SMNs of cryptocurrency software may provide cyber intelligence and increase situation awareness. However, the factors affecting behavioral trends and activity levels in these networks are not well understood. This research

CRediT authorship contribution statement

Wingyan Chung: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Project administration, Visualization, Supervision, Funding acquisition. Vincent S. Lai: Conceptualization, Methodology, Resources, Funding acquisition, Writing – review & editing, Project administration.

Acknowledgements

The authors thank the journal editors and reviewers for their valuable comments and input.

References (95)

  • M. Ihrig et al. Managing your mission-critical knowledge. Harvard Business Review (2015)
  • P. Shrestha et al. Multiple social platforms reveal actionable signals for software vulnerability awareness: A study of Github, Twitter and Reddit. PLOS ONE (2020)
  • A. Cruysheer. Bitcoin: A Look at the Past and the Future (2015)
  • B. Rudisail. Hacking activities increase along with cryptocurrency pricing (2/26/2018). techopedia (2018)
  • S. Foley et al. Sex, drugs, and bitcoin: How much illegal activity is financed through cryptocurrencies? The Review of Financial Studies (2019)
  • P. Vigna et al. Bitcoin sinks after exchange reports hack (Aug. 2, 2016). Wall Street Journal (2016)
  • R. Browne, K. Rooney, Bitcoin jumps above $8,200, adding to cryptocurrency’s recovery this week (7/24/2018)(2018)....
  • D. Tayouri. Social media as an intelligence goldmine. Cyber Security Review (2016)
  • A. Burton-Jones, M. Stein, A. Mishra, MIS Quarterly Curation: IS Use,...
  • A. Baird et al. The next generation of research on IS use: a theoretical framework of delegation to and from agentic IS artifacts. MIS Quarterly (2021)
  • P. Wang et al. Link prediction in social networks: the state-of-the-art. Science China-Information Sciences (2015)
  • D. Liebenberg et al. The Illicit Cryptocurrency Mining Threat. Technical Report (2018)
  • F. Mai et al. How does social media impact bitcoin value? A test of the silent majority hypothesis. Journal of Management Information Systems (2018)
  • D. Garcia et al. The digital traces of bubbles: feedback cycles between socio-economic signals in the bitcoin economy. Journal of the Royal Society Interface (2014)
  • C. Stage. The online crowd: A contradiction in terms? On the potentials of Gustave Le Bon’s crowd psychology in an analysis of affective blogging. Distinktion: Journal of Social Theory (2013)
  • P. Davies. Intelligence, information technology, and information warfare (2002)
  • W. Chung et al. A visual framework for knowledge discovery on the web: An empirical study on business intelligence exploration. Journal of Management Information Systems (2005)
  • P. Erdős et al. On random graphs. Publicationes Mathematicae (1959)
  • P. Holme et al. Temporal networks. Physics Reports (2012)
  • Y. Yasami et al. A statistical infinite feature cascade-based approach to anomaly detection for dynamic social networks. Computer Communications (2017)
  • W. Wei et al. Measuring temporal patterns in dynamic social networks. ACM Transactions on Knowledge Discovery from Data (2015)
  • E. Otte et al. Social network analysis: a powerful strategy, also for the information sciences. Journal of Information Science (2002)
  • T.A.B. Snijders et al. Introduction to stochastic actor-based models for network dynamics. Social Networks (2010)
  • M.O. Jackson. Social and Economic Networks (2008)
  • S. Reicher. The psychology of crowd dynamics. Blackwell Handbook of Social Psychology: Group processes (2001)
  • G. Le Bon. The crowd: A study of the popular mind (1895)
  • R. Turner et al. Collective Behavior (1957)
  • F. Gino et al. Contagion and differentiation in unethical behavior: The effect of one bad apple on the barrel. Psychological Science (2009)
  • M. McPherson et al. Birds of a feather: Homophily in social networks. Annual Review of Sociology (2001)
  • B. Latané. The psychology of social impact. American Psychologist (1981)
  • C. Sedikides et al. Social impact theory: A field test of source strength, source immediacy and number of targets. Basic and Applied Social Psychology (1990)
  • G.S. Becker. A theory of social interactions. Journal of Political Economy (1974)
  • R.M. Emerson. Social exchange theory. Annual Review of Sociology (1976)
  • F. Heider. The psychology of interpersonal relations (1958)
  • Q. Li et al. Real-time novel event detection from social media. IEEE 33rd International Conference on Data Engineering (2017)
  • C.I. Hovland. The order of presentation in persuasion (1957)
  • N. Miller et al. Recency and primacy in persuasion as a function of the timing of speeches and measurements. The Journal of Abnormal and Social Psychology (1959)

Dr. Wingyan Chung is Professor of Computer Science in Soules College of Business at The University of Texas at Tyler. His scholarly interests and expertise include business analytics, machine learning, social media analytics, cybersecurity, network science, data science, knowledge management, and human-computer interaction. He has published extensively in IS, CS, and scientific journals, including Journal of Management Information Systems, Information & Management, ACM Transactions on MIS, Communications of the ACM, IEEE Computer, Decision Support Systems, and Scientific Reports, among others.

Dr. Vincent S. Lai is an Emeritus Professor of MIS at The Chinese University of Hong Kong. His current research focuses on innovation assimilation, online auctions, technology imitation, virtual collaboration, and global IS strategy. His findings in these research areas have been published extensively in IS journals, including Journal of Management Information Systems, Information & Management, Communications of the ACM, Decision Support Systems, Decision Sciences, European Journal of Information Systems, and IEEE Transactions on Engineering Management, among others.