Combining advanced computational social science and graph theoretic techniques to reveal adversarial information operations

https://doi.org/10.1016/j.ipm.2020.102385

Highlights

  • Integrating traditional centrality and spectral modularity methods to explore intensive focal structures in social networks.

  • Applying a decomposition optimization method to maximize the individual centrality measure and the network-level modularity value in complex social networks.

  • Measuring the influence generated by each focal structure on the individual level and the network level.

  • Applying the DCFM to measure the power generated by each focal structure.

  • Applying the Newman-Girvan modularity method and the depth-first search and linear graph algorithm to validate our results.

  • Evaluating model performance by applying the model to different social and real-world networks.

  • Implementing a toy example as a complexity analysis, and a real-world Twitter network.

  • Applying different centrality methods (degree, betweenness, closeness, eigenvector) and comparing their results.

Abstract

Social media has influenced socio-political aspects of many societies around the world. It is an effortless way for people to enhance their communication, connect with like-minded people, and share ideas. Online social networks (OSNs) can be used for noble causes by bringing together communities with shared interests and promoting awareness of various causes. However, there is a dark side to the use of OSNs. OSNs can also be used as a coordination and amplification platform for attacks. For instance, adversaries can increase the impact of an attack by using OSNs to promote it and cause panic in an area. Public data can help adversaries determine the best timing for attacks, schedule them, and then use OSNs to coordinate attacks on networks or physical locations. This convergence of the cyber and physical worlds is known as cybernetics. In this paper, we introduce an integrated method to identify malicious behavior and the actors responsible for propagating this behavior via online social networks. Throughout history, surveillance techniques have been used to monitor negative behavior, activities, and information. Quantitative socio-technical methods such as deviant cyber flash mob (DCFM) detection and focal structure analysis (FSA) can provide reconnaissance capabilities that enable cities and governments to look beyond internal data and identify threats based on active events. Groups of powerful hackers can be identified through FSA, an integrated model that uses a betweenness centrality method at the node level and spectral modularity at the group level to identify a hidden, malicious, and powerful focal structure (a subset of the network). Assessment of groups using DCFM methods can help to identify powerful actors and prevent attacks. In this study, we examine multiple data sets, integrating the DCFM and FSA models to give cybersecurity experts a clearer picture of the threat and help them plan a better response.

Introduction

Social media is characterized as one of the most powerful engines for online interaction and information exchange (Shu, Sliva, Wang, Tang & Liu, 2017), giving access to millions of people on social media platforms (Dale, 2017). People use online social network (OSN) platforms such as Facebook, Twitter, and Instagram to communicate with their relatives, friends, and co-workers and to share ideas, information, and daily activities. Online social networks are also used in large cities to monitor traffic congestion, deliver online training sessions, and enhance public services such as reporting broken water lines, hazardous road conditions, and other governmental service enhancements (Lorenzi et al., 2013). However, since OSN platforms are easy to use and offer free access to millions of people, this environment has also reshaped the lenses through which these platforms are viewed and given birth to new dark information operations such as fake news, misinformation, disinformation, online anti-government campaigns, anti-corruption campaigns, and political election campaigns.

Today, Facebook, Twitter, Instagram, and other social platforms are common tools for disseminating conspiracy theories, spreading radical ideology, organizing cyber flash mobs, and carrying out many other hateful actions. In addition, many malicious groups and users are misusing these platforms and turning them into tools of influence with many negative societal impacts. Examples of the dark side of information operations spread by malicious users or groups of users on online social networks include influencing the public's political decisions, encouraging anti-government protests, performing cyberattacks, spreading fake news, attacking smart infrastructure networks, interrupting or stopping transportation systems, shutting down education institutions, conducting cyber operations, organizing protests that shut down administration buildings, and encouraging young generations to accept or follow radical agendas. All of these negative behaviors are part of a shift in how online information is viewed and in malicious users’ ability to impact millions of honest online users. For example, Facebook was used to motivate millions of online users to participate in the Egyptian revolution in 2009 (Şen, Wigand, Agarwal, Tokdemir & Kasprzyk, 2016). In another example, a Twitter network was used to spread information about Saudi women's driving activities in 2011 (Şen et al., 2016). Further, a YouTube channel was used to spread a conspiracy theory about the South China Sea conflict in 2016 (Alassad, Hussain & Agarwal, 2019), and a video channel on YouTube was used by malicious commenters to spread radical information about other shared videos (Alassad, Agarwal & Hussain, 2019). Finally, many other recent movements, such as the “Yellow Vest Movement”, the “Hong Kong Protest”, and the “Iraqi Protest in October 2019”, were managed, directed, and controlled through online social media platforms.

In this paper, we discuss several real-world scenarios that focus on the negative impacts generated by the dark side of online information behavior. Fig. 1 offers one way to view these scenarios: the core of the environment is the online social networks themselves, extending to the millions of people who have access to those platforms. The next layer considers the tactics, techniques, and procedures used to spread negative information in different sectors of the environment. Finally, the outer layer is the resulting impact of the negative information operations spread by malicious users. One scenario occurs when malicious users coordinate to spread negative information about a government's poor services and push agendas to attack government infrastructure networks such as transportation systems, closing important highways, bridges, major airports, and subways in big cities. A second scenario occurs when malicious users spread negative information complaining about the education system and push agendas that influence their followers to support shutting down universities, schools, and colleges in different parts of a city. All of these scenarios, and many others, are part of the dark side of online information. Well-known social platforms are weaponized by malicious users to destabilize economies, politics, security, or social well-being. Fig. 1 summarizes the negative impacts on infrastructure networks in cities and municipalities generated by malicious users on social media platforms. Malicious users can conduct malicious activities on different platforms to coordinate multiple attacks on any or all of a smart city's important assets at any time.

Investigating the spread of adversarial information in online networks is similar to investigating hidden malicious groups spreading conspiracy theories in social networks, where traditional clustering methods are not effective at identifying them (Şen et al., 2016). The challenges with these methods will be discussed further in Sections 2 and 3. Currently, there are also no clear systematic methods, solutions, or strategies to identify and quarantine dark information or stop malicious actions conducted by coordinating users. One of the random strategies used to stop such behaviors is suspending arbitrary central users or shutting down the entire Internet service in big cities. However, such actions are always followed by negative consequences for thousands of users. It is nearly impossible to analyze all information spread across a complex network, identify central users' actions, or track their connections in complex social networks. In addition, cutting off Internet service in big cities could be the worst solution, given that it would impact millions of lives, communications networks, transportation systems, and smart grids. Such a drastic solution would also likely result in a significant economic loss and could affect security, law enforcement, and the many other smart infrastructure networks connected to the Internet, as shown in Fig. 1.

In recent years, many robust quantitative approaches have been applied to analyze user metrics in complex social networks. The most common are centrality methods such as degree, betweenness, closeness, and eigenvector centrality at the individual level (Zafarani, Abbasi & Liu, 2014). Methods such as modularity also help to characterize the community structure at the network or group level (Zafarani et al., 2014). However, neither family of traditional methods can by itself identify active hidden groups within a complex network, as explained below (Şen et al., 2016).
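
As a rough illustration of the individual-level and group-level measures named above, the following minimal Python sketch computes degree centrality, closeness centrality, and the Newman-Girvan modularity of a given partition. The toy graph (two triangles joined by one bridge edge), the function names, and the partition are illustrative assumptions and are not data or code from this study.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """(reachable nodes) / (sum of BFS distances), per source node."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def modularity(adj, communities):
    """Newman-Girvan modularity Q of a partition given as a list of sets."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for community in communities:
        # Each internal edge is seen from both endpoints, hence the / 2.
        internal = sum(1 for u in community for w in adj[u] if w in community) / 2
        degree_sum = sum(len(adj[u]) for u in community)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = build_adjacency(edges)
print(degree_centrality(adj))
print(closeness_centrality(adj))
print(modularity(adj, [{0, 1, 2}, {3, 4, 5}]))
```

On this toy graph the two bridge endpoints (nodes 2 and 3) score highest on both centrality measures, and the two-triangle partition has modularity 5/14. Neither number says which small set of users is jointly influential, which is the gap the paragraph above describes.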

To overcome the limitations in traditional community detection methods, this study proposes a novel systematic method that considers individual-based community detection algorithms using the betweenness centrality method (Freeman, 1978; Zafarani et al., 2014) together with the group-based community detection algorithms using the spectral modularity method (Tsung, Ho, Chou, Lin & Lee, 2017) and deviant cyber flash mob power calculation to identify intensive hidden sets of users in complex social networks. However, considering these two community detection categories alone would lack the depth and insight into finding the most influential malicious sets of users and network connections that would maximize the spread of negative information in online social networks. Therefore, we propose an integrated model, developing the individual-level measure which considers the users’ betweenness centrality value, and the group-level measure utilizing the spectral modularity method employed to measure the groups’ influence in the entire network.
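
The individual-level measure used here, betweenness centrality, can be computed with Brandes' algorithm. The sketch below is a generic textbook version for unweighted, undirected graphs, run on a hypothetical toy graph. The function names and the graph are assumptions for illustration, and the code is not the optimization model proposed in this paper.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths (sigma) and predecessors.
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is visited from both ends.
    return {v: score / 2 for v, score in bc.items()}


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
scores = betweenness_centrality(build_adjacency(edges))
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, score)
```

Here nodes 2 and 3 each lie on the shortest paths of 6 node pairs, while every other node scores 0, so a betweenness ranking singles out the brokers between the two groups.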

Fig. 2 shows the proposed systematic structure and how it would overcome the limitations of both user-based and group-based community detection algorithms and utilize the deviant cyber flash mobs method to verify the outcomes. The result is a bi-level model, which we refer to as Focal Structures of the Deviant Cyber Flash Mobs, whose resultant focal structures cannot be discovered by regular community detection algorithms alone (Şen et al., 2016). The model identifies sets of users hidden within the network as active online groups who are able to maximize the spread of negative information in the entire network.
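
For the group-level side of such a bi-level model, one widely used spectral approach is Newman's leading-eigenvector bisection of the modularity matrix B_ij = A_ij - k_i k_j / (2m). The sketch below uses that approach as a stand-in; the spectral modularity formulation used in this study (Tsung et al., 2017) may differ in detail. The power iteration, the diagonal shift, the starting vector, and the toy graph are all assumptions made for illustration.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def modularity_matrix(adj, nodes):
    """B[i][j] = A[i][j] - k_i * k_j / (2m), for nodes in the given order."""
    two_m = sum(len(adj[v]) for v in nodes)
    return [
        [(1.0 if w in adj[v] else 0.0) - len(adj[v]) * len(adj[w]) / two_m
         for w in nodes]
        for v in nodes
    ]


def leading_eigenvector(matrix, iterations=500):
    """Power iteration on (matrix + shift * I).

    The shift (largest absolute row sum) makes all eigenvalues non-negative,
    so the iteration converges to the eigenvector of the largest eigenvalue.
    """
    n = len(matrix)
    shift = max(sum(abs(x) for x in row) for row in matrix)
    # Non-constant start: the all-ones vector is itself an eigenvector of B.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [
            sum(matrix[i][j] * vec[j] for j in range(n)) + shift * vec[i]
            for i in range(n)
        ]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    return vec


def spectral_bisection(adj):
    """Split nodes into two groups by the sign of the leading eigenvector."""
    nodes = sorted(adj)
    vec = leading_eigenvector(modularity_matrix(adj, nodes))
    group_a = {v for v, x in zip(nodes, vec) if x >= 0}
    return group_a, set(nodes) - group_a


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
print(spectral_bisection(build_adjacency(edges)))
```

On the toy graph the sign pattern separates the two triangles. A real network would call for a linear algebra library and repeated subdivision, both of which this sketch omits.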

The contributions of this research are as follows. First, the model overcomes the limitations of traditional community detection algorithms by introducing the betweenness-modularity model shown in Fig. 2. Second, the bi-level model systematically integrates traditional methods: the betweenness centrality method in the first-level, individual-based analysis; the traditional modularity method in the second-level, group-based analysis; and DCFM in the third level of the analysis. Third, the model utilizes small-world metrics to evaluate the identified focal structures and then evaluates them using the deviant cyber flash mob (DCFM) detection method to determine whether the users and groups are able to operate in the network. Fourth, the research presents a novel way to illustrate the hub-and-spoke strategy used by terrorist groups. Fifth, we evaluated the proposed model's performance by comparing the betweenness-modularity model to other centrality methods such as the degree, closeness, and eigenvector methods. The final contribution is an effective mechanism and strategy for finding the intensive groups within a network, measuring each set's influence at both the individual and community levels, and then suspending these malicious sets of coordinating users, thereby stopping the spread of negative information throughout online social networks.
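
Small-world structure is commonly summarized by the average clustering coefficient and the average shortest-path length. This section does not state which small-world metrics the study computes, so the following Python sketch is only an assumed, minimal illustration of those two quantities on a hypothetical connected toy graph; the function names are likewise illustrative.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count neighbor pairs that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean BFS distance over all ordered pairs of mutually reachable nodes."""
    total = 0
    pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = build_adjacency(edges)
print(average_clustering(adj))
print(average_path_length(adj))
```

For this toy graph the average clustering coefficient is 7/9 and the average path length is 1.8. High clustering combined with short paths is the usual small-world signature, although a six-node example is far too small to draw conclusions from.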

The rest of the paper is organized as follows. Section 2 presents the research motivations and the problem definition. Section 3 discusses related work. In Section 4, we summarize the data sets used in this paper. Section 5 describes the research methodology. A toy case study is reviewed in Section 6, and a real-world social network case study is implemented in Section 7. Section 8 validates the results and model performance. Section 9 summarizes the research, findings, and future work.

Section snippets

Research motivation and challenge

The main objective in this research is to introduce a systematic procedure to integrate complex methods to cluster hidden groups in social network analysis. This research proposes a way to investigate, identify, and suspend malicious groups spreading negative information in online social networks. These groups are responsible for dissemination of conspiracy theories and negative information spread to the different parts of a network. The identified groups are also able to control information

Literature review

We classify the related works into two main types, namely, community detection and focal structure analysis. We also provide a cursory review of misinformation, disinformation and online fake news as the dark side of online information behavior.

Dataset

In this research, we use two real networks of users posting radical messages on Twitter and YouTube to paralyze daily life and influence other users in complex networks. These malicious users could be responsible for organizing multiple cyberattacks to maximize damage to the network and for spreading fake news to convince their followers to participate in or create their own campaigns in different locations.

Systems design in social networks

Users’ behavior on online social media is reflected in their activities, interests, and social behaviors. Information shared through objects such as pictures and video posts that is supported by factual evidence can be considered truthful and is defined to be accurate (Shu et al., 2017). However, malicious users occupy online platforms to spread negative information (such as fake news, misinformation, and disinformation) related to their agendas and their political and marketing gains, without true evidence. The essential

Results

In this section, we employ a small social network explaining the complexity of the analysis, utilizing a fake news YouTube channel dataset described in Section 4.1.

Twitter network scenario - Resolving the negative influence in ISIS network

To measure the model's effectiveness on real-world datasets, we applied the above steps to different social media networks as shown in Table 4. For this section, we analyze the ISIS network shown in Fig. 4, where the Deviant Cyber Flash Mobs (DCFM) model developed by Al-khateeb and Agarwal (2014) was employed to calculate the sets’ network power, interest, and control. The integration of the DCFM model into the analysis will be used as an additional supportive

Validation-operation level on ISIS network

The validation process explained in Fig. 8 includes the results from three different methods, where the first and second validation processes are the modularity method and Depth-First Search method, and the third method is to measure the sets’ power, interest and control by DCFM method. This part of the procedure provides robust support to the model's findings from both the user-level and network-level, where the user-level only investigates the users’ aspects, where the model worked to find

Conclusion

In this paper, we explained the integrated bi-level max-max model consisting of the user-level analysis (betweenness centrality) and the group-level analysis (spectral modularity) to identify malicious sets of users in a Twitter ISIS network. This model was able to identify focal sets of malicious users hidden in the network; these users are able to communicate, act in different groups, and interact with the maximum number of users in the network. A novel model proposed in this paper to

Author statement

We would like to resubmit the revised manuscript titled “Combining Advanced Computational Social Science and Graph Theoretic Techniques to Reveal Adversarial Information Operations” for publication in journal of dark side of online information behavior. Thank you in advance for any consideration given this manuscript.

Acknowledgements

This research is funded in part by the U.S. National Science Foundation (OIA-1946391, OIA-1920920, IIS-1636933, ACI-1429160, and IIS-1110868), U.S. Office of Naval Research (N00014–10–1–0091, N00014–14–1–0489, N00014–15-P-1187, N00014–16–1–2016, N00014–16–1–2412, N00014–17–1–2675, N00014–17–1–2605, N68335–19-C-0359, N00014–19–1–2336, N68335–20-C-0540), U.S. Air Force Research Lab, U.S. Army Research Office (W911NF-17-S-0002, W911NF-16–1–0189), U.S. Defense Advanced Research Projects Agency

References (62)

  • G. Wang et al.

    Measure of centrality based on modularity matrix

    Progress in Natural Science

    (2008)
  • F. Zou et al.

    Inverse modelling-based multi-objective evolutionary algorithm with decomposition for community detection in complex networks

    Physica A: Statistical Mechanics and Its Applications

    (2019)
  • N. Agarwal et al.

    Identifying the Influential Bloggers in a Community

  • N. Agarwal et al.

    Modeling blogger influence in a community

    Social Network Analysis and Mining

    (2012)
  • M. Alassad et al.

    Examining Intensive Groups in YouTube Commenter Networks

  • M. Alassad et al.

    Finding Fake News Key Spreaders in Complex Social Networks by Using Bi-Level Decomposition Optimization Method

  • S. Al-khateeb et al.

    Modeling flash mobs in cybernetic space: Evaluating threats of emerging socio-technical behaviors to human security

    Proceedings - 2014 IEEE Joint Intelligence and Security Informatics Conference, JISIC 2014

    (2014)
  • S. Al-khateeb et al.

    Deviance in social media and social cyber forensics uncovering hidden relations using open source information (OSINF)

    (2019)
  • A. Al-Rubaye et al.

    Extracting Social Structures from Conversations in Twitter

  • Barrenas, F., Chavali, S., Holme, P., Mobini, R., & Benson, M. (2009). Network measures. 1–4....
  • V.D. Blondel et al.

    Fast unfolding of communities in large networks

    Journal of Statistical Mechanics: Theory and Experiment

    (2008)
  • E.J. Briscoe et al.

    Determining credibility from social network structure

  • W. Chen et al.

    Efficient Influence Maximization in Social Networks

  • T.-S. Chua

    The Multimedia Challenges in Social Media Analytics

  • A. Clauset et al.

    Finding community structure in very large networks

    Cond-Mat/0408187

    (2004)
  • R. Dale

    NLP in a post-truth world

    Natural Language Engineering

    (2017)
  • K. Faust et al.

    Social network analysis

    (1994)
  • L.C. Freeman

    A Set of Measures of Centrality Based on Betweenness

    Sociometry

    (1977)
  • M. Girvan et al.

    Community structure in social and biological networks

    PNAS

    (2002)
  • M. Gregori et al.

    Comparing operational terrorist networks

    Trends in Organized Crime

    (2020)
  • L. Hagen et al.

    New Spectral Methods for Ratio Cut Partitioning and Clustering

    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

    (1992)