Survey
Multilayer network simplification: Approaches, models and methods

https://doi.org/10.1016/j.cosrev.2020.100246

Abstract

Multilayer networks have been widely used to represent and analyze systems of interconnected entities where both the entities and their connections can be of different types. However, real multilayer networks can be difficult to analyze because of irrelevant information, such as layers not related to the objective of the analysis, because of their size, or because traditional methods defined to analyze simple networks do not have a straightforward extension able to handle multiple layers. Therefore, a number of methods have been devised in the literature to simplify multilayer networks with the objective of improving our ability to analyze them. In this article we provide a unified and practical taxonomy of existing simplification approaches, and we identify categories of multilayer network simplification methods that are still underdeveloped, as well as emerging trends.

Introduction

The network analysis and mining research field has risen in popularity in the last two decades, thanks to the ability of networks to represent a wide range of real-life phenomena, from physical to biological and social systems, from scientific to financial data, transportation routes, and many more. In this regard, the multilayer network model is widely used as a powerful tool to represent the organization and relationships of complex data in many domains. Multilayer networks, which initially gained momentum in social computing [1], are designed to provide a more realistic representation of the different and heterogeneous relations that may characterize an entity in the network system. For instance, a multilayer network enables an expressive way to model different types of social relations among the same set of individuals, where layers correspond to different on-line as well as off-line relations (e.g., following, co-authorship, co-working relations, and so on).

However, as we already witnessed at the beginning of the data mining era, the availability of huge amounts of complex network data represents an invaluable potential but also inevitably leads to processing issues. Just think of the number of monthly active users for the main online social networks, which is, at the time of writing, around 335 million for Twitter and more than 2 billion for Facebook. Modeling these networks in their entirety for analysis purposes becomes unfeasible in most cases, and focusing on limited portions of the network (e.g., related to specific phenomena or geographical areas) is likely to cause problems in the boundary specification [2], [3], i.e., the choice of which entities and relations should be included in the data. Moreover, when dealing with multilayer networks, the boundary specification problem is amplified further: in fact, we can recognize a horizontal boundary specification problem for each layer similar to the one observed for single-layer networks, that is, the choice of which actors to include in the network, and a vertical boundary specification problem [1], i.e., the problem of choosing which types of relations should be represented in the network (i.e., how many layers and with which semantics). Given these premises, it is easy to understand how most network data modeled upon real-world phenomena may be incomplete and/or noisy: in fact, relations that are supposed to be central for a specific analysis task may be missing, or hidden under a considerable amount of irrelevant information. In certain cases, the existence of the relations and their strength may not even be possible to determine with certainty, leading to probabilistic representations [4].

Several network processing techniques have been proposed to partially overcome the above problems in order to enable complex analysis tasks on very large networks. Our goal in this work is to bring order to the existing literature on approaches, models and methods for simplification tasks in multilayer networks. With the term “simplification” here we refer to a specific type of network manipulation that aims at simplifying the structure of a network. We deliberately use the term with quite a broad meaning, which nevertheless does not coincide, and hence should not be confused, with the mechanism of mapping multiple edges to single edges and removing self-loops. Rather, the choice of such a broad term derives from the observation that although a large number of techniques that may be described as simplification techniques have been proposed in the literature, most of them were designed to solve specific problems in different domains; by contrast, nowadays we recognize a clear need to systematize these techniques in the context of complex network data, with emphasis on the multilayer network model, yet regardless of the peculiarities of a particular application domain. Network simplification can be seen as a special case of manipulation, which also includes other tasks such as perturbation and refinement. The former includes techniques designed for altering information encoded in a network, generally for privacy reasons (such as obfuscation and encryption techniques), whereas the latter refers to methods that are conceived to infer missing relations or attributes, or to correct the information encoded in a network (e.g., based on ontological facts).

We identify three broad categories of network simplification: selection, aggregation, and transformation. Selection methods operate on a multilayer network to reduce its size by filtering or sampling subsets of nodes, edges and/or layers, according to specific features of the entities involved or predefined model characteristics to preserve. Aggregation refers to various approaches to define partitional or hierarchical grouping mechanisms that involve nodes, edges or layers, such as layer-based flattening, coarsening, summarization, community detection, and positional equivalence. Transformation approaches are divided into projection and graph embedding methods. Projection methods are designed to deal with different node (entity) types in a network, and aim to replace nodes of selected types with relations. Finally, graph embedding techniques aim to transform a graph into a low-dimensional, vectorial representation, which is also a key enabler for machine and deep learning tasks.
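To make the three categories concrete, the following is a minimal sketch rather than any of the surveyed methods. It assumes undirected intra-layer edges stored as hypothetical (actor, actor, layer) triples and a hypothetical list of (actor, group) memberships for the projection case; all function names and data are illustrative only. Graph embedding is omitted because it typically relies on a learned model.

```python
from collections import defaultdict

# Hypothetical toy data: undirected intra-layer edges as (actor, actor, layer).
EDGES = [
    ("ann", "bob", "work"),
    ("bob", "cat", "work"),
    ("ann", "bob", "friend"),
    ("ann", "dan", "friend"),
]


def select_layers(edges, keep):
    """Selection: filter the network down to the edges of the chosen layers."""
    return [(u, v, layer) for u, v, layer in edges if layer in keep]


def flatten(edges):
    """Aggregation (layer-based flattening): merge all layers into a single
    weighted graph. The weight counts how many (edge, layer) entries connect
    each pair of actors."""
    weights = defaultdict(int)
    for u, v, _layer in edges:
        weights[frozenset((u, v))] += 1
    return dict(weights)


def project(memberships):
    """Transformation (projection): replace nodes of one type (groups) with
    relations between nodes of the other type (actors sharing a group)."""
    by_group = defaultdict(set)
    for actor, group in memberships:
        by_group[group].add(actor)
    pairs = set()
    for members in by_group.values():
        ordered = sorted(members)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                pairs.add((a, b))
    return pairs


work_only = select_layers(EDGES, {"work"})
flat = flatten(EDGES)
coauthors = project([("ann", "p1"), ("bob", "p1"), ("bob", "p2"), ("cat", "p2")])
```

In this sketch, selection shrinks the input, flattening trades layer information for a single weighted graph, and projection removes one node type entirely; which loss of information is acceptable depends on the analysis task.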

Motivations for performing a simplification task on a multilayer network are manifold, and they often arise from different requirements in the target application domain. In this regard, we can recognize the following computational aspects for which a network simplification task can be beneficial:

  • By solving noise or incompleteness issues in a complex network, the relevant information contained in the network will more easily be unveiled, leading to improved data quality. This is expected to have a beneficial impact on the effectiveness of methods to be applied for further analysis tasks.

  • Simplifying a complex network can lead to improved performance of further computational analysis methods which may struggle with efficiently handling very large networks.

  • Simplifying a complex network can also enable the application of an existing method originally conceived for simple (i.e., monoplex) networks, or can help cope with model compatibility issues when it is not possible to apply a selected method on a given network model.

Contributions. In this work, we provide the first conceptualization of the network simplification problem for multilayer networks, for which we recognize and formally define three main categories. According to this classification, we propose a formal systematization of approaches, models and methods related to network simplification tasks.

One major goal of our work is to discuss how simplification approaches that were conceived for simple networks could be extended, adapted, or redefined to deal with multilayer networks. In this regard, whenever the recent literature does not support this goal, we try to hint at methodological solutions for specific classes of simplification techniques for multilayer networks.

Limitations and scope. In this work, we will focus on a topology-driven multilayer network model, therefore we will leave out of consideration techniques that are designed to deal with node and/or edge attributes, such as reduction of the number of attributes associated to nodes/edges in a network (e.g., feature selection methods), or reduction of the cardinality of the value set for a certain attribute (e.g., discretization, binning). We consider the above techniques closer to a traditional data mining scenario than to a network mining one, and they are often domain-specific. More details on feature selection, discretization and other methods focusing on attribute values rather than network structure can be found in most data mining and machine learning textbooks.

It should be noted that, although the focus of this work is on multilayer networks, we will also discuss how simplification techniques that were not originally conceived for multilayer networks can be applied to such networks. In this respect, we refer the reader to more focused surveys that cover one or more topics related to the ones discussed in this work but referring to single-layer networks only. For instance, Liu et al. [5] overview methodologies for static and dynamic graph summarization, which can also support related tasks, such as compression and clustering; Beck et al. [6] provide a comprehensive survey on visualization of dynamic graphs, which has also attracted increasing interest from different research communities.

Plan of this paper. The rest of the paper is organized as follows. We provide formal definitions for each of the three network simplification categories in Section 2. Accordingly, in Section 3 we classify existing methods in the literature in the context of our taxonomy and provide an overview of the main methods for each category, so that the readers can use this article to identify potentially useful approaches for their simplification problems. This overview of the literature allows us to identify categories of multilayer network simplification methods that are still underdeveloped, as well as emerging trends, as a starting point for future research. A forward-looking discussion of these and other general aspects emerging from our classification and literature review is presented in Section 4. Moreover, we review the available software implementations of methods for multilayer network analysis with emphasis on network simplification. One of the objectives of this article is indeed to boost the integration of individual methods into more general libraries and frameworks, to make them more easily usable and extensible. Finally, in Section 5, we sum up the limitations of existing methods for multilayer network simplification, and draw several pointers for future research.


Definitions of network simplification

Given a set of actors A and a set of layers L, a multilayer network is defined as a quadruple G = (A, L, V, E) where (V, E) is a graph, V ⊆ A × L and E ⊆ V × V. Each actor must be present in at least one layer, but each layer is not required to contain all actors. Each node in one layer could be linked to nodes corresponding to the same actor in a few or all other layers; in the multiplex setting, the inter-layer links only connect the same actor in different layers.
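The constraints in this definition can be checked mechanically. The following is a minimal sketch, assuming nodes are represented as (actor, layer) pairs and edges as pairs of nodes; the function names and the toy data are hypothetical and serve only to illustrate the definition.

```python
from itertools import product


def is_valid_multilayer(actors, layers, nodes, edges):
    """Check the constraints of G = (A, L, V, E) from the definition."""
    # V must be a subset of A x L.
    if not nodes <= set(product(actors, layers)):
        return False
    # E must be a subset of V x V.
    if any(u not in nodes or v not in nodes for u, v in edges):
        return False
    # Each actor must be present in at least one layer.
    present = {actor for actor, _layer in nodes}
    return present == set(actors)


def is_multiplex(edges):
    """True if every inter-layer edge connects the same actor in two layers."""
    return all(a1 == a2 for (a1, l1), (a2, l2) in edges if l1 != l2)


# Hypothetical toy network: "bob" is absent from the "friend" layer and
# "cat" is absent from the "work" layer, which the definition allows.
actors = {"ann", "bob", "cat"}
layers = {"work", "friend"}
nodes = {("ann", "work"), ("bob", "work"), ("ann", "friend"), ("cat", "friend")}
edges = {
    (("ann", "work"), ("bob", "work")),       # intra-layer edge
    (("ann", "friend"), ("cat", "friend")),   # intra-layer edge
    (("ann", "work"), ("ann", "friend")),     # inter-layer edge, same actor
}
```

Adding an edge such as (("bob", "work"), ("cat", "friend")) would keep the network a valid multilayer network under this check, but it would no longer satisfy the multiplex restriction.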

In the following, we provide a

Tidying up network simplification literature

In this section we elaborate on each of the previously presented network simplification categories and on the related methods in the literature.

Fig. 5 shows our hierarchy of categories and subcategories of simplification techniques. Moreover, as a guide to our discussion, Table 2 reports on main characteristics of the approaches developed for network simplification, organized according to the above provided categorization. For each method, the table shows: the type of information which is

Discussion

Building upon our analysis in the previous sections, here we provide a few remarks that are concerned with the following two questions: (RQ1) What are the main characteristics that would make a given approach appropriate or not for a given simplification task? and, (RQ2) How is research on network simplification going to evolve, given the existing corpus of simplification methods?

Concerning RQ1, we will focus on practical usability criteria to determine whether a simplification approach is

Research directions

In relation to RQ2 stated in the previous section, here we focus on the future evolution of multilayer network simplification, which is still in its infancy. We shall identify underrepresented categories of simplification methods for multilayer networks. We will highlight the most evident limitations of the existing methods for each of the categories, and point to the emergence of novel classes of methods for enhancing network simplification tasks.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (143)

  • Luczkovich, Joseph J. et al.

    Defining and measuring trophic role similarity in food webs using regular equivalence

    J. Theoret. Biol.

    (2003)
  • Doreian, Patrick et al.

    Generalized blockmodeling of two-mode network data

    Social Networks

    (2004)
  • Žiberna, Aleš

    Blockmodeling of multilevel networks

    Social Networks

    (2014)
  • Feder, Tomas et al.

    Clique partitions, graph compression and speeding-up algorithms

    J. Comput. System Sci.

    (1995)
  • Maneth, Sebastian et al.

    Grammar-based graph compression

    Inf. Syst.

    (2018)
  • Seierstad, Cathrine et al.

    For the few not the many? The effects of affirmative action on presence, prominence, and social capital of women directors in Norway

    Scand. J. Manag.

    (2011)
  • Padrón, Benigno et al.

    Alternative approaches of transforming bimodal into unimodal mutualistic networks. The usefulness of preserving weighted information

    Basic Appl. Ecol.

    (2011)
  • Opsahl, Tore

    Triadic closure in two-mode networks: Redefining the global and local clustering coefficients

    Social Networks

    (2013)
  • Dickison, M.E. et al.

    Multilayer Social Networks

    (2016)
  • Laumann, Edward O. et al.

    The boundary specification problem in network analysis

    Appl. Netw. Anal.

    (1983)
  • Parchas, Panos et al.

    Uncertain graph processing through representative instances

    ACM Trans. Database Syst.

    (2015)
  • Liu, Yike et al.

    Graph summarization methods and applications: A survey

    ACM Comput. Surv.

    (2018)
  • Beck, Fabian et al.

    A taxonomy and survey of dynamic graph visualization

    Comput. Graph. Forum

    (2017)
  • Piotr Bródka, Krzysztof Skibicki, Przemyslaw Kazienko, Katarzyna Musial, A degree centrality in multi-layered social...
  • Albert Solé-Ribalta, Manlio De Domenico, Sergio Gómez, Alex Arenas, Centrality rankings in multiplex networks, in:...
  • Tanmoy Chakraborty, Ramasuri Narayanam, Cross-layer betweenness centrality in multiplex networks with applications, in:...
  • Battiston, Federico et al.

    Efficient exploration of multiplex networks

    New J. Phys.

    (2016)
  • De Domenico, Manlio et al.

    Navigability of interconnected networks under random failures

    Proc. Natl. Acad. Sci. USA

    (2014)
  • Sude Tavassoli, Katharina A. Zweig, Most central or least central? How much modeling decisions influence a node’s...
  • Galimberti, Edoardo et al.

    Core decomposition and densest subgraph in multilayer networks

  • Jacob D. Moorman, Qinyi Chen, Thomas K. Tu, Zachary M. Boyd, Andrea L. Bertozzi, Filtering methods for subgraph...
  • Basaras, Pavlos et al.

    Identifying influential spreaders in complex multilayer networks: A centrality perspective

    IEEE Trans. Netw. Sci. Eng.

    (2019)
  • Serrano, M. Ángeles et al.

    Extracting the multiscale backbone of complex weighted networks

    Proc. Natl. Acad. Sci.

    (2009)
  • Radicchi, Filippo et al.

    Information filtering in complex weighted networks

    Phys. Rev. E

    (2011)
  • Mastrandrea, Rossana et al.

    Enhanced reconstruction of weighted networks from strengths and degrees

    New J. Phys.

    (2014)
  • Dianati, Navid

    Unwinding the hairball graph: Pruning algorithms for weighted complex networks

    Phys. Rev. E

    (2016)
  • Squartini, Tiziano et al.

    Unbiased sampling of network ensembles

    New J. Phys.

    (2015)
  • Gemmetto, Valerio et al.

    Irreducible network backbones: unbiased graph filtering via maximum entropy

    (2017)
  • Giona Casiraghi, Vahan Nanumyan, Ingo Scholtes, Frank Schweitzer, From Relational data to graphs: Inferring significant...
  • Domenico Mandaglio, Alessia Amelio, Andrea Tagarelli, Consensus community detection in multilayer networks using...
  • Lee, Sang Hoon et al.

    Statistical properties of sampled networks

    Phys. Rev. E

    (2006)
  • Jure Leskovec, Jon Kleinberg, Christos Faloutsos, Graphs over time, in: Proc. ACM SIGKDD Int. Conf. on Knowledge...
  • Jure Leskovec, Christos Faloutsos, Sampling from large graphs, in: Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery...
  • Gjoka, Minas et al.

    Multigraph sampling of online social networks

    IEEE J. Sel. Areas Commun.

    (2011)
  • Khadangi, Ehsan et al.

    Biased sampling from Facebook multilayer activity network using learning automata

    Appl. Intell.

    (2016)
  • Newman, M.E.J. et al.

    Finding and evaluating community structure in networks

    Phys. Rev. E

    (2004)
  • Clauset, Aaron et al.

    Finding community structure in very large networks

    Phys. Rev. E

    (2004)
  • Tang, Lei et al.

    Uncovering groups via heterogeneous interaction analysis

  • Kim, Jungeun and Lee, Jae-Gil

    Community detection in multi-layer graphs: A survey

    SIGMOD Rec.

    (2015)
  • Loe, Chuan W. et al.

    Comparison of communities detection algorithms for multiplex

    Physica A

    (2015)