1 Introduction

The concept of a Reeb graph of a Morse function first appeared in [16] and has subsequently been applied to problems in shape analysis in [13, 18]. The literature on Reeb graphs in computational geometry and computational topology is ever growing (see, e.g., [2, 3] for a discussion and references). The Reeb graph plays a central role in topological data analysis, not least because of the success of Mapper [19], a data analysis method providing a discretization of the Reeb graph for a function defined on a point cloud.

A recent line of work has concentrated on questions about identifying suitable notions of distance between Reeb graphs. These include the so-called functional distortion distance [2], the interleaving distance [8], and various graph edit distances [1, 9, 11]. Naturally, there is a strong interest in understanding the connection between different existing distances. In this regard, it has been shown in [3] that the functional distortion and the interleaving distances are bi-Lipschitz equivalent. The edit distances defined in [9, 11] for Reeb graphs of curves and surfaces, respectively, are shown to be universal in their respective settings, so the functional distortion and interleaving distances restricted to the same settings are a lower bound for those distances. Moreover, an example in [9] shows that the functional distortion distance can be strictly smaller than the edit distance considered in that paper.

In this paper, we consider the setting of piecewise linear (PL) functions on compact triangulable spaces, and in this realm we study the properties of stability and universality of distances between Reeb graphs. The notion of stability has been introduced by Cohen-Steiner et al. [6] in the context of persistence diagrams and is a key property for topological descriptors [15]. Stability means that two objects at a given distance are assigned descriptors at no more than (a multiple of) that distance. This requires a notion of distance on both the collection of objects and on the collection of descriptors. The practical relevance of stability lies in the guaranteed robustness of the method with respect to bounded imprecision, caused by noise, coarse sampling, or other sources of uncertainty. However, the stability of a descriptor is not sufficient to warrant discriminativeness, i.e., the ability to distinguish different objects: a construction that assigns to every object the same descriptor is certainly stable, but contains no information. For that reason, given a fixed distance on the objects and a construction for a descriptor, it is desirable to assign to the descriptors a distance that is as large as possible while still satisfying the stability property. In that sense, such a distance is then the most discriminative stable distance. Following Lesnick [14], we call such a distance universal, noting that the concept already appears in [7] in the context of topological descriptors.

Inspired by a construction of a distance between filtered spaces [15], we first construct a novel distance \({d}_{U}\) based on joint pullbacks of two given Reeb graphs and prove that this distance satisfies both stability and universality and is also intrinsic. By analyzing a specific construction, we then prove that neither the functional distortion distance nor the interleaving distance is universal. Finally, we define two additional edit-like distances between Reeb graphs that reinterpret those appearing in [1, 9, 11] and prove that both are stable and universal. As a consequence, both distances agree with \({d}_{U}\).

2 Topological Aspects of Reeb Graphs

We start by exploring some topological ideas behind the definition of Reeb graphs. Unless specified otherwise, all maps and functions considered in this paper will be assumed to be continuous.

2.1 Reeb Graphs as Quotient Spaces

The classical construction of a Reeb graph [16] is given via an equivalence relation as follows:

Definition 1

For \(f:X\rightarrow {\mathbb {R}}\) a Morse function on a compact smooth manifold, the Reeb graph of f is the quotient space \(X/{\sim }_{f}\), with \(x\sim _{f} y\) if and only if x and y belong to the same connected component of some level set \(f^{-1}(t)\) (implying \(t=f(x)=f(y)\)).

While this definition was originally considered in the setting of Morse theory, it does not make explicit use of the smooth structure, and so it can be applied quite broadly. However, some additional assumptions on the space X and the function f are necessary in order to maintain some of the characteristic properties of Reeb graphs in a generalized setting. With this motivation in mind, we revisit the definition in terms of quotient maps and functions with discrete fibers.

A quotient map \(p:X\rightarrow Y\) is a surjection such that a set U is open in Y if and only if \(p^{-1}(U)\) is open in X. In particular, by the closed map lemma, any surjection between compact Hausdorff spaces is a quotient map. A quotient map \(p:X\rightarrow Y\) is characterized by the universal property that a set map \(\varPhi : Y\rightarrow Z\) into any topological space Z is continuous if and only if \(\varPhi \circ p\) is continuous.

The motivation for considering quotient maps and functions with discrete fibers is explained by the following fact.

Proposition 1

Let \(f:X\rightarrow {\mathbb {R}}\) be a function with locally connected fibers, and let \(q:X\rightarrow X/{\sim }_{ f}\) be the canonical quotient map. Then the induced function \({{\tilde{f}}}: X/{\sim }_{ f}\rightarrow {\mathbb {R}}\) with \(f={{\tilde{f}}}\circ q\) has discrete fibers.

Proof

To see that the fibers of \({{\tilde{f}}}\) are discrete, we show that any subset S of \({{\tilde{f}}}^{-1}(t)\) is closed. Let \(T={{\tilde{f}}}^{-1}(t) \setminus S\). Then \(q^{-1}(T)\) is a disjoint union of connected components of \( f^{-1}(t)\). Since \(f^{-1}(t)\) is locally connected, each of its connected components is open in the fiber, and so \(q^{-1}(T)\) is open in \( f^{-1}(t)\), implying that \(q^{-1}(S)\) is closed in \(f^{-1}(t)\) and hence in X. Since q is a quotient map, \(q^{-1}(S)\) is closed if and only if S is closed, yielding the claim. \(\square \)

2.2 Reeb Quotient Maps and Reeb Graphs of Piecewise Linear Functions

We now define a class of quotient maps that leave Reeb graphs invariant up to isomorphism. The main goal is to provide a natural construction for lifting a function \(f:X\rightarrow {\mathbb {R}}\) to a space Y through a quotient map \(Y \rightarrow X\) in a way that yields isomorphic Reeb graphs. To this end, we will define a general notion of Reeb quotient maps and Reeb graphs.

Definition 2

A Reeb domain is a connected compact triangulable space. A Reeb quotient map is a surjective piecewise linear map of Reeb domains with connected fibers.

We remark that connectedness of Reeb domains is assumed only for the sake of simplicity (see Remark 4).

As shown in Corollary 1, Reeb domains with Reeb quotient maps constitute a subcategory of the category of triangulable spaces and piecewise linear maps.

Definition 3

A Reeb graph is a pair \((R_f, {{\tilde{f}}})\) where \(R_f\) is a Reeb domain endowed with a PL function \({{\tilde{f}}}:R_f\rightarrow {\mathbb {R}}\) with discrete fibers, called a Reeb function.

In particular, the isomorphisms between Reeb graphs are PL homeomorphisms that preserve the function values of the associated Reeb functions. While the definition does not assume this explicitly, a Reeb graph is indeed a finite topological graph (a compact triangulable space of dimension at most 1).

Proposition 2

For any Reeb graph \((R_f, {{\tilde{f}}})\), the space \(R_f\) is a finite topological graph.

Proof

By definition, \({{\tilde{f}}}\) is (simplexwise) linear for some triangulation of \(R_f\). If there were a simplex \(\sigma \) of dimension at least 2 in the triangulation of \(R_f\), then for any x in the interior of \(\sigma \), the intersection \(\sigma \cap {{\tilde{f}}}^{-1}({{\tilde{f}}}(x))\) would have to be of dimension at least 1. But this would contradict the assumption that \({{\tilde{f}}}\) has discrete fibers. \(\square \)

Definition 4

Generalizing the classical definition (Definition 1), we say that a Reeb graph \((R_f, {{\tilde{f}}})\) is a Reeb graph of \(f:X\rightarrow {\mathbb {R}}\) if there is a Reeb quotient map \(p: X \rightarrow R_f\) such that \(f={{\tilde{f}}} \circ p\).

We now proceed to prove that Reeb quotient maps are closed under composition. We start by showing that not only the fibers, but more generally all preimages of closed connected sets are connected.

Proposition 3

If \(p :X\rightarrow Y\) is a Reeb quotient map, then the preimage \(p^{-1}(K)\) of a closed connected set \(K \subseteq Y\) is connected.

Proof

Assume that K is nonempty; otherwise, the claim holds trivially. Let \(p^{-1}(K)=U\cup V\), with U and V nonempty and closed in \(p^{-1}(K)\). To show that \(p^{-1}(K)\) is connected, it suffices to show that \(U\cap V\) is necessarily nonempty.

Because \(p^{-1}(K)\) is closed in X, the sets U and V are also closed in X. The images p(U) and p(V) are closed by the closed map lemma, and their union is K. By connectedness of K, their intersection is nonempty. Let \(y\in p(U)\cap p(V)\). We have

$$\begin{aligned} p^{-1}(y)=(p^{-1}(y)\cap U)\cup (p^{-1}(y)\cap V). \end{aligned}$$

The subspaces \((p^{-1}(y)\cap U)\) and \((p^{-1}(y)\cap V)\) are closed in \(p^{-1}(y)\), and by connectedness of the fiber \(p^{-1}(y)\), their intersection must be nonempty. In particular, \(U\cap V\) is nonempty. \(\square \)

Corollary 1

If \(p :X\rightarrow Y\) and \(q :Y\rightarrow Z\) are Reeb quotient maps, then the composition \(q\circ p:X\rightarrow Z\) is a Reeb quotient map too.

As mentioned before, the main purpose of Reeb quotient maps is to lift Reeb functions to larger domains while maintaining the same Reeb graph. The following property is a consequence of the above statement:

Corollary 2

Let \((R_f, {{\tilde{f}}})\) be a Reeb graph of a function \(f: X \rightarrow {\mathbb {R}}\), and let \(q: Y \rightarrow X\) be a Reeb quotient map. Then \((R_f, {{\tilde{f}}})\) is also a Reeb graph of \(f\circ q: Y \rightarrow {\mathbb {R}}\).

Proof

Let \(p: X \rightarrow R_f\) be the Reeb quotient map factoring \(f = {{\tilde{f}}} \circ p\), as in the following diagram:

figure a

Then by Corollary 1, \((R_f, {{\tilde{f}}})\) is also a Reeb graph for \(f \circ q = {{\tilde{f}}} \circ (p \circ q): Y \rightarrow {\mathbb {R}}\) via the Reeb quotient map \(p \circ q: Y \rightarrow R_f\). \(\square \)

The following lemma shows how a transformation \(g = \chi \circ f \) of a function f lifts to a Reeb quotient map \(\zeta \) between the corresponding Reeb graphs.

Lemma 1

Consider a commutative diagram

figure b

where \((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})\) are Reeb graphs, \(p_f : X \rightarrow R_f,p_g : X \rightarrow R_g\) are Reeb quotient maps, and \(\chi : {{\,\mathrm{im}\,}}f \rightarrow {{\,\mathrm{im}\,}}g\) is a PL function such that \(g = \chi \circ f\). Then \(\zeta = p_g \circ p_f^{-1}\) is a Reeb quotient map from \(R_f\) to \(R_g\).

In particular, if \(\chi \) is a PL homeomorphism, then so is \(\zeta \). Note that the definition of \(\zeta \) does not involve the function \(\chi \); the existence of \(\chi \) already ensures that \(\zeta \) is a Reeb quotient map.

Proof

Let \(x \in R_f\), and let \(t={{\tilde{f}}}(x)\). Then \(C=p_f^{-1}(x)\) is a connected component of \(f^{-1}(t)\) by the assumption that \(p_f\) is a Reeb quotient map. By commutativity, we have

$$\begin{aligned}f^{-1} \subseteq f^{-1} \circ \chi ^{-1} \circ \chi = g^{-1} \circ \chi ,\end{aligned}$$

and since C is connected, there must be a single \(y \in R_g\) with \(p_g(C)=\{y\}\). Hence, \(\zeta = p_g \circ p_f^{-1}\) is a set map. Moreover, since \(p_g\) is continuous and \(p_f\) is closed, the map \(\zeta \) is continuous; since \(p_g\) and \(p_f\) are PL, the map \(\zeta \) is PL as well.

Now let \(y\in R_g\) and let \(s={{\tilde{g}}}(y)\). Similarly to above, \(C=p_g^{-1}(y)\) is a connected component of \(g^{-1}(s)\). We have \(p_f(C) = p_f \circ p_g^{-1}(y) = \zeta ^{-1}(y) \ne \emptyset \), so \(\zeta \) is surjective, and the fiber \(\zeta ^{-1}(y) = p_f(C)\) is connected as the image of a connected set. \(\square \)

Remark 1

By Proposition 1 and Lemma 1, given a Reeb graph \((R_f, {{\tilde{f}}})\) of \(f: X \rightarrow {\mathbb {R}}\) with Reeb quotient map \(p :X \rightarrow R_f\), there is a canonical isomorphism \(R_f \cong X/{\sim }_{f}\). As a consequence, the Reeb graph \((R_f, {{\tilde{f}}})\) together with the Reeb quotient map p is unique up to a unique isomorphism, thus defining the Reeb graph as a universal property.

We now show that Reeb quotient maps are stable under pullbacks.

Proposition 4

Consider a pullback diagram of PL maps \(p_1:X_1 \rightarrow Y\), \(p_2:X_2 \rightarrow Y\):

figure c

where, as usual, \(X_1\times _{Y}X_2=\{(x_1,x_2)\in X_1\times X_2:p_1(x_1)=p_2(x_2)\}\). If the map \(p_1\) (resp. \(p_2\)) is a Reeb quotient map, then so is the map \(q_2\) (resp. \(q_1\)). Hence, the class of Reeb quotient maps is stable under pullbacks.

Proof

First note that the category of compact triangulable spaces has all pullbacks [20]. For \(x_2\in X_2\), by surjectivity of \(p_1\) there is some \(x_1\in X_1\) such that \(p_1(x_1)=p_2(x_2)\). Thus \((x_1,x_2)\in X_1\times _Y X_2\) and \(q_2(x_1,x_2)=x_2\), proving that \(q_2\) is surjective. Moreover, for \(x_2\in X_2\), we have \(q_2^{-1}(x_2)=p_1^{-1}(p_2(x_2))\times \{x_2\}\). By assumption, \(p_1^{-1}(p_2(x_2))\) is connected as a fiber of \(p_1\), implying that \(p_1^{-1}(p_2(x_2))\times \{x_2\}\) is connected. Finally, applying Proposition 3 to \(q_2\), we obtain that the pullback space \(X_1\times _{Y}X_2\) is connected. The proof for \(q_1\) is analogous. \(\square \)
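As an informal illustration of the fiber product used in Proposition 4, the following Python sketch forms \(X_1\times _{Y}X_2\) for finite sets. It is a set-level analogue only: it ignores the topology and the PL structure, and the name `fiber_product` and the sample maps are hypothetical.

```python
def fiber_product(X1, X2, p1, p2):
    """Return the set-level pullback {(x1, x2) : p1(x1) == p2(x2)}."""
    return [(x1, x2) for x1 in X1 for x2 in X2 if p1[x1] == p2[x2]]

# Hypothetical finite example: p1 is surjective onto Y = {0, 1}.
X1 = ["a", "b", "c"]
X2 = ["u", "v"]
p1 = {"a": 0, "b": 0, "c": 1}
p2 = {"u": 0, "v": 1}

P = fiber_product(X1, X2, p1, p2)

# q2 is the projection onto the second factor. Its fiber over x2 is
# p1^{-1}(p2(x2)) x {x2}, which is the identity used in the proof.
fiber_over_u = [pair for pair in P if pair[1] == "u"]
q2_image = {x2 for (_, x2) in P}
```

In this toy case the projection onto the second factor is surjective because `p1` is, mirroring the first step of the proof.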

3 Stable and Universal Distances

Throughout this paper, we will use the term distance to describe an extended pseudo-metric \(d: X\times X \rightarrow [0,\infty ]\) on some collection X. As usual, extended means that the distance can attain the value \(+\infty \), and pseudo refers to the fact that two elements can have null distance without coinciding. Our main goal is the introduction of a distance between Reeb graphs that is stable and universal in the following sense.

Definition 5

We say that a distance \(d_S\) between Reeb graphs is stable if and only if given any two Reeb graphs \((R_f, {{\tilde{f}}})\) and \((R_g, {{\tilde{g}}})\), for any Reeb domain X with Reeb quotient maps \(p_f: X \rightarrow R_f\) and \(p_g: X \rightarrow R_g\) we have

$$\begin{aligned} d_S((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))\le \Vert {{\tilde{f}}}\circ p_f-{{\tilde{g}}}\circ p_g\Vert _\infty . \end{aligned} \tag{S}$$

Note that stability implies that isomorphic Reeb graphs have distance 0. Indeed, an isomorphism of Reeb graphs \(\gamma :R_f\rightarrow R_g\) yields \(d_S((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \le \Vert {{\tilde{f}}}\circ {{\,\mathrm{id}\,}}- {{\tilde{g}}}\circ \gamma \Vert _\infty =0\).

Moreover, we say that a stable distance \(d_U\) between Reeb graphs is universal if and only if for any other stable distance \(d_S\) between Reeb graphs, we have

$$\begin{aligned} d_S((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))\le d_U((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})), \end{aligned} \tag{U}$$

for all \((R_f, {{\tilde{f}}})\) and \((R_g, {{\tilde{g}}})\).

Remark 2

By connectedness of \(R_f\) and \(R_g\), there is at least one space X together with maps \(p_f,p_g\) as needed to define the stability property: \(X=R_f\times R_g\), with \(p_f,p_g\) the canonical projections. The resulting functions \(f = {{\tilde{f}}}\circ p_f , g = {{\tilde{g}}}\circ p_g : R_f\times R_g\rightarrow {\mathbb {R}}\) then satisfy \(\Vert f-g\Vert _\infty = \max (\sup {{\tilde{f}}} - \inf {{\tilde{g}}}, \sup {{\tilde{g}}} - \inf {{\tilde{f}}})\). In particular, by compactness a stable distance for Reeb graphs is always finite.
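The formula for \(\Vert f-g\Vert _\infty \) on the product can be checked numerically. The sketch below is illustrative only: the helper name `product_sup_norm` and the sample values are assumptions, and the functions are represented by finitely many sampled values rather than by PL functions.

```python
def product_sup_norm(f_range, g_range):
    """sup |f - g| on R_f x R_g, given the (min, max) of each function."""
    f_min, f_max = f_range
    g_min, g_max = g_range
    return max(f_max - g_min, g_max - f_min)

# Brute-force comparison on hypothetical sampled values of the two functions.
f_vals = [-1.0, -0.5, 0.0, 0.5, 1.0]
g_vals = [-1.0, 0.0, 1.0]
brute = max(abs(a - b) for a in f_vals for b in g_vals)
bound = product_sup_norm((min(f_vals), max(f_vals)),
                         (min(g_vals), max(g_vals)))
```

On the product, every value of one function is paired with every value of the other, which is why only the extreme values matter. The resulting number is an upper bound for any stable distance, not its value.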

The definition of stability yields the following universal distance.

Definition 6

For any two Reeb graphs \((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})\), let

$$\begin{aligned}{d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})):=\inf _{p_f :R_f\leftarrow X\rightarrow R_g :p_g} \Vert {{\tilde{f}}}\circ p_f-{{\tilde{g}}}\circ p_g\Vert _\infty ,\end{aligned}$$

where the infimum is taken over all possible Reeb domains X and Reeb quotient maps \(p_f :X \rightarrow R_f\) and \(p_g :X \rightarrow R_g\), as in the following diagram:

figure d

Remark 3

The universal distance can equivalently be expressed as a quotient pseudo-metric [5, 17], satisfying the following universal property.

Let \(C_0(X,{\mathbb {R}})\) be the metric space of continuous real-valued functions on X, endowed with the metric induced by the supremum norm. Moreover, let \({\mathcal {R}}_X\) be the pseudo-metric space of Reeb graphs of such functions, endowed with the universal distance, and let \(r :C_0(X,{\mathbb {R}}) \rightarrow {\mathcal {R}}_X\) be the map sending a function \(f :X \rightarrow {\mathbb {R}}\) to its Reeb graph. Then r is non-expansive, and any other non-expansive map \(s :C_0(X,{\mathbb {R}}) \rightarrow Z\) which satisfies \(s(f) = s(g)\) whenever f and g have isomorphic Reeb graphs factors uniquely through a non-expansive map \({\mathcal {R}}_X \rightarrow Z\) as in the following commutative diagram:

figure e

Notice that stability is encoded here in the use of non-expansive maps, while maximality is encoded in the existence of the vertical map \({\mathcal {R}}_X \rightarrow Z\). This map always exists uniquely at the level of the underlying sets, so the existence condition simply translates to the statement that this map is also non-expansive.

Remark 4

The connectedness assumption for Reeb domains can be dropped by adapting the definition of the universal distance as follows. If \(R_f\) and \(R_g\) have a different number of connected components, then \({d}_{U}(R_f,R_g):=\infty \). If both \(R_f\) and \(R_g\) have n connected components so that \(R_f=\coprod _{i\in [n]} F_i\) and \(R_g=\coprod _{i\in [n]} G_i\) with each \(F_i\) and \(G_i\) connected, then

$$\begin{aligned}{d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})):=\min _\gamma \inf _{p:F_i\leftarrow X\rightarrow G_{\gamma (i)}: q} \Vert {{\tilde{f}}}\circ p-{{\tilde{g}}}\circ q\Vert _\infty \end{aligned}$$

where \(\gamma \) varies among all permutations on n objects, \(i\in [n]\), and the infimum is taken over all possible Reeb domains X and Reeb quotient maps \(p :X \rightarrow F_i\) and \(q :X \rightarrow G_{\gamma (i)}\).

Proposition 5

The distance \({d}_{U}\) is the largest stable distance on Reeb graphs. Hence, \({d}_{U}\) is universal.

Proof

To see that \({d}_{U}\) is a distance, the only non-trivial part is showing the triangle inequality. To this end, given diagrams \(p_f:R_f\leftarrow X\rightarrow R_g: p_g\) and \(p'_g:R_g\leftarrow Y\rightarrow R_h: p_h\), we can form a pullback of the diagram \(p_g:X\rightarrow R_g\leftarrow Y:p'_g\) to obtain the diagram

figure f

where \(X\times _{R_g}Y\) is a Reeb domain and \(q_X,q_Y\) are Reeb quotient maps by Proposition 4. Defining \(f = {{\tilde{f}}}\circ p_f\circ q_X\), \(g = {{\tilde{g}}}\circ p_g\circ q_X={{\tilde{g}}}\circ p'_g\circ q_Y\), and \(h = {{\tilde{h}}}\circ p_h\circ q_Y\), we have

$$\begin{aligned} {d}_{U}((R_f, {{\tilde{f}}}),(R_h, {{\tilde{h}}})) \le \Vert f-h\Vert _\infty&\le \Vert f-g\Vert _\infty + \Vert g-h\Vert _\infty \\&= \Vert {{\tilde{f}}}\circ p_f - {{\tilde{g}}}\circ p_g\Vert _\infty + \Vert {{\tilde{g}}}\circ p'_g - {{\tilde{h}}}\circ p_h\Vert _\infty , \end{aligned}$$

where the last equality holds because \(q_X\) and \(q_Y\) are surjective. Hence

$$\begin{aligned}{d}_{U}((R_f, {{\tilde{f}}}),(R_h, {{\tilde{h}}}))\le {d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))+{d}_{U}((R_g, {{\tilde{g}}}),(R_h, {{\tilde{h}}})).\end{aligned}$$

The stability of \({d}_{U}\) is immediate from its definition. Moreover, for any stable distance \(d_S\) between Reeb graphs, combining the stability of \(d_S\) and the definition of \({d}_{U}\), we obtain \(d_S\le {d}_{U}\), implying that \({d}_{U}\) is universal. \(\square \)

Corollary 3

The universal distance \({d}_{U}\) is a metric on isomorphism classes of Reeb graphs.

Proof

According to Remark 2, by stability, \({d}_{U}\) is always finite. Moreover, we recall from [8] that there exists a stable distance \(d_{I}\), the interleaving distance, which is a metric on isomorphism classes of Reeb graphs; in particular, \(d_{I}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))=0\) if and only if \((R_f, {{\tilde{f}}}) \cong (R_g, {{\tilde{g}}})\). By stability of \(d_{I} \) and universality of \({d}_{U}\), we have \(d_{I}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \le {d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))\). Thus, \({d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))=0\) implies that \(d_{I}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))=0\) and hence \((R_f, {{\tilde{f}}}) \cong (R_g, {{\tilde{g}}})\). \(\square \)

Example 1

Consider the one-point Reeb graph \((*,c)\) endowed with the constant function with value \(c\in {\mathbb {R}}.\) Then, for any Reeb graph \((R_f, {{\tilde{f}}})\), we have \({d}_{U}((R_f, {{\tilde{f}}}),(*,c)) = \Vert {\tilde{f}}-c\Vert _\infty .\)
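The equality can be checked directly from Definition 6. The following is a sketch of one possible argument, not taken from the source. Since \(R_f\) is connected, the constant map \(R_f\rightarrow *\) is a Reeb quotient map, so taking \(X=R_f\) with the identity and the constant map gives the upper bound \(\Vert {\tilde{f}}-c\Vert _\infty \). For the lower bound, any Reeb quotient map \(p_f:X\rightarrow R_f\) is surjective, hence

$$\begin{aligned} \Vert {\tilde{f}}\circ p_f-c\Vert _\infty = \sup _{x\in X} |{\tilde{f}}(p_f(x))-c| = \sup _{y\in R_f} |{\tilde{f}}(y)-c| = \Vert {\tilde{f}}-c\Vert _\infty \end{aligned}$$

for every admissible X, so the infimum in Definition 6 is attained and equals \(\Vert {\tilde{f}}-c\Vert _\infty \).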

We now establish that the universal distance is intrinsic.

A reference for the concepts that follow is [4, Section 2]. Recall that the length \(L_{d_X}(\gamma )\) of a curve \(\gamma :[0,1]\rightarrow X\) in a metric space \((X,d_X)\) is defined to be the supremum of the sum \(\sum _{i=0}^{n-1}d_X(\gamma (t_i),\gamma (t_{i+1}))\) over all \(0=t_0<t_1<\cdots <t_n=1\) and all natural numbers n. Note that one always has \(d_X(\gamma (0),\gamma (1))\le L_{d_X}(\gamma ).\)

The metric space \((X,d_X)\) is said to be intrinsic if for every pair of points \(x,y\in X\) and any \(\epsilon >0\) there exists a curve \(\gamma :[0,1]\rightarrow X\) with \(\gamma (0)=x\), \(\gamma (1)=y\), and \(L_{d_X}(\gamma )\le d_X(x,y)+\epsilon .\)

Proposition 6

The universal distance is intrinsic.

Proof

Indeed, given any pair \((R_f, {{\tilde{f}}})\) and \((R_g, {{\tilde{g}}})\) of Reeb graphs and any \(\epsilon >0\), there always exists a continuous curve \(\gamma _\epsilon \) from [0, 1] to the collection of all Reeb graphs with the properties that \(\gamma _\epsilon (0)= (R_f,{{\tilde{f}}})\), \(\gamma _\epsilon (1)= (R_g,{{\tilde{g}}})\), and such that its length \(L_{d_U}(\gamma _\epsilon )\) is at most \(d_U((R_f, {{\tilde{f}}}), (R_g, {{\tilde{g}}}))+\epsilon \).

In order to construct the curve \(\gamma _\epsilon \), we proceed as follows. First note that by Definition 6 there exists a triple \(p_f :R_f\leftarrow X\rightarrow R_g :p_g\) such that \(\Vert {\tilde{f}}\circ p_f-{\tilde{g}}\circ p_g\Vert _\infty \le d_U((R_f, {{\tilde{f}}}), (R_g, {{\tilde{g}}}))+\epsilon \). For each \(t\in [0,1]\) let \(h_t:=(1-t)\cdot {\tilde{f}}\circ p_f + t\cdot {\tilde{g}}\circ p_g\) and \(\gamma _\epsilon (t):=X/{\sim _{h_t}}.\) That the length of this curve is bounded above as desired follows immediately from the observation that for every \(s,t\in [0,1]\), \(d_U((X/{\sim _{h_s}},{h}_s),(X/{\sim _{h_t}},{h}_t))\le |s-t|\,\Vert {\tilde{f}}\circ p_f - {\tilde{g}}\circ p_g\Vert _\infty .\) \(\square \)
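The final estimate can be unpacked as follows. This is a sketch, assuming that each canonical quotient \(X\rightarrow X/{\sim _{h_t}}\) is a Reeb quotient map, so that stability applies on the common domain X. Since \(h_s-h_t=(t-s)\,({\tilde{f}}\circ p_f-{\tilde{g}}\circ p_g)\), stability of \(d_U\) gives

$$\begin{aligned} d_U(\gamma _\epsilon (s),\gamma _\epsilon (t)) \le \Vert h_s-h_t\Vert _\infty = |s-t|\,\Vert {\tilde{f}}\circ p_f-{\tilde{g}}\circ p_g\Vert _\infty , \end{aligned}$$

and summing over any partition \(0=t_0<t_1<\cdots <t_n=1\) yields

$$\begin{aligned} \sum _{i=0}^{n-1} d_U(\gamma _\epsilon (t_i),\gamma _\epsilon (t_{i+1})) \le \sum _{i=0}^{n-1}(t_{i+1}-t_i)\,\Vert {\tilde{f}}\circ p_f-{\tilde{g}}\circ p_g\Vert _\infty = \Vert {\tilde{f}}\circ p_f-{\tilde{g}}\circ p_g\Vert _\infty . \end{aligned}$$

Taking the supremum over all partitions bounds the length of the curve by \(d_U((R_f, {{\tilde{f}}}), (R_g, {{\tilde{g}}}))+\epsilon \).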

Remark 5

Whether \(d_U\) is a geodesic metric is not yet known. This is related to the question whether a minimizing triple \(p_f :R_f\leftarrow X\rightarrow R_g :p_g\) always exists in Definition 6: indeed, via an argument similar to the one given in the proof of Proposition 6 any such triple would permit constructing a curve joining \(R_f\) and \(R_g\) with the property that its length is exactly equal to the universal distance between its endpoints. See Sect. 6.

Example 2

We now consider an example where we can explicitly determine the value of the distance \({d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))\) between two specific simple Reeb graphs \(R_{f} = {\mathbb {S}}^1=\{(x,y)\in {\mathbb {R}}^2: x^2+y^2=1\}\) with \({{\tilde{f}}}(x,y)=x\) and \(R_{g} = [-1,1]\) with \({{\tilde{g}}}(t)=t\).

Consider the cylinder \(C=\{(x,y,z)\in {\mathbb {R}}^3: x^2+y^2=1, \, |2z-x| \le 1\}\) together with functions \(f(x,y,z)=x\) and \(g(x,y,z)=z\) defined on C. Then \((R_f, {{\tilde{f}}})\) is a Reeb graph of f via the Reeb quotient map \((x,y,z) \mapsto (x,y)\), and \((R_g, {{\tilde{g}}})\) is a Reeb graph of g via the Reeb quotient map \((x,y,z) \mapsto z\), see Fig. 1.

Fig. 1 The space C used in Example 2 (cf. Proposition 7), together with the two Reeb graphs obtained from the coordinate functions

The example demonstrates the non-universality of certain distances proposed in the literature. We prove:

Proposition 7

\({d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))=1\).

Proof

First note that \(|f(c)-g(c)| \le 1\) for all \(c\in C\), implying that \({d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \le 1\). To show that \({d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \ge 1\), assume for a contradiction that there is a diagram

$$\begin{aligned}p_f:R_f\leftarrow Z\rightarrow R_g: p_g\end{aligned}$$

of Reeb quotient maps such that, letting \({{\hat{f}}} = {{\tilde{f}}} \circ p_f\) and \({{\hat{g}}} = {{\tilde{g}}} \circ p_g\), we have \(\Vert {{\hat{f}}} -{{\hat{g}}} \Vert _\infty = \delta < 1.\) We then observe the following:

  • \({{\hat{g}}}^{-1}(0) \subseteq {{\hat{f}}}^{-1}([-\delta ,+\delta ])\).

  • \({{\tilde{f}}}^{-1}([-\delta ,+\delta ])\) consists of two circular arcs homeomorphic by \({{\tilde{f}}}\) to \([-\delta ,+\delta ]\). Thus, by Proposition 3, \({{\hat{f}}}^{-1}([-\delta ,+\delta ]) \) consists of two connected components \(C_+\) and \(C_-\) as well.

  • For both components, we have \({{\hat{f}}}(C_\pm ) = [-\delta ,\delta ]\), and so \(\Vert {{\hat{f}}} -{{\hat{g}}} \Vert _\infty = \delta \) implies that \(0 \in {{\hat{g}}}(C_\pm )\). Thus \({{\hat{g}}}^{-1}(0)\cap C_-\ne \emptyset \) and \({{\hat{g}}}^{-1}(0)\cap C_+\ne \emptyset \).

But since \({{\hat{g}}}^{-1}(0)\subseteq C_-\sqcup C_+\), this would contradict the assumption that the fiber \({{\hat{g}}}^{-1}(0)\) is connected. \(\square \)
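The upper bound in the first step of the proof can be sanity-checked by sampling the cylinder C from Example 2. The sketch below is a numerical illustration only; the function name `sample_cylinder` and the grid sizes are assumptions, and sampling cannot replace the connectedness argument used for the lower bound.

```python
import math

def sample_cylinder(n_theta=360, n_z=21):
    """Yield points (x, y, z) with x^2 + y^2 = 1 and |2z - x| <= 1."""
    for i in range(n_theta):
        theta = 2.0 * math.pi * i / n_theta
        x, y = math.cos(theta), math.sin(theta)
        for j in range(n_z):
            s = j / (n_z - 1)            # s runs over [0, 1]
            z = (x - 1.0) / 2.0 + s      # z runs over [(x-1)/2, (x+1)/2]
            yield x, y, z

points = list(sample_cylinder())

# f(x, y, z) = x and g(x, y, z) = z, so |f - g| = |x - z|.
max_gap = max(abs(x - z) for x, _, z in points)
```

On this grid the largest sampled gap occurs at the points with \(x=\pm 1\) and \(z=0\), consistent with \(\Vert f-g\Vert _\infty \le 1\).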

The current example illustrates that the functional distortion distance introduced in [2] and the interleaving distance introduced in [8] are both stable but fail to be universal. We first recall the definition of the former. For any two Reeb graphs \((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})\), consider the metric on \(R_f\) given by

$$\begin{aligned}d_f(x,y)=\inf \{b-a \mid x,y \text { are in the same connected component of } {{\tilde{f}}}^{-1}([a,b])\}.\end{aligned}$$

Given maps \(\phi : R_f \rightarrow R_g\) and \(\psi : R_g \rightarrow R_f\), we write

$$\begin{aligned} G(\phi ,\psi ):=\big \{(p,\phi (p)):p\in R_f\big \}\cup \big \{(\psi (q),q):q\in R_g\big \} \end{aligned}$$

for the correspondences induced by the two maps, and

$$\begin{aligned} D(\phi ,\psi ) := \sup _{(p,q),(p',q')\in G(\phi ,\psi )}\frac{1}{2}\left| d_f(p,p')-d_g(q,q')\right| \end{aligned}$$

for the metric distortion induced by \((\phi ,\psi )\). The functional distortion distance is then defined as

$$\begin{aligned} d_{FD}(R_f, R_g) := \inf _ {\phi ,\psi }( \max \big \{ D(\phi ,\psi ), \Vert {{\tilde{f}}}-{{\tilde{g}}}\circ \phi \Vert _\infty , \Vert {{\tilde{f}}}\circ \psi -{{\tilde{g}}}\Vert _\infty \big \}). \end{aligned}$$

To see that neither the functional distortion distance nor the interleaving distance is universal, we establish:

Proposition 8

\(d_{I}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \le d_{FD}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \le \frac{1}{2}.\)

Proof

By [3, Lemma 8], the functional distortion distance is an upper bound on the interleaving distance on Reeb graphs [8], and so it is enough to prove that \(d_{FD}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \le \frac{1}{2}.\) To this end, consider the maps

$$\begin{aligned}\phi :R_f\rightarrow R_g, ~ (x,y) \mapsto x \quad \text {and} \quad \psi :R_g\rightarrow R_f, ~ t \mapsto \left( t, \sqrt{1-t^2}\right) .\end{aligned}$$

For every pair \(p, p' \in R_f\) one can verify that

$$\begin{aligned}|{{\tilde{f}}}(p)-{{\tilde{f}}}(p')|\le d_f(p, p') \le |{{\tilde{f}}}(p)-{{\tilde{f}}}(p')|+1,\end{aligned}$$

while for every pair \(q, q' \in R_g\), we have

$$\begin{aligned}d_g(q, q') = |{{\tilde{g}}}(q)-{{\tilde{g}}}(q')|.\end{aligned}$$

This implies that for any two corresponding pairs \((p,q),(p',q')\in G(\phi ,\psi )\), we have

$$\begin{aligned}|d_f(p, p')-d_g(q,q')|\le 1,\end{aligned}$$

and thus \(D(\phi ,\psi ) \le \frac{1}{2}\). Both maps preserve function values, so we have \(d_{FD}(R_f,R_g) \le \frac{1}{2}\). \(\square \)
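The distortion bound can be explored numerically for the circle and the interval of Example 2. The sketch below rests on two assumptions that are not stated in the source: the closed form for \(d_f\) in `d_f` is derived by hand from the definition above, and, since \(\phi \circ \psi \) is the identity on \(R_g\), the correspondence \(G(\phi ,\psi )\) is taken to be the graph of \(\phi \). Points of the circle are represented by angles.

```python
import math

def d_f(theta1, theta2):
    """Metric d_f on the unit circle with height function (x, y) -> x.

    Hand-derived from the definition: two points on a common closed arc
    (upper or lower) are joined inside f^{-1}([min x, max x]); otherwise
    the connecting component must pass through a pole x = -1 or x = 1.
    """
    x1, x2 = math.cos(theta1), math.cos(theta2)
    s1, s2 = math.sin(theta1), math.sin(theta2)
    lo, hi = min(x1, x2), max(x1, x2)
    same_arc = (s1 >= 0 and s2 >= 0) or (s1 <= 0 and s2 <= 0)
    if same_arc:
        return hi - lo
    return min(hi + 1.0, 1.0 - lo)

def distortion(n=64):
    """Sampled value of D(phi, psi) for phi(x, y) = x."""
    thetas = [2.0 * math.pi * k / n for k in range(n)]
    worst = 0.0
    for a in thetas:
        for b in thetas:
            d_g = abs(math.cos(a) - math.cos(b))   # d_g(phi(p), phi(p'))
            worst = max(worst, 0.5 * abs(d_f(a, b) - d_g))
    return worst

D = distortion()
```

On this grid the sampled distortion agrees with the bound \(\frac{1}{2}\) up to rounding; the extreme pair is \((0,1)\) and \((0,-1)\), which lie at distance 1 in \(R_f\) but map to the same point of \(R_g\).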

3.1 A General Lower Bound

We conclude this section by pointing out that the method used to compute a lower bound for \({d}_{U}\) in Example 2 gives rise to a general statement. Given a function \(h:Z\rightarrow {\mathbb {R}}\), let \(\beta _0^h\) be the function that takes each closed interval I of \({\mathbb {R}}\) (possibly degenerate) to the number of connected components C of \(h^{-1}(I)\) whose images under h span the whole interval I, i.e., \(h(C)=I\). If \(l:Z\rightarrow {\mathbb {R}}\) is another function on the same space Z such that, for some \(\delta >0\), \(\Vert h-l\Vert _\infty <\delta \), an argument similar to the one in Example 2 would yield that for every \(t\in {\mathbb {R}}\) one must have \(\beta _0^h([t,t])\ge \beta _0^l([t-\delta ,t+\delta ]).\) By swapping the roles of h and l, one then obtains a condition suggesting the following symmetric definition.

Given two functions \(f:X\rightarrow {\mathbb {R}}\) and \(g:Y\rightarrow {\mathbb {R}}\) define

$$\begin{aligned} \ell (\beta _0^f,\beta _0^g):=\inf \big \{ \varepsilon > 0 \mid \forall t\in {\mathbb {R}}: \beta _0^f([t,t])&\ge \beta _0^g([t-\varepsilon ,t+\varepsilon ]) \\ ~~\text {and}~~ \beta _0^g([t,t])&\ge \beta _0^f([t-\varepsilon ,t+\varepsilon ])\big \}. \end{aligned}$$

With this definition, we can now prove:

Theorem 1

For any two Reeb graphs \((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})\),

$$\begin{aligned}\ell (\beta _0^{{{\tilde{f}}}},\beta _0^{{{\tilde{g}}}})\le {d}_{U}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) . \end{aligned}$$

Proof

By definition of \({d}_{U}\) and by the fact that for any Reeb quotient map \(p_f:Z\rightarrow R_f\) one has \(\beta _0^{{{\tilde{f}}}} = \beta _0^{{{\tilde{f}}} \circ p_f}\), it is sufficient to show that, for all Z and for all \(f,g:Z \rightarrow {\mathbb {R}}\),

$$\begin{aligned} \ell (\beta _0^{f},\beta _0^{g})\le \Vert f-g\Vert _\infty . \end{aligned}$$

Taking \(\delta := \Vert f-g\Vert _\infty \), and given \(t\in {\mathbb {R}}\), let \(C_1,\ldots , C_k\) be those components of \(g^{-1}([t-\delta ,t+\delta ])\) which entirely span \([t-\delta ,t+\delta ]\) through g: \(g(C_j)=[t-\delta ,t+\delta ]\) for \(j=1,\ldots , k\). We claim that \(f^{-1}(t)\cap C_j\ne \emptyset \) for \(j=1,\ldots , k\), which would imply that

$$\begin{aligned} \beta _0^f([t,t])\ge \beta _0^g([t-\delta ,t+\delta ]). \end{aligned}$$

To see that indeed \(f^{-1}(t)\cap C_j\ne \emptyset \), consider any point \(x_j\in g^{-1}(t)\cap C_j\). Assume that \(f(x_j)>t\) and choose any \(y_j \in g^{-1}(t-\delta )\cap C_j\), which is non-empty by definition of \(C_j\). By definition of \(\delta \), \(f(y_j)\in [t-2\delta ,t]\), yielding \(f(y_j)\le t< f(x_j)\). Hence, for any path \(\gamma \) in \(C_j\) connecting \(x_j\) to \(y_j\), \(f\circ \gamma \) attains the value t, thus proving the claim when \(f(x_j)>t\). The proof is analogous if \(f(x_j)<t\). Similarly, exchanging the role of f and g, we can prove that \(\beta _0^g([t,t])\ge \beta _0^f([t-\delta ,t+\delta ])\). \(\square \)

Remark 6

The \(\beta _0^\bullet \) functions corresponding to the spaces from Example 2 are depicted in Fig. 2.

Fig. 2 The functions \(\beta _0^{{\tilde{f}}}\) and \(\beta _0^{{\tilde{g}}}\) corresponding to the spaces used in Example 2. An interval [a, b] is represented by a point in the plane. The values of \(\beta _0^{{\tilde{f}}}\) and \(\beta _0^{{\tilde{g}}}\) are zero outside of the grey triangular areas. Note that \(\beta _0^{{\tilde{f}}}\) equals 1 for points on the dotted part of the boundary, that is, \(\beta _0^{{\tilde{f}}}(I)=1\) for intervals I of the form \([-1,b]\) for \(-1\le b\le 1\) or of the form [a, 1] for \(-1\le a\le 1\). Note that indeed \(\ell (\beta _0^{{\tilde{f}}}, \beta _0^{{\tilde{g}}})=1\).
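The value \(\ell (\beta _0^{{\tilde{f}}}, \beta _0^{{\tilde{g}}})=1\) for the circle and the interval of Example 2 can be reproduced with a small grid search. This is a sketch under stated assumptions: the closed forms in `beta0_circle` and `beta0_interval` are written out by hand from the definition of \(\beta _0\), and the infimum is approximated on hypothetical grids of t and \(\varepsilon \).

```python
def beta0_circle(a, b):
    """beta_0 for the unit circle with height x, on the interval [a, b]."""
    if not (-1.0 <= a <= b <= 1.0):
        return 0    # no component can have image equal to [a, b]
    if a == -1.0 or b == 1.0:
        return 1    # a single arc (or the whole circle) through a pole
    return 2        # two arcs, one on each side of the circle

def beta0_interval(a, b):
    """beta_0 for the identity function on [-1, 1]."""
    return 1 if -1.0 <= a <= b <= 1.0 else 0

def admissible(eps, ts):
    """Check both inequalities in the definition of ell on a grid of t."""
    return all(
        beta0_circle(t, t) >= beta0_interval(t - eps, t + eps)
        and beta0_interval(t, t) >= beta0_circle(t - eps, t + eps)
        for t in ts
    )

ts = [k / 50 for k in range(-100, 101)]        # t in [-2, 2]
eps_grid = [k / 100 for k in range(1, 201)]    # eps in (0, 2]
ell_estimate = min(eps for eps in eps_grid if admissible(eps, ts))
```

The second inequality fails at \(t=0\) for every \(\varepsilon <1\), because the preimage of \([-\varepsilon ,\varepsilon ]\) in the circle has two spanning components while the fiber of the interval over 0 has one.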

Remark 7

Generalizing this strategy for obtaining lower bounds further leads to the theory of interlevel set persistent homology. As it turns out, for the mentioned examples, the bottleneck distance of interlevel set persistent homology coincides with the bounds obtained using Theorem 1.

4 Edit Distances

Given a pair of Reeb graphs \(R_f, R_g\), consider a diagram of the form

figure g

where, for some \(n\in {\mathbb {N}}\), the maps \({{\tilde{f}}}_1,\ldots ,{{\tilde{f}}}_n\) are Reeb functions with \({{\tilde{f}}}_1={{\tilde{f}}}\) and \({{\tilde{f}}}_n={{\tilde{g}}}\), and the maps \(X_i \rightarrow R_i\) and \(X_i \rightarrow R_{i+1}\), for \(i=1,\ldots ,n-1\), are Reeb quotient maps. We call such a diagram a Reeb zigzag diagram between \(R_f\) and \(R_g\). Observe that, by Remark 2, between any two Reeb graphs \(R_f\) and \(R_g\) there exists a Reeb zigzag diagram.

A Reeb zigzag diagram can be regarded as being composed of the following elementary diagrams:

figure h

This way, we may think of a Reeb zigzag diagram as a sequence of operations transforming \(R_f\) into \(R_g\). The elementary diagram on the left corresponds to an edit operation: the space \(X_{i-1}\), together with a function \(X_{i-1}\rightarrow {\mathbb {R}}\) with Reeb graph \(R_i\), is transformed to another space \(X_{i}\), with a function \(X_{i}\rightarrow {\mathbb {R}}\) having the same Reeb graph \(R_i\). The elementary diagram on the right corresponds to a relabel operation: the function on \(X_i\) with Reeb graph \(R_i\) is transformed to another function with Reeb graph \(R_{i+1}\). The idea of edit and relabel operations is inspired by previous work on edit distances for Reeb graphs [1, 9].

In order to define an edit distance using Reeb zigzag diagrams, we need to assign a cost to a given Reeb zigzag diagram between \(R_f\) and \(R_g\). To that end, we consider a cone from a space V by Reeb quotient maps \(V \rightarrow R_i\):

figure i

We call this diagram a Reeb cone. Any Reeb zigzag diagram admits such a cone. Indeed, the limit over the lower part of the diagram (1) can be constructed from iterated pullbacks, and since Reeb quotient maps are stable under pullbacks, the maps in the resulting limit diagram are Reeb quotient maps as well. In a Reeb cone, by commutativity, each of the Reeb functions \({{\tilde{f}}}_i\) induces a unique function \(f_i:V \rightarrow {\mathbb {R}}\). By Corollary 2, the Reeb graph of \(f_i\) is isomorphic to \(R_i\). This way, we pull back the individual functions \({{\tilde{f}}}_i\) to functions \(f_i\) on a common space with the same Reeb graphs, where they can be compared via the supremum norm.

Using these ideas, we can now introduce edit distances on Reeb graphs and proceed to prove that they are stable and universal.

Definition 7

Given a Reeb cone from a space V as in (2), we define the spread of the functions \((f_i)_{i=1,\ldots ,n}:V \rightarrow {\mathbb {R}}\), as the function

$$\begin{aligned}s^V:V\rightarrow {\mathbb {R}}, \; x \mapsto \max \limits _{i=1,\ldots ,n}f_i(x) - \min \limits _{j=1,\ldots ,n}f_j(x). \end{aligned}$$

Moreover, for a Reeb zigzag diagram Z between \(R_f\) and \(R_g\) as in (1), consider the limit of Z, denoted by L. The cost of the Reeb zigzag diagram Z is the supremum norm of the spread \(s^L\),

$$\begin{aligned} c_Z := \Vert s^L\Vert _\infty = \sup _{x\in L}\left( \max _i f_i(x) -\min _j f_j(x)\right) . \end{aligned}$$
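As an illustration of Definition 7, the following minimal sketch computes the spread and the resulting cost under a simplifying assumption that is not part of the definition: the functions \(f_i\) are taken to be simplexwise linear on one common triangulation and are given by their vertex values. In that case the spread is convex on every simplex (a maximum of linear functions minus a minimum of linear functions), so its supremum is attained at a vertex. The function names are illustrative.

```python
def spread(vertex_values):
    """Spread max_i f_i - min_j f_j evaluated at every vertex.

    `vertex_values[i][v]` is the value of f_i at vertex v of a common
    triangulation (an assumption of this sketch, see the text above).
    """
    return [max(column) - min(column) for column in zip(*vertex_values)]

def cost(vertex_values):
    """Supremum norm of the spread; attained at a vertex by convexity."""
    return max(spread(vertex_values))

functions = [
    [0, 1, 2, 3],   # f_1
    [0, 2, 1, 3],   # f_2
    [1, 2, 0, 3],   # f_3
]
print(spread(functions))   # [1, 1, 2, 0]
print(cost(functions))     # 2
```

Note that the distance in the supremum norm between any two of the functions, in particular between the first and the last, is bounded by this cost.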

Definition 8

We define the (PL) edit distance \(d_{e}\) between Reeb graphs \((R_f, {{\tilde{f}}})\) and \((R_g, {{\tilde{g}}})\) as the infimum cost of all Reeb zigzag diagrams Z between \(R_f\) and \(R_g\):

$$\begin{aligned}d_{e}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})):=\inf _{Z} c_{Z}.\end{aligned}$$

Moreover, we define the graph edit distance \(d_{eGraph}\) between Reeb graphs \((R_f, {{\tilde{f}}})\) and \((R_g, {{\tilde{g}}})\) analogously by restricting the infimum to Reeb zigzag diagrams Z where all the spaces \(X_i\) and \(R_i\) are finite topological graphs.

Thus, on Reeb graphs we have two edit distances, satisfying

$$\begin{aligned} d_{e}\le d_{eGraph}. \end{aligned}\tag{3}$$

The Reeb graph edit distance \(d_{eGraph}\) is a categorical reformulation of the definition given in [1]. The main goal is to prove that these distances satisfy the stability and universality properties (Propositions 9 and 10, Theorem 2, and Corollary 5). As a consequence, whenever applicable, they will actually coincide with the canonical universal distance \({d}_{U}\) defined in Definition 6:

Corollary 4

\({d}_{U}=d_{e}= d_{eGraph}.\)

The proofs of stability and universality for \(d_{e}\) are straightforward and are given next. The verification of stability and universality for \(d_{eGraph}\) follows in Sect. 5.

Proposition 9

\(d_{e}\) is a stable distance.

Proof

Let \((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})\) be Reeb graphs. For any space X such that there exist two Reeb quotient maps \(p_f:X\rightarrow R_f\) and \(p_g:X\rightarrow R_g\), the diagram

figure j

is a Reeb zigzag diagram with limit object X. The cost of this Reeb zigzag diagram is exactly \(\Vert f-g\Vert _\infty \), where \(f={{\tilde{f}}}\circ p_f\) and \(g={{\tilde{g}}}\circ p_g\). Hence, \(d_{e}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))\le \Vert f-g\Vert _\infty \). \(\square \)

Our proof of universality of the edit distance is similar to previous universality proofs for the bottleneck distance [7] and for the interleaving distance [14].

Proposition 10

\(d_{e}\) is a universal distance.

Proof

Let \((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})\) be Reeb graphs with \(d_{e}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}}))=:d\). Hence, for any \(\varepsilon >0\), there is a Reeb zigzag diagram Z between \(R_f=R_1\) and \(R_g=R_n\), with limit L and functions \(f_i\) as in Definition 7, having cost

$$\begin{aligned}c_{Z} = \Vert s^L\Vert _\infty = \Vert \max _i f_i-\min _j f_j\Vert _\infty \le d+\varepsilon .\end{aligned}$$

Let \(p_f: L \rightarrow R_f\) and \(p_g: L \rightarrow R_g\) be the induced Reeb quotient maps. If \(d_S\) is any other stable distance (cf. Definition 5) between \(R_f\) and \(R_g\), we have

$$\begin{aligned}d_S((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \le \Vert {{\tilde{f}}}\circ p_f-{{\tilde{g}}}\circ p_g\Vert _\infty \le \Vert \max _i f_i-\min _j f_j\Vert _\infty \le d+\varepsilon .\end{aligned}$$

Since the above holds for all \(\varepsilon > 0\), we have \(d_S((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})) \le d = d_{e}((R_f, {{\tilde{f}}}),(R_g, {{\tilde{g}}})).\) \(\square \)

5 Stability and Universality of the Reeb Graph Edit Distance

We now turn to the proof of stability and universality for the Reeb graph edit distance. Recall that, in the case of \(d_{eGraph}\), the admissible Reeb zigzag diagrams are PL zigzags of finite topological graphs. As mentioned above, the distance \(d_{eGraph}\) is applicable to Reeb graphs of compact triangulable spaces.

Lemma 2

Let X be a compact triangulable space, with PL functions \(f, g : X\rightarrow {\mathbb {R}}\), which are assumed to be simplexwise linear on a triangulation \(|K| \cong X\) of X by some simplicial complex K. Let \(\chi :{{\,\mathrm{im}\,}}f\rightarrow {{\,\mathrm{im}\,}}g\) be a weakly monotonic PL surjection such that \(\chi \circ f(v)=g(v)\) for every vertex v of K. Then there is a Reeb quotient map \(X/{\sim }_f \rightarrow X/{\sim }_g\).

Proof

Without loss of generality, assume \(X = |K|\). For simplicity, we write \(R_f=X/{\sim }_f\), \(R_g=X/{\sim }_g\), and \(R_h=X/{\sim }_h\), where \(h:=\chi \circ f\). Applying Proposition 1, f can be factorized as \(f={{\tilde{f}}}\circ q_f\), where \(q_f: X\rightarrow R_f\) is the canonical projection and \({{\tilde{f}}}: R_f \rightarrow {\mathbb {R}}\) is a Reeb function. Analogously, we obtain \(g={{\tilde{g}}}\circ q_g\) and \(h={{\tilde{h}}}\circ q_h\). We show that there is a Reeb quotient map \(k:X\rightarrow R_h\) making the following diagram commute:

figure k

The claim then follows by applying Lemma 1 to obtain Reeb quotient maps \(R_f \rightarrow R_h\) and \(R_h \rightarrow R_g\), which compose to the desired map \(R_f \rightarrow R_g\).

In order to prove the existence of such a Reeb quotient map k, we define the relation

$$\begin{aligned}k:=q_h\circ ((h^{-1}\circ g) \cap {{\,\mathrm{st}\,}}_K)\end{aligned}$$

on \(X\times R_h\). Here \({{\,\mathrm{st}\,}}_K\) denotes the open star on \(X = |K|\), defined as

$$\begin{aligned}{{\,\mathrm{st}\,}}_K(x) := \{ y \in X \mid \sigma \in K, y \in \sigma ^\circ , x \in \sigma \},\end{aligned}$$

where \(\sigma ^\circ \) is the interior of the simplex \(\sigma \). Note that the converse relation to the open star is the (closed) carrier, \({{\,\mathrm{st}\,}}_K^{-1}={{\,\mathrm{carr}\,}}_K\), where \({{\,\mathrm{carr}\,}}_K(A)\) is the underlying space of the smallest subcomplex of K containing \(A \subseteq X\). For later use, we note that

$$\begin{aligned} k^{-1} = ({{{\,\mathrm{carr}\,}}_K} \circ q_h^{-1})\cap (g^{-1}\circ {{\tilde{h}}}). \end{aligned}\tag{4}$$

We will also use the open carrier relation \({{\,\mathrm{carr}\,}}_K^\circ \), where \({{\,\mathrm{carr}\,}}_K^\circ (A)\) is the smallest union of open simplices of K covering A. Note that the open carrier relation is symmetric, i.e., \(({{\,\mathrm{carr}\,}}_K^\circ )^{-1}={{\,\mathrm{carr}\,}}_K^\circ \). Moreover, we have \({{\,\mathrm{carr}\,}}_K^\circ \subseteq {{\,\mathrm{st}\,}}_K\).

The remainder of the proof is split into several lemmas. Lemma 3 describes the behavior of the functions h and g on the simplices of K. Lemma 4 shows that k is a continuous surjection, and Lemma 5 shows that k has connected fibers. Since \({\tilde{h}}\circ k = g\), we conclude that k is PL. Thus, k is a Reeb quotient map, and the claim follows from Lemma 1. \(\square \)
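The open star and closed carrier relations used in the proof above have a simple combinatorial description: for a point x in the interior of a simplex \(\tau \) and a point y in the interior of a simplex \(\sigma \), we have \(y\in {{\,\mathrm{st}\,}}_K(x)\) exactly when \(\tau \) is a face of \(\sigma \), which is in turn equivalent to \(x\in {{\,\mathrm{carr}\,}}_K(\{y\})\). The following minimal sketch works on the level of simplices, represented as sets of vertex labels; the names `closure` and `open_star` are illustrative.

```python
from itertools import combinations

def closure(simplices):
    """All nonempty faces of the given simplices.

    This is the smallest subcomplex containing them, i.e. the carrier on
    the level of simplices.
    """
    faces = set()
    for simplex in simplices:
        vertices = sorted(simplex)
        for size in range(1, len(vertices) + 1):
            faces.update(frozenset(c) for c in combinations(vertices, size))
    return faces

def open_star(K, tau):
    """Simplices of K whose interiors form st_K(x) for x interior to tau."""
    return {sigma for sigma in K if tau <= sigma}

# A triangle abc with an additional edge cd attached at c.
K = closure([frozenset("abc"), frozenset("cd")])
star_c = open_star(K, frozenset("c"))

# The carrier is the converse relation of the open star.
converse_ok = all(
    (sigma in open_star(K, tau)) == (tau in closure([sigma]))
    for sigma in K
    for tau in K
)
print(len(K), len(star_c), converse_ok)
```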

Lemma 3

For every simplex \(\sigma \) in K, \(g(\sigma )=h(\sigma )\) and \(g(\sigma ^\circ ) \subseteq h(\sigma ^\circ )\).

Proof

We have \(h(\sigma )=g(\sigma )\) because h is equal to g on the vertices of K, and \(h=\chi \circ f\) with f linear on \(\sigma \) and \(\chi \) a weakly monotonic surjection.

To show that \(g(\sigma ^\circ ) \subseteq h(\sigma ^\circ )\), note that since g is linear on \(\sigma \), either g is constant on \(\sigma \) and so \(g(\sigma ^\circ ) = g(\sigma ) = h(\sigma ) = h(\sigma ^\circ )\), or \(g(\sigma ^\circ ) = (g(v),g(w))\) for some vertices v, w of \(\sigma \). In the latter case, since h and g coincide on the vertices, we have \(g(\sigma ^\circ ) = g(\sigma )^\circ = h(\sigma )^\circ \). Finally, \(h(\sigma ^\circ )\) is an interval whose closure is \(h(\sigma )\), and thus we have \(h(\sigma )^\circ \subseteq h(\sigma ^\circ )\) and the claim follows. \(\square \)

Lemma 4

k is a continuous surjection.

Proof

Recall that the relation \(k \subseteq X \times R_h\) is a partial set map if for any \(x \in X\) and \(y,y' \in k(x)\), we have \(y=y'\). Moreover, a partial set map k is a (total) set map if for every \(x\in X\), \(k(x)\ne \emptyset \). Finally, a set map k is a surjection if for every \(y \in R_h\), there is some \(x\in k^{-1}(y)\).

We first show that k is a partial set map, i.e., for any \(x \in X\) and \(y,y' \in k(x)\), we have \(y=y'\). To see this, let \(t=g(x)\) and note that \({{\tilde{h}}}(y)={{\tilde{h}}}(y')=t\). Let \(\sigma \in K\) be such that \(x \in \sigma ^\circ \). By Lemma 3, there is a point \(\zeta \in \sigma ^\circ \) with \(h(\zeta )=g(x)=t\); in particular,

$$\begin{aligned}\zeta \in h^{-1}(t) \cap {{\,\mathrm{st}\,}}_K(x).\end{aligned}$$

Furthermore, there are points \(\xi ,\xi '\in h^{-1}(t) \cap {{\,\mathrm{st}\,}}_K(x)\) with \(\xi \in q_h^{-1}(y)\) and \(\xi '\in q_h^{-1}(y')\). But since \(h^{-1}(t) \cap \tau \) is necessarily connected for every simplex \(\tau \), we know that \(\zeta \) lies in the same connected component of \(h^{-1}(t) \cap {{\,\mathrm{st}\,}}_K(x)\) as both \(\xi \) and \(\xi '\), and so we have \(y=q_h(\xi )=q_h(\xi ')=y'\) as claimed.

To show that k is a set map, we need to show that for every \(x\in X\), \(k(x)\ne \emptyset \). It suffices to show that for every \(x\in X\), \({{\,\mathrm{st}\,}}_K(x)\) contains a point \(x'\) with \(h(x')=g(x)\). This follows by considering the simplex \(\sigma \in K\) with \(x \in \sigma ^\circ \). Now by Lemma 3, there is a point \(x' \in \sigma ^\circ \subseteq {{\,\mathrm{st}\,}}_K(x)\) with \(h(x')=g(x)\) as claimed.

To show that k is surjective, we show, using Eq. (4), that for every \(y \in R_h\), there is some

$$\begin{aligned}x\in k^{-1}(y) = ({{{\,\mathrm{carr}\,}}_K} \circ q_h^{-1})(y)\cap (g^{-1}\circ {{\tilde{h}}})(y),\end{aligned}$$

or equivalently, there is some \(x\in {{{\,\mathrm{carr}\,}}_K} \circ q_h^{-1}(y)\) such that \(g(x)= {{\tilde{h}}}(y)\). If \(q_h^{-1}(y)\) contains some vertex v of K, choose \(x=v\). Otherwise, let \(\xi \in q_h^{-1}(y)\), and let \(\sigma \in K\) be such that \(\xi \in \sigma ^\circ \). Now by Lemma 3 there is a point \(x \in \sigma \subseteq {{{\,\mathrm{carr}\,}}_K} \circ q_h^{-1}(y)\) with \(g(x) = h(\xi ) = {{\tilde{h}}}(y)\).

Finally, to show that k is continuous, we show that for every closed subset L of \(R_h\), the preimage \(k^{-1}(L)\) is closed. Since \(k^{-1}=({{{\,\mathrm{carr}\,}}_K}\circ q_h^{-1})\cap (g^{-1}\circ {{\tilde{h}}})\), it is sufficient to show that both \({{{\,\mathrm{carr}\,}}_K}\circ q_h^{-1}(L)\) and \(g^{-1}\circ {{\tilde{h}}}(L)\) are closed in X. First note that \({{{\,\mathrm{carr}\,}}_K}\circ q_h^{-1}(L)\) is closed as a subcomplex of K. Furthermore, the image \({{\tilde{h}}}(L)\) is closed by the closed map lemma. By continuity of g it follows that \(g^{-1}\circ {{\tilde{h}}}(L)\) is closed in X. \(\square \)

Lemma 5

The fibers of k are connected.

Proof

Let \(y \in R_h\) be a point in the Reeb graph with value \(t = {{\tilde{h}}}(y)\), and let \(C = q_h^{-1}(y) \subseteq h^{-1}(t)\) be the corresponding component of the level set of h. Let \(U={{\,\mathrm{carr}\,}}_K (C)\), and let L be the corresponding subcomplex of K. Recall that f is linear on every simplex \(\sigma \in L\) and \(\chi \) is weakly monotonic and piecewise linear. Restricting the level set \(h^{-1}(t)\) of \(h = \chi \circ f\) to \(\sigma \) thus yields a connected subset \(\sigma \cap h^{-1}(t)\), and so we have \(\sigma \cap h^{-1}(t) = \sigma \cap C\). Taking the union over all such simplices and using \(C \subseteq U\) yields \(U \cap h^{-1}(t) = U \cap C = C\). Moreover, writing \(D=k^{-1}(y)\), by (4) we have \(D = U \cap g^{-1}(t)\). To prove that D is connected, it is sufficient to show that C and D have finite closed covers with isomorphic nerves; since C is connected, both nerves and hence also D are then connected too.

The cover of C is given by \(\{\sigma \cap C \mid \sigma \in L\}\), and similarly the cover of D is \(\{\sigma \cap D \mid \sigma \in L\}\). Observe that any two cover elements of C, say \(\sigma \cap C\) and \(\tau \cap C\), have a nonempty intersection \((\sigma \cap C)\cap (\tau \cap C) = (\sigma \cap \tau ) \cap C\) if and only if \(t \in h(\sigma \cap \tau )\). Similarly, \(\sigma \cap D\) and \(\tau \cap D\) have nonempty intersection if and only if \(t \in g(\sigma \cap \tau )\). But \(g(\sigma \cap \tau )=h(\sigma \cap \tau )\) by Lemma 3, and so the nerves of both covers are isomorphic as claimed. \(\square \)

We thus have shown the existence of the Reeb quotient map k. This completes the proof of Lemma 2. We will now apply Lemma 2 to construct Reeb graph edit zigzags from straight line homotopies.

Lemma 6

Let X be a compact triangulable space, with PL functions \(f, g : X\rightarrow {\mathbb {R}}\), simplexwise linear on a triangulation \(|K| \cong X\). Consider the straight line homotopy \({f}_\lambda =(1-\lambda ) f+\lambda g\), with \(0\le \lambda \le 1\). Then there exists a partition \(0=\lambda _1<\cdots <\lambda _n= 1\) such that for every \(1 \le i < n\) and \(\rho \in (\lambda _i,\lambda _{i+1})\), there exist weakly monotonic PL surjections \(\chi _i:{{\,\mathrm{im}\,}}{f}_{\rho } \rightarrow {{\,\mathrm{im}\,}}f_{\lambda _i}\) and \(\xi _{i+1}:{{\,\mathrm{im}\,}}{f}_{\rho } \rightarrow {{\,\mathrm{im}\,}}f_{\lambda _{i+1}}\) with

$$\begin{aligned}\chi _i\circ {f}_{\rho }(v)=f_{\lambda _i}(v) ~~ \text {and} ~~ \xi _{i+1}\circ { f}_{\rho }(v)=f_{\lambda _{i+1}}(v)\end{aligned}$$

for every vertex v of K.

Proof

Consider the set of values \(0< \lambda < 1\) such that there exist vertices \(v,w\in K\) with

$$\begin{aligned} { f}_{\lambda }(v)={f}_{\lambda }(w), ~~\text {but}~~ { f}_{\rho }(v)\ne { f}_{\rho }(w) ~~ \text {for every}~~ \rho \ne \lambda . \end{aligned}$$

This set is finite because the function \(\lambda \mapsto f_\lambda (v)-f_\lambda (w)\) is linear and K has a finite number of vertices. Let \(\{\lambda _i\}_{1\le i\le n}\) be this set together with 0 and 1, indexed in ascending order. By the linearity of \({ f}_\lambda \) with respect to the parameter \(\lambda \), we also see that the order induced by \(f_\rho \) on the vertices is the same for every \(\rho \in (\lambda _i,\lambda _{i+1})\). Indeed, if there exist two distinct vertices v, w of K such that \({f}_{\rho }(v)={f}_{\rho }(w)\) for some \(\rho \in (\lambda _i,\lambda _{i+1})\), then \({ f}_{\lambda }(v)={ f}_{\lambda }(w)\) for every \(\lambda \in [0,1]\). By continuity, the order is still weakly preserved along \([\lambda _i,\lambda _{i+1}]\).

Therefore, the function \(f_\rho (v) \mapsto { f}_{\lambda _i}(v)\) is well-defined and can be extended to a piecewise linear function \(\chi _i\) satisfying the claim. The function \(\xi _{i+1}\) can be defined similarly. \(\square \)
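The construction in the proof above is effective when the vertex values are known. The following minimal sketch, assuming integer or rational vertex values given as lists indexed by the vertices of K, computes the partition \(\lambda _1<\cdots <\lambda _n\) and the vertex-level maps \(f_\rho (v) \mapsto f_{\lambda _i}(v)\), checking that they are well defined and weakly increasing. The function names are illustrative, and the piecewise linear extension of the maps to all of \({{\,\mathrm{im}\,}}f_\rho \) is not carried out.

```python
from fractions import Fraction
from itertools import combinations

def critical_parameters(f, g):
    """The partition 0 = l_1 < ... < l_n = 1 from the proof of Lemma 6."""
    params = {Fraction(0), Fraction(1)}
    for v, w in combinations(range(len(f)), 2):
        d0 = f[v] - f[w]    # f_0(v) - f_0(w)
        d1 = g[v] - g[w]    # f_1(v) - f_1(w)
        if d0 != d1:        # the difference is not constant in lambda
            lam = Fraction(d0, d0 - d1)   # unique zero of (1 - l) d0 + l d1
            if 0 < lam < 1:
                params.add(lam)
    return sorted(params)

def homotopy(f, g, lam):
    """Vertex values of f_lam = (1 - lam) f + lam g."""
    return [(1 - lam) * a + lam * b for a, b in zip(f, g)]

def vertex_map(source, target):
    """The map source(v) -> target(v) on vertex values.

    Raises ValueError if it is not well defined or not weakly increasing.
    """
    table = {}
    for s, t in zip(source, target):
        if table.setdefault(s, t) != t:
            raise ValueError("map on vertex values is not well defined")
    images = [table[s] for s in sorted(table)]
    if images != sorted(images):
        raise ValueError("map on vertex values is not weakly increasing")
    return table

f = [0, 2, 1, 3]
g = [0, 1, 2, 3]
lams = critical_parameters(f, g)
for left, right in zip(lams, lams[1:]):
    rho = (left + right) / 2
    f_rho = homotopy(f, g, rho)
    chi = vertex_map(f_rho, homotopy(f, g, left))    # vertex part of chi_i
    xi = vertex_map(f_rho, homotopy(f, g, right))    # vertex part of xi_{i+1}
print(lams)
```

In this example the only change in the vertex order occurs at \(\lambda = 1/2\), where the second and third vertices take the same value.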

Theorem 2

\(d_{eGraph}\) is a stable distance.

Proof

Let \(X \cong |K|\) be a compact triangulable space, and let \(f, g : X\rightarrow {\mathbb {R}}\) be PL functions, simplexwise linear on K; without loss of generality, assume \(X = |K|\). Consider the straight line homotopy \({f}_\lambda =(1-\lambda ) f+\lambda g\), with \(0\le \lambda \le 1\), and take values \(\lambda _i\in [0,1]\), \(1 \le i \le n\), as in Lemma 6. Set \(\rho _i=(\lambda _i+\lambda _{i+1})/2\).

We first define a Reeb cone of the form (2), with \(V=X\), \(R_i=X/{\sim }_{{f}_{\lambda _i}}\), \(i=1,\ldots ,n\), and \(X_i=X/{\sim }_{{f}_{\rho _i}}\), \(i=1,\ldots ,n-1\). The canonical projections \(q_{\rho _i}: X \rightarrow X_i\) and \(q_{\lambda _i}: X \rightarrow R_i\) are Reeb quotient maps, and the Reeb functions \(R_i\rightarrow {\mathbb {R}}\) are induced by \({f}_{\lambda _i}\) as in Proposition 1. To complete the construction, we show that there are Reeb quotient maps \(p_i: X/{\sim }_{{ f}_{\rho _i}}\rightarrow X/{\sim }_{{f}_{\lambda _i}}\) and \(o_{i+1}: X/{\sim }_{{f}_{\rho _i}}\rightarrow X/{\sim }_{{f}_{\lambda _{i+1}}}\) that make the following diagram commute:

figure l

We prove the existence of \(p_i\), that of \(o_{i+1}\) being analogous. By Lemma 6, there is a weakly monotonic PL surjection \(\chi _i: {{\,\mathrm{im}\,}}f_{\rho _i} \rightarrow {{\,\mathrm{im}\,}}f_{\lambda _i}\) such that \(\chi _i\circ f_{\rho _i}(v)=f_{\lambda _i}(v)\) for every vertex v of K. Hence, Lemma 2 provides the desired Reeb quotient map \(p_i: X/{\sim }_{{ f}_{\rho _i}} \rightarrow X/{\sim }_{{f}_{\lambda _i}}\).

Now consider the limit L over the resulting Reeb zigzag diagram Z consisting of the maps \(p_i\) and \(o_i\), with maps \(r_i: L \rightarrow X_i\) and \(s_i: L \rightarrow R_i\). Since the maps from X in the above Reeb cone factor through a unique map \(m :X \rightarrow L\) by the universal property of the limit, we obtain the commutative diagram

figure m

We have \(f_{\lambda _i} =f^L_{\lambda _i}\circ m\) for \(1\le i \le n\), with \(f_{\lambda _i}^L={{\tilde{f}}}_{\lambda _i}\circ s_i .\) Hence, for every \(\ell \in L\),

$$\begin{aligned}s^L(\ell )= \max _j f_{\lambda _j}^L(\ell )-\min _k f_{\lambda _k}^L(\ell ) \le \sum _{i=1}^{n - 1}|f^L_{\lambda _{i+1}}(\ell )-f^L_{\lambda _{i}}(\ell )|.\end{aligned}$$

By the surjectivity of \(q_{\rho _{i}}\), for every i there is \(x_{\ell ,i}\in X\) such that \(q_{\rho _{i}}(x_{\ell ,i})=r_i(\ell )\). Thus,

$$\begin{aligned} |f_{\lambda _{i+1}}^L(\ell )-f_{\lambda _i}^L(\ell )|&=|f_{\lambda _{i+1}}(x_{\ell ,i})-f_{\lambda _i}(x_{\ell ,i})| \le (\lambda _{i+1}-\lambda _i)\cdot \Vert f-g\Vert _\infty . \end{aligned}$$

Combining the above for every \(\ell \in L\) we have

$$\begin{aligned}s^L(\ell )\le \sum _{i=1}^{n-1}(\lambda _{i+1}-\lambda _{i})\cdot \Vert f-g\Vert _\infty = \Vert f-g\Vert _\infty .\end{aligned}$$

We conclude that

$$\begin{aligned}d_{eGraph}(R_f,R_g) \le c_{Z} = \Vert s^L\Vert _\infty \le \Vert f-g\Vert _\infty ,\end{aligned}$$

showing that \(d_{eGraph}\) is a stable distance. \(\square \)
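The per-step estimate used in the proof above follows from a short computation with the straight line homotopy, spelled out here for a point \(x\in X\):

$$\begin{aligned} f_{\lambda _{i+1}}(x)-f_{\lambda _i}(x)&=\big ((1-\lambda _{i+1})f(x)+\lambda _{i+1}g(x)\big )-\big ((1-\lambda _i)f(x)+\lambda _i g(x)\big )\\ &=(\lambda _{i+1}-\lambda _i)\,(g(x)-f(x)). \end{aligned}$$

Since \(\lambda _{i+1}>\lambda _i\), taking absolute values gives \(|f_{\lambda _{i+1}}(x)-f_{\lambda _i}(x)|\le (\lambda _{i+1}-\lambda _i)\cdot \Vert f-g\Vert _\infty \). The final bound then uses that the sum telescopes:

$$\begin{aligned} \sum _{i=1}^{n-1}(\lambda _{i+1}-\lambda _i)=\lambda _n-\lambda _1=1. \end{aligned}$$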

Figure 3 shows the two-step edit zigzag induced on Reeb graphs by a straight line homotopy between the functions considered in Example 2. This should be compared with the one-step zigzag shown in Fig. 1.

Fig. 3

The graph edit zigzag induced by the straight line homotopy between the functions from Example 2.

Corollary 5

\(d_{eGraph}= {d}_{U}\) is the universal distance.

Proof

The claim is a direct consequence of inequality (3) together with Theorem 2 and Propositions 9 and 10. \(\square \)

6 Discussion

Motivated by questions arising in topological data analysis, in this paper we have introduced three constructions of a distance between Reeb graphs: \({d}_{U}\), \(d_{e}\), \(d_{eGraph}\). These constructions have considerably different combinatorial flavors. \(d_{eGraph}\) is completely combinatorial, as it is based on graph zigzagging, with zigzags interpretable as honest graph edit operations as in [1]. In the definition of \(d_{e}\), edit zigzagging operations are relaxed to also comprise non-graph spaces. Finally, \({d}_{U}\) does not even allow for zigzagging. In spite of these differences, we have shown that these three distances coincide. The main implication of this result is that computing the universal distance for Reeb graphs boils down to a combinatorial problem (though a hard one, as discussed below).

We believe that the following questions are of interest and could motivate further research:

  • Do minimizers in the definition of the universal distance always exist? Besides guaranteeing that the universal distance would then be geodesic, this would also have algorithmic implications (see below).

  • Is the interleaving distance [8] bi-Lipschitz equivalent to the universal distance? If the answer to this question is affirmative, then by results of [3], one would obtain the bi-Lipschitz equivalence between the universal distance and the functional distortion distance from [2].

  • What is the computational complexity of the universal distance? This problem is at least graph-isomorphism hard, which can be seen as follows. First note that bipartite graphs form a graph-isomorphism complete class of graphs.

    Any bipartite simple graph can be interpreted as a Reeb graph with function values in \(\{0,1\}\) corresponding to the partition of the vertex set. Using Corollary 3, these Reeb graphs are at universal distance 0 if and only if the bipartite graphs are isomorphic, so both of these decision problems are graph-isomorphism complete. A similar observation has been made for the interleaving distance [8].

    These considerations motivate the following three ancillary questions:

    (a) Is the universal distance a minimum over a certain finite set, possibly of cardinality polynomially dependent on the size of the input Reeb graphs?

    (b) In a similar vein to (a) above: What are suitable parameters for measuring the structural complexity of Reeb graphs so that the problem of computing the universal distance between any two Reeb graphs becomes fixed parameter tractable (FPT) [12]? Some success in a related direction has been reported by Touli and Wang [21] who identify a certain growth condition for merge trees under which they are able to provide fixed parameter algorithms for approximating the interleaving distance.

    (c) In more generality than (a) above: are the possible values of the universal distance always contained in some canonical set of values, constructed from the sets of vertex function values of the two Reeb graphs? Related results in the context of manifolds endowed with Morse functions appear in the work of Donatini and Frosini [10]. This work carries over to the setting of Reeb graphs by the results of [9].

  • How do the theoretical properties of the universal distance extend to more general settings?

    • The definition of the universal distance also makes sense in a more general topological setting, where we consider locally compact Hausdorff spaces as Reeb domains and proper quotient maps with connected fibers as Reeb quotient maps. The distance one obtains in this larger category can still be applied to finite Reeb graphs, in which case it will be less than or equal to the PL universal distance that we described in this paper. However, we conjecture that in this case the two distances actually coincide.

    • Reeb spaces: Generalizing our definitions and results up to Sect. 5 to Reeb spaces of piecewise linear maps \(X \rightarrow {\mathbb {R}}^n\) is straightforward. Do our results from Sect. 5 generalize as well?
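Returning to the complexity question above, the encoding of a bipartite graph as a Reeb graph with function values in \(\{0,1\}\) can be sketched as follows. This is a minimal illustration, assuming the graph is given together with its bipartition; the function name and the input format are ours, and the function on each edge is understood to interpolate linearly between the values at its endpoints.

```python
def reeb_graph_from_bipartite(lower, upper, edges):
    """Encode a bipartite graph as a graph with vertex values in {0, 1}.

    `lower` and `upper` are the two sides of the bipartition. Every edge is
    returned oriented from its value-0 endpoint to its value-1 endpoint, so
    that the function can be taken to increase linearly along it.
    """
    values = {v: 0 for v in lower}
    values.update({v: 1 for v in upper})
    oriented = []
    for u, v in edges:
        if values[u] == values[v]:
            raise ValueError(f"edge {(u, v)} does not join the two sides")
        oriented.append((u, v) if values[u] == 0 else (v, u))
    return values, oriented

values, oriented = reeb_graph_from_bipartite(
    "ab", "xy", [("a", "x"), ("y", "a"), ("b", "y")]
)
print(values)
print(oriented)
```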