0 Introduction

A quiver representation is an arrangement of vector spaces and linear maps tethered to the vertices and edges of a directed graph [14, 43]. The quiver illustrated below will be our running example throughout the paper.

figure a

Despite being relatively concrete mathematical objects, quiver representations provide a uniform framework for a host of fundamental abstract problems in linear algebra [7]. Isomorphisms of quiver representations can be used to characterise, for example, the Jordan normal form of matrices and the Kronecker normal form of matrix pencils. They also play an important role in various other fields, including the study of associative algebras [3], Gromov–Witten invariants [22], representations of Kac–Moody algebras [39], moduli stacks [48], Morse theory [35], persistent homology [41], and perverse sheaves [19], among others.

In most of these contexts, the crucial property of a given quiver representation is its decomposability into a direct sum of smaller representations. Gabriel’s celebrated result [18] establishes that a quiver admits finitely many (isomorphism classes of) indecomposable representations if and only if its underlying undirected graph is a union of simply laced Dynkin diagrams (i.e. type A, D or E). Thus, most quivers have rather complicated sets of indecomposable representations and are said to be of wild type. It is a direct consequence of this trifecta—concreteness, ubiquity and generic wildness—that ideas from disparate branches of mathematics have conversely been deployed to study representations of quivers. These include algebraic geometry [34], combinatorics [13], differential geometry [23, 25], geometric representation theory [20], invariant theory [32, 33], and multilinear algebra [27].

Quiver representations have recently emerged in far more applied and computational contexts than the classical ones listed above. We are aware of three such appearances:

  (1) Cellular Sheaves: A vector space-valued sheaf defined over a cell complex [10, 11] constitutes a representation of the underlying Hasse diagram; here the vertices are cells and edges arise from face inclusions. The stalks of the sheaf form vector spaces over the vertices, while restriction maps are associated to edges.

  (2) Conley theory: Morse decompositions in computational dynamics [26, Def 9.19] are representations of Conley–Morse quivers associated to discrete dynamical systems (vertices are recurrent sets and edges represent gradient flow). The linear maps of such representations arise from connection matrices [17]; these assemble into a chain complex that allows one to recover the homology of the phase space.

  (3) Algebraic statistics: Matrix normal models can be studied via quiver representations [1, 12]. The sample data gives a representation of a Kronecker quiver. The stability of the representation [33] can then be used to characterise the existence and uniqueness of a maximum likelihood estimate in the model.

We expect (and hope) that this influx of quiver theory into more applied and computational domains will continue.

0.1 This Paper

We consider a representation \(\mathbf {A}_\bullet \) of a quiver Q. It assigns a vector space \(\mathbf {A}_v\) to each vertex v of Q and a linear map \(\mathbf {A}_e\) to each edge e in Q. We construct a vector space \(\Gamma (Q;\mathbf {A}_\bullet )\) called the space of sections of the quiver representation. An element of \(\Gamma (Q;\mathbf {A}_\bullet )\) selects one vector \(\gamma _v\) from the vector space \(\mathbf {A}_v\) assigned to each vertex v so that for every edge \(e:u \rightarrow v\) the linear map \(\mathbf {A}_e\) sends \(\gamma _u\) to \(\gamma _v\). As such, \(\Gamma (Q;\mathbf {A}_\bullet )\) is a subspace of the total space \(\text {Tot}(\mathbf {A}_\bullet ) := \prod _v \mathbf {A}_v\). The assignment

$$\begin{aligned} \mathbf {A}_\bullet \mapsto \Gamma (Q;\mathbf {A}_\bullet ) \end{aligned}$$

can directly be seen to be a functor from the category of Q-representations to the category of vector spaces. We do not expect this functor to immediately answer any deep questions regarding (in)decomposability of quiver representations. Rather, we hope that the space of sections will become a useful and practical tool for those who encounter quiver representations in applied and computational contexts.

Our first contribution is an algorithm for computing the space of sections for any finite-dimensional representation of a finite quiver. This is of some relevance even to those who have no warm feelings for quiver representations, since it is a purely categorical procedure for computing the limit (i.e. the universal cone) of a diagram in the category of vector spaces. With minor modifications, it can be made to work for diagrams valued in any abelian category that has computable products and equalisers. Two types of restriction are imposed on the space of sections: the first arises from directed cycles, where we are forced to restrict to the fixed-point space of an endomorphism; the second arises at vertices with multiple incoming edges, where we are forced to restrict to an equaliser. Neither difficulty arises when the quiver is a directed rooted tree.

Our algorithm consists of two steps—the first step removes all directed cycles and updates the representation \(\mathbf {A}_\bullet \) accordingly; and the second step replaces this acyclic quiver with a directed rooted tree, again updating the representation. The result is a new representation \(\mathbf {A}^+_\bullet \) of a rooted directed tree \(T^+\), which has all the same vertices as Q (plus an additional root vertex) and satisfies \(\mathbf {A}^+_v \subset \mathbf {A}_v\) at each vertex v.

Here is our first main result.

Theorem (A)

The space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\) is the image of the map

$$\begin{aligned} F:\mathbf {A}^+_\rho \longrightarrow \mathrm{Tot}(\mathbf {A}_\bullet ), \end{aligned}$$

obtained by composing the linear maps assigned by the quiver representation \(\mathbf {A}^+_\bullet \) along the unique path in the rooted directed tree \(T^+\) from the root \(\rho \) to each other vertex.

Although the constructions of \(T^+\) and \(\mathbf {A}_\bullet ^+\) are explicit and readily implementable on a computer, they require making several intermediate choices. Each such choice is liable to produce a different F, but its image is always \(\Gamma (Q;\mathbf {A}_\bullet )\) regardless of these choices.

Our second contribution takes place in the realm of quiver representations valued in real vector spaces; in this case, a map F as described in Theorem (A) can be represented by an \(n \times d\) full-rank real matrix, where n and d are the dimensions of \(\text {Tot}(\mathbf {A}_\bullet )\) and \(\Gamma (Q;\mathbf {A}_\bullet )\), respectively. Using this matrix, we define the principal components of any (generic, mean-centred) finite set D of vectors in \(\mathbb {R}^n \simeq \text {Tot}(\mathbf {A}_\bullet )\) with respect to the quiver representation \(\mathbf {A}_\bullet \). As with ordinary principal components, the starting point is the \(n \times n\) sample covariance matrix S of the vectors in D. Next, we consider for each \(r \le d\) the variational problem of maximising the trace \(\mathrm{tr}(X^{\mathsf {T}}SX)\) over the set of all \(n \times r\) matrices X that satisfy \(X^{\mathsf {T}}X = \text {id}\), and whose columns are constrained to lie in the image of F, i.e. in \(\Gamma (Q;\mathbf {A}_\bullet )\). There is a generically unique solution, obtained by iteratively incrementing r from 1 to d, and the span of its r-th column is the r-th principal component of D along \(\mathbf {A}_\bullet \), denoted \(\mathbf{PC} _r(D;\mathbf {A}_\bullet )\). Unlike ordinary principal components, the \(\mathbf{PC} _r(D;\mathbf {A}_\bullet )\) are not spanned by eigenvectors of S in general.

The second main result of this paper is that the quiver principal components \(\mathbf{PC} _r(D;\mathbf {A}_\bullet )\) do in fact admit a spectral interpretation.

Theorem (B)

For each \(1 \le r \le d\), the r-th principal component \(\mathbf{PC}_r(D;\mathbf {A}_\bullet )\) is spanned by \(Fu_r\), where \(u_r\) is the eigenvector of the matrix pencil \(F^{\mathsf {T}}S F - \lambda (F^{\mathsf {T}}F)\) corresponding to its r-th largest eigenvalue.

We see from the matrix pencil in Theorem (B) that the principal components along a quiver representation intertwine the properties of D (via the sample covariance matrix S) with those of the quiver Q (via the map F to the space of sections). These principal components find directions of maximum variation among vectors in \(D \subset \mathbb {R}^n\) that respect certain linear dependencies. The coordinates in \(\mathbb {R}^n\) can be thought of as partitioned into blocks (one per vertex of the quiver); these blocks are related by the linear maps of the quiver representation. In this way, the principal components along the quiver representation interpolate between concatenating ordinary principal components from individual blocks, and ordinary principal components in the whole space \(\mathbb {R}^n\). We will mostly assume that the linear maps are fixed in advance, but we will briefly discuss approaches to learning them from the set D.
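
To make the pencil in Theorem (B) concrete, here is a minimal computational sketch over the reals, assuming the matrix F has already been assembled (say, via the algorithm behind Theorem (A)). The function name and the numpy/scipy calls are our own illustrative choices, not part of the paper's pipeline.

```python
import numpy as np
from scipy.linalg import eigh

def quiver_pca(F, data):
    """Principal components along a quiver representation, per Theorem (B).
    F: n x d full-rank matrix whose image is the space of sections;
    data: mean-centred samples, one row per sample, columns in R^n."""
    S = np.cov(data, rowvar=False)          # n x n sample covariance matrix
    # Generalised symmetric eigenproblem for the pencil F^T S F - lambda F^T F.
    evals, U = eigh(F.T @ S @ F, F.T @ F)   # eigenvalues returned in ascending order
    order = np.argsort(evals)[::-1]         # largest eigenvalue first
    return evals[order], (F @ U)[:, order]  # r-th column spans PC_r(D; A)
```

Since F has full column rank, \(F^{\mathsf {T}}F\) is positive definite, so eigh can solve the pencil directly; the returned directions \(Fu_r\) are orthonormal in \(\mathbb {R}^n\) because the eigenvectors \(u_r\) come out \(F^{\mathsf {T}}F\)-orthonormal.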

0.2 Related Work

The first half of this work is inspired by the study of cellular sheaves [10], which functorially assign vector spaces to cells and linear maps to incidence relations in a finite cell complex. The space of sections of a cellular sheaf \(\mathscr {S}\) defined over an undirected graph G is isomorphic to the zeroth sheaf cohomology group \(\mathbf{H} ^0(G;\mathscr {S})\), which is readily computable [11]. We can turn any representation \(\mathbf {A}_\bullet \) of a quiver Q into a cellular sheaf over the underlying undirected graph by replacing each edge-indexed linear map

$$\begin{aligned} \mathbf {A}_u {\mathop {\longrightarrow }\limits ^{\mathbf {A}_e}} \mathbf {A}_v \end{aligned}$$

by a corresponding zigzag of the form

$$\begin{aligned} \mathbf {A}_u {\mathop {\longrightarrow }\limits ^{\mathbf {A}_e}} \mathbf {A}_v {\mathop {\longleftarrow }\limits ^{\text {id}}} \mathbf {A}_v. \end{aligned}$$

Thus, each edge inherits the vector space assigned to its target vertex. Computing zeroth cohomology of this sheaf furnishes an alternative to Theorem (A) for calculating \(\Gamma (Q;\mathbf {A}_\bullet )\). However, this cohomological alternative suffers from two significant drawbacks—first, the insertion of these zigzags is quite inconvenient for our purposes of testing compatibility of sections across directed paths in the original quiver. And second, the duplication of vector spaces over the edges leads to unnecessarily large matrices and hence incurs a larger computational cost.

A central focus of the second half of this paper is the study of linearly constrained principal components, which dates back at least to [42, Section 11]. It is referred to as constrained PCA in [15, Section 7.1] and [45, 46]. Its statistical implications are discussed in [29] and [45, Section 5.4]. For an example of constrained PCA occurring in a biological context, see [28]. It is important to note that the principal components \(\mathbf{PC} _r(D;\mathbf {A}_\bullet )\) introduced in this paper do not constitute a low-rank approximation of the representation \(\mathbf {A}_\bullet \). Such approximation of related multi-linear objects appears in [8], where the authors find the singular value decomposition of a finite chain complex, and in the study of orthogonal decomposition of tensor networks [24]. A study of star quivers for parameter estimation in integrated PCA [47] appears in [16].

We comment on connections to linear neural networks in Sect. 8. Quiver representations appear in the context of neural network architectures in [2, 30], though these are not usual quiver representations due to the presence of nonlinear activation functions.

0.3 Organisation

The remainder of this paper is divided into eight short sections. In Sect. 1, we define quiver representations and their sections, and establish some elementary properties thereof.

Section 2 is devoted to the task of using the ear decomposition to compute the sections of strongly-connected quivers. Section 3 uses the results of Sect. 2 to construct, from any given quiver representation, a sub-representation of an acyclic subquiver that has the same space of sections. In Sect. 4 we describe how to further modify this acyclic subquiver into a rooted directed tree and update the overlaid representation to preserve the space of sections. These intermediate results are assembled in Sect. 5 to provide a proof of Theorem (A); we also give lower bounds on the dimension of the space of sections and provide pseudocode for our algorithm along with a computational complexity analysis.

Principal components along quiver representations are defined in Sect. 6 via three optimisation problems; we show here that all three give the same answer. In Sect. 7 we use a generalisation of the singular value decomposition to establish Theorem (B). And finally, Sect. 8 discusses the problem of learning the linear maps of a quiver representation from finite samples of vectors living in the total space.

1 Quiver Representations and Sections

A quiver Q consists of a finite set V whose elements are called vertices, a finite set E whose elements are called edges, and two maps \(s,t:E \rightarrow V\) called the source and target map, respectively. It is customary to illustrate quivers by drawing points for vertices and arrows (from source to target) for edges. A path in Q is an ordered finite sequence of distinct edges \(p = (e_1,e_2,\ldots ,e_k)\) with pairwise distinct sources (i.e. \(s(e_i) \ne s(e_j)\) when \(i \ne j\)) so that \(s(e_{i+1})=t(e_i)\) holds for every \(1 \le i < k\).

The source and target maps extend from edges to paths via \(s(p) = s(e_1)\) and \(t(p) = t(e_k)\). We call p a cycle if \(s(p) = t(p)\), and call Q acyclic if it does not admit any cycles.

A representation of Q comprises an assignment \(\mathbf {A}_\bullet \) of a finite-dimensional vector space \(\mathbf {A}_v\) to every vertex v in V and a linear map \(\mathbf {A}_e:\mathbf {A}_{s(e)} \rightarrow \mathbf {A}_{t(e)}\) to every edge e in E. We will remain agnostic to the choice of underlying field until Sect. 6. Using the data of \(\mathbf {A}_\bullet \), one can associate to each path \(p = (e_1,\ldots ,e_k)\) the map \(\mathbf {A}_p:\mathbf {A}_{s(p)} \rightarrow \mathbf {A}_{t(p)}\) via

$$\begin{aligned} \mathbf {A}_p := \mathbf {A}_{e_k} \circ \mathbf {A}_{e_{k-1}} \circ \cdots \circ \mathbf {A}_{e_2} \circ \mathbf {A}_{e_1}. \end{aligned}$$
(1)
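
In coordinates, (1) is just an iterated matrix product; the following one-line sketch (the function name is ours) composes edge matrices listed in path order.

```python
from functools import reduce
import numpy as np

def path_map(edge_matrices):
    """Compose [A_e1, ..., A_ek] into the single matrix A_p = A_ek @ ... @ A_e1."""
    return reduce(lambda acc, A: A @ acc, edge_matrices)
```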

The total space of \(\mathbf {A}_\bullet \) is the direct product

$$\begin{aligned} \text {Tot}(\mathbf {A}_\bullet ) := \prod _{v \in V} \mathbf {A}_v. \end{aligned}$$

The following terminology has been borrowed from analogous notions that arise in the study of sheaves and vector bundles.

Definition 1.1

Let \(\mathbf {A}_\bullet \) be a representation of a quiver \(Q = (s,t:E \rightarrow V)\). A section of \(\mathbf {A}_\bullet \) is an element \(\gamma = {\left\{ {\gamma _v \in \mathbf {A}_v \mid v \in V}\right\} }\) in \(\text {Tot}(\mathbf {A}_\bullet )\) satisfying the compatibility requirement \(\gamma _{t(e)} = \mathbf {A}_e (\gamma _{s(e)})\) across each edge e in E.

The set of all sections of \(\mathbf {A}_\bullet \) is a vector subspace of \(\text {Tot}(\mathbf {A}_\bullet )\), which we denote by \(\Gamma (Q;\mathbf {A}_\bullet )\). The explicit computation of the space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\), for any quiver Q and representation \(\mathbf {A}_\bullet \), is one of the central objectives of this work.

Remark 1.2

The product of general linear groups \(\text {G} = \prod _v \text {GL}(\mathbf {A}_v)\) acts on \(\mathbf {A}_\bullet \) by change of basis: given any \(g = {\left\{ {g_v \in \text {GL}(\mathbf {A}_v) \mid v \in V}\right\} }\), the new representation \(g\mathbf {A}_\bullet \) assigns

$$\begin{aligned} (g\mathbf {A})_v = \mathbf {A}_v \qquad \text {and} \qquad (g\mathbf {A})_e = g_{t(e)} \circ \mathbf {A}_e \circ g_{s(e)}^{-1}. \end{aligned}$$

This action descends to the space of sections via \(\gamma \mapsto g\gamma \), where \((g\gamma )_v = g_v \gamma _v\), and so we have an isomorphism \( \Gamma (Q;\mathbf {A}_\bullet ) \simeq \Gamma (Q;g\mathbf {A}_\bullet )\) for every \(g \in \text {G}\). In fact, a purely formal argument shows that the assignment \(\mathbf {A}_\bullet \mapsto \Gamma (Q;\mathbf {A}_\bullet )\) is a functor from the category of representations of a fixed Q to the category of vector spaces. Here a morphism \(\mathscr {F}_\bullet :\mathbf {A}_\bullet \rightarrow \mathbf {A}'_\bullet \) of Q-representations is a collection of V-indexed linear maps \(\mathscr {F}_v:\mathbf {A}_v \rightarrow \mathbf {A}'_v\) which commute with the edge-maps, i.e. for each \(e \in E\) we have

$$\begin{aligned} \mathbf {A}'_e \circ \mathscr {F}_{s(e)} = \mathscr {F}_{t(e)} \circ \mathbf {A}_e. \end{aligned}$$
(2)

Each section \(\gamma \in \Gamma (Q;\mathbf {A}_\bullet )\) is sent by \(\mathscr {F}_\bullet \) to a section \(\mathscr {F}\gamma \) of \(\mathbf {A}'_\bullet \) prescribed by \((\mathscr {F}\gamma )_v = \mathscr {F}_v(\gamma _v)\), since applying (2) to \(\gamma _{s(e)}\) gives the desired compatibility across each edge e:

$$\begin{aligned} \mathbf {A}'_e \circ \mathscr {F}_{s(e)}(\gamma _{s(e)}) = \mathscr {F}_{t(e)} \circ \mathbf {A}_e (\gamma _{s(e)}) = \mathscr {F}_{t(e)} \gamma _{t(e)}. \end{aligned}$$

When the underlying quiver Q contains cycles or vertices with multiple incoming edges, compatibility across edges imposes severe constraints on sections, even in the simplest of examples.

Example 1.3

Consider the quiver that consists of a single vertex v and a single edge e with \(s(e) = v = t(e)\). The space of sections of any representation \(\mathbf {A}_\bullet \) is the subspace of \(\mathbf {A}_v\) fixed by \(\mathbf {A}_e\), i.e. the eigenspace corresponding to eigenvalue 1.

Example 1.4

The space of sections of a representation \(\mathbf {A}_\bullet \) of the 2-Kronecker quiver, which consists of two vertices joined by a pair of parallel edges e and f, is isomorphic to \(\ker (\mathbf {A}_e - \mathbf {A}_f)\).
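
Both examples reduce to a null-space computation. Here is a quick numerical check over the reals, with matrices chosen arbitrarily for illustration:

```python
import numpy as np
from scipy.linalg import null_space

A_e = np.array([[1.0, 1.0],
                [0.0, 0.5]])
A_f = np.array([[1.0, 0.0],
                [0.0, 1.5]])

# Example 1.3: sections of the one-loop quiver are the fixed points of A_e,
# i.e. the eigenspace of A_e for eigenvalue 1.
loop_sections = null_space(A_e - np.eye(2))    # spanned by (1, 0)

# Example 1.4: sections of the 2-Kronecker quiver form ker(A_e - A_f).
kronecker_sections = null_space(A_e - A_f)     # spanned by (1, 0)
```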

In sharp contrast, sections are far less constrained when the vertices of Q admit at most one incoming edge.

Example 1.5

Given vector spaces U, V, W along with linear maps \(A:V \rightarrow U\) and \(B:V \rightarrow W\), the sections of the quiver representation

$$\begin{aligned} U {\mathop {\longleftarrow }\limits ^{A}} V {\mathop {\longrightarrow }\limits ^{B}} W \end{aligned}$$

are triples of the form \(\gamma = (Ax,x,Bx)\) for x in V.

More generally, consider the case where Q admits a distinguished vertex \(\rho \) in V called the root so that for each other vertex \(v \ne \rho \) there is a unique path p[v] in Q from \(\rho \) to v. Quivers satisfying this unique path property are studied in various contexts and hence have many names—these include out-trees, out-branchings, directed rooted trees, and (the far more scenic) arborescences [6, Chapter 9].

Proposition 1.6

Let \(\mathbf {A}_\bullet \) be a representation of an arborescence Q with root vertex \(\rho \). The space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\) is isomorphic to \(\mathbf {A}_\rho \), with every section \(\gamma \) uniquely determined by the vector \(x = \gamma _\rho \) in \(\mathbf {A}_\rho \), via

$$\begin{aligned} \gamma _v = \mathbf {A}_{p[v]}(x), \end{aligned}$$

where p[v] is the unique path in Q from \(\rho \) to \(v \ne \rho \).
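
Proposition 1.6 is effectively an algorithm: choose a root vector and push it along the tree edges. A minimal sketch, assuming the quiver is stored as (source, target, matrix) triples; all names here are illustrative.

```python
import numpy as np

def propagate(root, x, edges):
    """Over an arborescence: gamma_root = x and gamma_v = A_{p[v]}(x)."""
    gamma, frontier = {root: x}, [root]
    while frontier:
        u = frontier.pop()
        for s, t, A in edges:
            if s == u:                    # each non-root vertex has exactly
                gamma[t] = A @ gamma[u]   # one incoming edge, so no clashes
                frontier.append(t)
    return gamma

# Example 1.5 revisited: edges v -> u and v -> w yield the section (Ax, x, Bx).
A = np.array([[2.0, 0.0]]); B = np.array([[0.0, 3.0]])
section = propagate("v", np.array([1.0, 1.0]), [("v", "u", A), ("v", "w", B)])
```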

Over the next three sections, we will describe an algorithm to compute \(\Gamma (Q;\mathbf {A}_\bullet )\) for any given representation \(\mathbf {A}_\bullet \) of an arbitrary quiver Q.

Remark 1.7

As described in Definition 1.1, an edge \(e: u \rightarrow v\) of the quiver Q imposes \(\dim \mathbf {A}_{t(e)}\) linear constraints (which may not be independent) on \(\text {Tot}(\mathbf {A}_\bullet )\). The space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\) is the subspace satisfying all such constraints, i.e. the kernel of a matrix of size

$$\begin{aligned} \left( \sum _{e \in Q} \dim \mathbf {A}_{t(e)} \right) \times \dim \text {Tot}(\mathbf {A}_\bullet ). \end{aligned}$$

In principle, this kernel may be computed directly via Gaussian elimination. We take an alternative approach, which makes use of the structure of Q, for two compelling reasons:

  (1) Working in the space \(\text {Tot}(\mathbf {A}_\bullet )\) quickly becomes prohibitive when the number of vertices or edges of the quiver is large. In contrast, our approach computes the space of sections by performing Gaussian elimination on much smaller matrices. For a thorough complexity analysis, see Sect. 5.3.

  (2) Our approach extends the notion of a spanning arborescence of a quiver to the setting of quiver representations, as follows. Our algorithm constructs a new quiver \(Q^+\) with new representation \(\mathbf {A}_\bullet ^+\). Here \(Q^+\) is an arborescence obtained by adjoining a new root vertex \(\rho \) to Q and passing to a spanning arborescence of this union, while \(\mathbf {A}^+_\bullet \) is a representation of \(Q^+\) with \(\mathbf {A}^+_v \subset \mathbf {A}_v\) for each non-root vertex v. Crucially, the space of sections \(\mathbf {A}^+_\rho \simeq \Gamma (Q^+;\mathbf {A}_\bullet ^+)\) is isomorphic to \(\Gamma (Q;\mathbf {A}_\bullet )\). We hope that our construction of the pair \((Q^+,\mathbf {A}^+_\bullet )\) will be of independent interest.

2 Sections of Strongly-Connected Quivers

A quiver \(Q = (s,t:E \rightarrow V)\) is called strongly-connected if for any ordered pair of distinct vertices \(v,v'\) in V there is at least one path from v to \(v'\). The simplest examples of strongly-connected quivers are cycles, but such quivers can be far more intricate. In this section, we study sections of strongly-connected quivers. We will use a particular decomposition of such quivers into a union of simpler quivers. To this end, note that a subquiver \(Q' \subset Q\) is a choice of subsets \(V' \subset V\) and \(E' \subset E\) so that the restrictions of s and t to \(E'\) take values in \(V'\). For example, every path \((e_1,\ldots ,e_k)\) in Q forms a subquiver with

$$\begin{aligned} V' = {\left\{ {s(e_i) \mid i \in [k]}\right\} } \cup {\left\{ {t(e_k)}\right\} } \, \text { and } \, E' = {\left\{ {e_i \mid i \in [k]}\right\} }, \end{aligned}$$

where \([k] = {\left\{ {1,\ldots ,k}\right\} }\). In the special case where a subquiver \(Q'\) comes from a path in Q, we define its source and target \(s(Q')\) and \(t(Q')\) to be the source and target of that path. Here is the decomposition of interest [6, Sec 5.3].

Definition 2.1

An ear decomposition \(Q_\bullet \) of Q is an ordered sequence of \(c \ge 1\) subquivers \({\left\{ {Q_i=(s_i,t_i:E_i \rightarrow V_i) \mid i \in [c]}\right\} }\) of Q subject to the following axioms:

  (1) the edge sets \(E_i\) partition E—in other words, they are mutually disjoint and their union equals E; moreover,

  (2) the quiver \(Q_1\) is either a single vertex or a cycle, while \(Q_i\) for each \(i > 1\) is a (possibly cyclic) path in Q; and finally,

  (3) for each \(i > 1\), the intersection of \(V_i\) with the union \(\bigcup _{j < i}V_j\) equals \({\left\{ {s(Q_i),t(Q_i)}\right\} }\); this intersection has cardinality 1 if \(Q_i\) is a cycle and cardinality 2 otherwise.

Ear decompositions play an important role in the study of strongly-connected quivers due to the following fundamental result.

Theorem 2.2

A quiver with at least two vertices is strongly-connected if and only if it has an ear decomposition.

At least one standard proof of this result is given in the form of an efficient algorithm for constructing ear decompositions—see [6, Theorem 5.3.2] for details. The figure below illustrates a strongly-connected subquiver of the quiver depicted in the Introduction along with its decomposition into three ears:

figure b

We assume for the remainder of this section that Q is strongly-connected and fix an ear decomposition \(Q_\bullet \) as in Definition 2.1. The depth of an edge \(e \in E\), denoted |e|, is the unique \(i \in [c]\) with \(e \in E_i\). We say that a path \(p = (e_1,\ldots ,e_k)\) is \(Q_\bullet \)-increasing if the \(|e_i|\) form a weakly increasing sequence, and define \(Q_\bullet \)-decreasing paths analogously. For each vertex \(v \in V\), we write \(\ell (v)\) for the smallest i in [c] such that \(v \in V_i\).

Proposition 2.3

Let \(\rho \) be any vertex in \(V_1\). For any vertex \(v \ne \rho \) in V, there exists

  (1) a unique \(Q_\bullet \)-increasing path p[v] from \(\rho \) to v with all edges of depth \(\le \ell (v)\), and

  (2) a unique \(Q_\bullet \)-decreasing path q[v] from v to \(\rho \) with all edges of depth \(\le \ell (v)\).

Proof

For \(\ell (v) = 1\), the desired conclusion follows immediately because \(Q_1\) must be a cycle by axiom (2) of Definition 2.1. Proceeding inductively, we assume that the assertion holds whenever \(\ell (v) < i\), and consider any \(v \in V_i\). Once again by axiom (2), our vertex v lies on a path \(Q_i\) from \(s(Q_i)\) to \(t(Q_i)\); and by axiom (3), the inductive hypothesis applies to both \(s(Q_i)\) and \(t(Q_i)\). The increasing path p[v] is built by first going from \(\rho \) to \(s(Q_i)\) along \(p[s(Q_i)]\) and then onward to v along \(Q_i\). Similarly, the decreasing path q[v] is built by concatenating the piece of \(Q_i\) which goes from v to \(t(Q_i)\) with the path \(q[t(Q_i)]\). \(\square \)

For each i in [c], the set \(E_i\) contains at most one edge \(\epsilon _i \in E_i\) whose target is \(t(Q_i)\); we allow for the possibility that \(\epsilon _1\) does not exist if \(Q_1\) has no edges, but all other \(\epsilon _i\) exist and are uniquely determined by the ear decomposition. We call \(\epsilon _i\) the i-th terminal edge with respect to the ear decomposition \(Q_\bullet \), and denote the set of all terminal edges by \(E_\text {ter} \subset E\).

Definition 2.4

The arborescence induced by \(Q_\bullet \) is the subquiver \(T = T(Q_\bullet )\) with vertex set V and edges \(E - E_\text {ter}\).

To confirm that T is an arborescence, note that its root vertex is \(\rho = s(Q_1)\), and that for any other vertex v there is a unique path p[v] from \(\rho \) to v, whose existence is guaranteed by Proposition 2.3. In the ear decomposition drawn above, the three terminal edges (with respect to the root vertex \(u_1\)) are replaced by dotted arcs in the figure below. The arborescence induced by \(Q_\bullet \) is obtained by removing these three edges:

figure c

Given a terminal edge \(\epsilon \in E_\text {ter}\), consider the linear map \(\Delta _\epsilon :\mathbf {A}_\rho \rightarrow \mathbf {A}_{t(\epsilon )}\) given by

$$\begin{aligned} \Delta _\epsilon = \mathbf {A}_{p[t(\epsilon )]} - \mathbf {A}_{\epsilon } \circ \mathbf {A}_{p[s(\epsilon )]}. \end{aligned}$$
(3)

The kernel of each such map is a subspace \(\ker \Delta _\epsilon \subset \mathbf {A}_\rho \); here \(\mathbf {A}_{p[t(\epsilon )]}\) is understood to be the identity map on \(\mathbf {A}_\rho \) whenever \(t(\epsilon ) = \rho \), corresponding to the empty path. These kernels depend on the choice of ear decomposition \(Q_\bullet \) and the representation \(\mathbf {A}_\bullet \). Let us denote their intersection by

$$\begin{aligned} K(Q_\bullet ;\mathbf {A}_\bullet ) := \bigcap _\epsilon ~ \ker \Delta _\epsilon , \end{aligned}$$
(4)

where \(\epsilon \) ranges over \(E_\text {ter}\). This intersection of kernels is independent of the ear decomposition, since it is also the intersection of all \(\ker (\mathbf {A}_p - \mathbf {A}_q)\), where p and q are any two paths from \(\rho \) to the same vertex v. We have the following result.

Lemma 2.5

Let \(Q = (s,t:E \rightarrow V)\) be a strongly-connected quiver with ear decomposition \(Q_\bullet \). For any representation \(\mathbf {A}_\bullet \) of Q, there is an isomorphism

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) \simeq K(Q_\bullet ;\mathbf {A}_\bullet ) \end{aligned}$$

between the space of sections of \(\mathbf {A}_\bullet \) over Q and the intersection of the kernels from (4).

Proof

Let T be the arborescence induced by \(Q_\bullet \) and \(\rho \) its root vertex. Using Proposition 1.6, vectors in \(\mathbf {A}_\rho \) correspond bijectively with sections in \(\Gamma (T;\mathbf {A}_\bullet )\) via the assignment that sends each x in \(\mathbf {A}_\rho \) to the section given by

$$\begin{aligned} v \mapsto \gamma _v = \mathbf {A}_{p[v]}(x). \end{aligned}$$

The subspace \(\Gamma (Q;\mathbf {A}_\bullet ) \subset \Gamma (T;\mathbf {A}_\bullet )\) is obtained by additionally enforcing compatibility across the edges in \(E_\text {ter}\). Let \(\epsilon \) be a terminal edge and \(x_\rho \) a vector in \(\mathbf {A}_\rho \), with induced section \(\gamma _v = \mathbf {A}_{p[v]}(x_\rho )\) in \(\Gamma (T;\mathbf {A}_\bullet )\). This section satisfies the compatibility requirement \(\mathbf {A}_\epsilon (\gamma _{s(\epsilon )}) = \gamma _{t(\epsilon )}\) across \(\epsilon \) if and only if \(x_\rho \) lies in the kernel of the map \(\Delta _\epsilon \) from (3). Thus, our \(x_\rho \)-induced section is compatible across all the terminal edges if and only if \(x_\rho \) lies in \(K(Q_\bullet ;\mathbf {A}_\bullet )\). \(\square \)
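
In matrix terms, (4) is the null space of the stacked difference maps \(\Delta _\epsilon \). A minimal sketch, assuming the path maps appearing in (3) have already been multiplied out into matrices; the function name is ours.

```python
import numpy as np
from scipy.linalg import null_space

def root_section_space(deltas, dim_root):
    """Basis for K(Q;A): intersect the kernels of the maps Delta_eps."""
    if not deltas:                     # no terminal edges: K is all of A_rho
        return np.eye(dim_root)
    return null_space(np.vstack(deltas))
```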

We may safely combine this result with Proposition 1.6 to reduce a strongly-connected quiver to an arborescence while preserving the space of sections.

Corollary 2.6

Assuming the hypotheses of Lemma 2.5, let T be the arborescence induced by \(Q_\bullet \) and \(\rho \) its root vertex. Let \(\mathbf {A}'_\bullet \) be the representation of T prescribed by the following assignments to vertices \(v \in V\) and non-terminal edges \(e \in E - E_\text {ter}\):

$$\begin{aligned} \mathbf {A}'_v := {\left\{ \begin{array}{ll} \mathbf {A}_v &{} v \ne \rho ,\\ K(Q_\bullet ;\mathbf {A}_\bullet ) &{} v = \rho ; \end{array}\right. } \quad \text { and } \quad \mathbf {A}'_e := {\left\{ \begin{array}{ll} \mathbf {A}_e &{} s(e) \ne \rho ,\\ \mathbf {A}_e\big |_{K(Q_\bullet ;\mathbf {A}_\bullet )} &{} s(e)=\rho . \end{array}\right. } \end{aligned}$$

Then, there is an isomorphism of sections

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) \simeq \Gamma (T;\mathbf {A}'_\bullet ). \end{aligned}$$

We will use Corollary 2.6 to perform section-preserving simplifications of arbitrary (i.e. not necessarily strongly-connected) quivers.

3 The Acyclic Reduction

Fix a quiver \(Q = (s,t:E \rightarrow V)\). A strongly-connected subquiver \(R \subset Q\) is maximal if it is not contained in a strictly larger strongly-connected subquiver of Q. We denote the set of all maximal strongly-connected subquivers of Q by \(\mathbf{MSC} (Q)\). This set can be extracted from Q very efficiently by employing the remarkable algorithm of Tarjan [6, Section 5.2]. Distinct subquivers in \(\mathbf{MSC} (Q)\) have disjoint vertex sets. For each R in \(\mathbf{MSC} (Q)\), fix an ear decomposition \(R_\bullet \) as in Definition 2.1. We write \(T(R_\bullet )\) for the arborescence induced by \(R_\bullet \) as in Definition 2.4, and let \(E_\text {ter}(R_\bullet ) \subset E\) be the set of terminal edges of \(R_\bullet \).

Definition 3.1

The acyclic reduction \(Q^*\) of \(Q =(s,t:E \rightarrow V)\) with respect to the ear decompositions \({\left\{ {R_\bullet \mid R \in \mathbf{MSC} (Q)}\right\} }\) is the subquiver \(Q^* \subset Q\) defined as follows: it has the same vertex set V, while its edge set \(E^* \subset E\) is given by removing all terminal edges, i.e.

$$\begin{aligned} E^* = E - \bigcup _{R} E_\text {ter}(R_\bullet ), \end{aligned}$$

where R ranges over \(\mathbf{MSC} (Q)\).

We note that the quiver \(Q^*\) is indeed acyclic, as its name suggests: each cycle in Q is strongly-connected, hence lies in a single maximal strongly-connected subquiver \(R \in \mathbf{MSC} (Q)\); but the removal of all the terminal edges \(E_\text {ter}(R_\bullet )\) turns R into the arborescence \(T(R_\bullet )\), which cannot contain any cycles. Depicted below is the quiver from the Introduction; the light-shaded edges lie within strongly-connected subquivers, whose root vertices are coloured red. The dotted edges are terminal for the associated ear decompositions, and their removal produces the acyclic reduction:

figure d

Our next goal is to reduce a given representation \(\mathbf {A}_\bullet \) of Q to a new representation \(\mathbf {A}_\bullet ^*\) of \(Q^*\) in a manner that preserves the space of sections. Let \(\rho : \mathbf{MSC} (Q) \rightarrow V\) be the injective root map, which sends each maximal strongly-connected subquiver \(R \subset Q\) to the root vertex of \(T(R_\bullet )\). We associate to each vertex \(v \in V\) the subspace \(\mathbf {A}^\circ _v \subset \mathbf {A}_v\) given by

$$\begin{aligned} \mathbf {A}^\circ _v := {\left\{ \begin{array}{ll} \mathbf {A}^R_{v} &{} \text {if } v = \rho (R) \text { for some } R \in \mathbf{MSC} (Q), \\ \mathbf {A}_v &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$

where, for each \(R \in \mathbf{MSC} (Q)\), we write \(\mathbf {A}^R_\bullet \) for the representation of \(T(R_\bullet )\) described in Corollary 2.6.

Definition 3.2

For each vertex \(v \in V\) and strongly-connected \(R \in \mathbf{MSC} (Q)\), let \(P^*_{v\rightarrow R}\) be the set of all paths in \(Q^*\) with source v and target \(\rho (R)\). The R-constrained space at v is the subspace \(\Lambda _{v,R} \subset \mathbf {A}^\circ _v\) given by

$$\begin{aligned} \Lambda _{v,R} := {\left\{ {x \in \mathbf {A}^\circ _v \mid \mathbf {A}_p(x) \in \mathbf {A}^R_{\rho (R)} \text { for all } p \in P^*_{v\rightarrow R}}\right\} }, \end{aligned}$$

with the implicit understanding that \(\Lambda _{v,R}\) equals \(\mathbf {A}^\circ _v\) whenever \(P^*_{v\rightarrow R}\) is empty.
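
Over the reals, the membership conditions cutting out \(\Lambda _{v,R}\) can be checked with a projector onto the orthogonal complement of \(\mathbf {A}^R_{\rho (R)}\). A minimal sketch, assuming the finitely many path maps indexed by \(P^*_{v\rightarrow R}\) have been multiplied out into matrices; all names are ours.

```python
import numpy as np
from scipy.linalg import null_space

def constrained_space(path_maps, B_W):
    """Basis for {x : A_p @ x lies in col(B_W) for every A_p in path_maps}.
    (When the path list is empty, the constrained space is the whole domain.)"""
    P = np.eye(B_W.shape[0]) - B_W @ np.linalg.pinv(B_W)  # projector onto W-perp
    return null_space(np.vstack([P @ A_p for A_p in path_maps]))
```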

Our next result shows that R-constrained subspaces behave well under the linear maps assigned by \(\mathbf {A}_\bullet \) to edges of \(Q^*\).

Proposition 3.3

For any edge e in \(E^*\) and subquiver \(R \in \mathbf{MSC}(Q)\), the linear map \(\mathbf {A}_e:\mathbf {A}_{s(e)} \rightarrow \mathbf {A}_{t(e)}\) sends \(\Lambda _{s(e),R}\) to \(\Lambda _{t(e),R}\).

Proof

Let \(p = (e_1,\ldots ,e_k)\) be any path in \(P^*_{t(e) \rightarrow R}\) and note that the augmented path \(p' = (e,e_1,\ldots ,e_k)\) is an element of \(P^*_{s(e) \rightarrow R}\). Now for any x in \(\Lambda _{s(e),R}\) we know that \(\mathbf {A}_{p'}(x)\) lies in \(\mathbf {A}^R_{\rho (R)}\) by Definition 3.2. But \(\mathbf {A}_{p'}(x)\) is \(\mathbf {A}_p \circ \mathbf {A}_e(x)\), whence \(\mathbf {A}_e(x)\) lies in \(\Lambda _{t(e),R}\). \(\square \)

Consider the intersection of all the R-constrained spaces at a given vertex \(v \in V\), i.e. define the subspace \(\Lambda _v \subset \mathbf {A}_v\) as

$$\begin{aligned} \Lambda _v := \bigcap _{R} \Lambda _{v,R} \end{aligned}$$
(5)

where R ranges over \(\mathbf{MSC} (Q)\). It follows immediately from Proposition 3.3 that for each edge e in \(E^*\) the map \(\mathbf {A}_e\) sends \(\Lambda _{s(e)}\) to \(\Lambda _{t(e)}\).

Definition 3.4

Let \(Q^*\) be the acyclic reduction of Q with respect to a choice of ear decompositions \({\left\{ {R_\bullet \mid R \in \mathbf{MSC} (Q)}\right\} }\). The acyclification of a representation \(\mathbf {A}_\bullet \) of Q is a new representation \(\mathbf {A}_\bullet ^*\) of \(Q^*\) which assigns to every vertex v in V the vector space

$$\begin{aligned} \mathbf {A}^*_v = \Lambda _v \end{aligned}$$

and to every edge e in \(E^*\) the restriction of \(\mathbf {A}_e\) to \(\Lambda _{s(e)}\), denoted \(\mathbf {A}^*_e:\Lambda _{s(e)} \rightarrow \Lambda _{t(e)}\).

As promised, our new representation \(\mathbf {A}^*_\bullet \) retains full knowledge of the sections of the original representation \(\mathbf {A}_\bullet \) even though it is only defined on the acyclic reduction \(Q^*\).

Proposition 3.5

Let \(\mathbf {A}_\bullet \) be a representation of a quiver \(Q = (s,t:E \rightarrow V)\). Writing \(Q^*\) for the acyclic reduction of Q with respect to some choice of ear decompositions \({\left\{ {R_\bullet \mid R \in \mathbf{MSC}(Q)}\right\} }\) and \(\mathbf {A}_\bullet ^*\) for the corresponding acyclification of \(\mathbf {A}_\bullet \), there is an isomorphism of sections

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) \simeq \Gamma (Q^*;\mathbf {A}^*_\bullet ). \end{aligned}$$

Proof

First we show that a section \(\gamma \) in \(\Gamma (Q;\mathbf {A}_\bullet )\) gives a section in \(\Gamma (Q^*;\mathbf {A}^*_\bullet )\). Since \(E^* \subset E\) by Definition 3.1, it suffices to prove that \(\gamma _v\) lies in the subspace \(\Lambda _v\) of \(\mathbf {A}_v\) for all vertices v in V. Since \(\gamma \) restricts to a section in \(\Gamma (R;\mathbf {A}_\bullet )\) for every subquiver \(R \in \mathbf{MSC} (Q)\), it follows from Corollary 2.6 that \(\gamma _{\rho (R)}\) lies in the subspace \(\mathbf {A}^R_{\rho (R)}\) of \(\mathbf {A}_{\rho (R)}\). Thus, for any vertex \(v \in V\) and every path p in \(P^*_{v\rightarrow R}\), compatibility forces \(\mathbf {A}_p(\gamma _v) \in \mathbf {A}^R_{\rho (R)}\). Hence \(\gamma _v\) must lie in the subspace \(\Lambda _v\) from (5). Now consider any edge \(e \in E^*\) and note that \(\mathbf {A}^*_e\) is defined simply by restricting \(\mathbf {A}_e\) to the subspace \(\Lambda _{s(e)}\). Thus, we obtain

$$\begin{aligned} \mathbf {A}^*_e(\gamma _{s(e)}) = \mathbf {A}_e(\gamma _{s(e)}) = \gamma _{t(e)} \end{aligned}$$

for each such edge, and it follows that \(\gamma \) is a section in \(\Gamma (Q^*;\mathbf {A}^*_\bullet )\). Conversely, consider a section \(\gamma ^*\) in \(\Gamma (Q^*;\mathbf {A}^*_\bullet )\). The \(\mathbf {A}_\bullet \)-compatibility of \(\gamma ^*\) across every edge \(e \in E^*\) follows from the fact that \(\mathbf {A}^*_e\) is the restriction of \(\mathbf {A}_e\); it therefore suffices to show that \(\gamma ^*\) is also \(\mathbf {A}_\bullet \)-compatible across all the edges in \(E - E^*\). By Definition 3.1, any such edge \(\epsilon \) lies in \(E_\text {ter}(R_\bullet )\) for a unique \(R \in \mathbf{MSC} (Q)\). We know that \(\Lambda _{\rho (R)}\) is a subspace of \(\mathbf {A}^R_{\rho (R)}\), by (5) combined with Definition 3.2. Thus, Corollary 2.6 guarantees that \(\gamma ^*\) is also \(\mathbf {A}_\bullet \)-compatible across \(\epsilon \), as desired. \(\square \)

4 The Arboreal Replacement

We assume here that \(Q = (s,t:E \rightarrow V)\) is an acyclic quiver, so its vertex set V is partially ordered by (the reflexive closure of) the binary relation

$$\begin{aligned} u < v \text { if and only if there is a path }p \text { in } Q \text { with } s(p) = u \text { and } t(p) = v. \end{aligned}$$

Let \(V_\text {min} \subset V\) be the set of all minimal vertices with respect to this partial order—thus, a vertex v lies in \(V_\text {min}\) if and only if there is no edge \(e \in E\) with \(t(e) = v\). We fix a representation \(\mathbf {A}_\bullet \) of Q, and seek to compute the space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\). For this purpose, it will be convenient to formally add a new vertex to Q that serves as the global minimum for the partial order described above.

Definition 4.1

The augmented quiver \(Q^+\) has vertices \(V^+ := V \cup {\left\{ {\rho }\right\} }\), where \(\rho \) is a new vertex. Its edge set \(E^+\) is \(E \cup {\left\{ {e_v \mid v \in V_\text {min}}\right\} }\); the sources and targets of edges in E are inherited from Q, while each new edge \(e_v\) has source \(\rho \) and target v in \(V_\text {min}\).

Drawn below is the augmented quiver corresponding to the acyclic reduction from the previous section; the root and (two) new edges \(e_v\) for the vertices \(v \in V_\text {min}\) are highlighted in blue.

figure e

A representation \(\mathbf {A}_\bullet \) extends to \(Q^+\) if we define

$$\begin{aligned} \mathbf {A}_\rho := \prod _{v \in V_\text {min}} \mathbf {A}_v, \end{aligned}$$

and let \(\mathbf {A}_{e_v}:\mathbf {A}_\rho \rightarrow \mathbf {A}_v\) be the canonical projection map. Now each section of \(\mathbf {A}_\bullet \) over Q extends uniquely to a section over \(Q^+\), whence we have an isomorphism

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) \simeq \Gamma (Q^+;\mathbf {A}_\bullet ). \end{aligned}$$
(6)

Thus, there is no loss of generality encountered when computing the sections of \(\mathbf {A}_\bullet \) over \(Q^+\) rather than Q. We will also make frequent use of the following notion.

Definition 4.2

Let \(n \ge 1\) be a natural number and let X and Y be vector spaces. The equaliser of a collection of n linear maps \({\left\{ {f_i:X \rightarrow Y \mid 1 \le i \le n}\right\} }\) is the largest subspace \(\text {Eq}{\left\{ {f_\bullet }\right\} } \subset X\) satisfying \(f_i(x) = f_j(x)\) for all x in \(\text {Eq}{\left\{ {f_\bullet }\right\} }\) and all i, j in \({\left\{ {1,\ldots ,n}\right\} }\).

In practice, for finite-dimensional X the equaliser \(\text {Eq}{\left\{ {f_\bullet }\right\} }\) can be computed by intersecting kernels of successive differences:

$$\begin{aligned} \text {Eq}{\left\{ {f_\bullet }\right\} } = \bigcap _{i=1}^{n-1} \ker \left( f_i - f_{i+1}\right) , \end{aligned}$$

with the understanding that for \(n = 1\) this intersection over the empty set equals all of X.
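
Over the reals this is once again a null-space computation; a minimal sketch, with the maps given as matrices sharing a common domain (the function name is ours):

```python
import numpy as np
from scipy.linalg import null_space

def equaliser(maps):
    """Basis for Eq{f_1, ..., f_n} inside the common domain X."""
    if len(maps) == 1:                      # n = 1: the equaliser is all of X
        return np.eye(maps[0].shape[1])
    diffs = np.vstack([maps[i] - maps[i + 1] for i in range(len(maps) - 1)])
    return null_space(diffs)
```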

Definition 4.3

Assign to each vertex \(v \in V^+\) a subspace \(\Phi _v \subset \mathbf {A}_\rho \) and a linear map \(\phi _v: \Phi _v \rightarrow \mathbf {A}_v\), called the flow space and flow map of \(\mathbf {A}_\bullet \) at v, inductively over the partial order \(\le \) as follows:

  (1) for \(v = \rho \), the flow space \(\Phi _\rho \) equals \(\mathbf {A}_\rho \), and the flow map \(\phi _\rho :\Phi _\rho \rightarrow \mathbf {A}_\rho \) is the identity;

  (2) for \(v \ne \rho \), let \(E_\text {in}(v) \subset E^+\) be the (necessarily nonempty) set of all edges e satisfying \(t(e) = v\). Noting that \(s(e) < v\) for any such e, define the subspace \(\Phi '_v \subset \mathbf {A}_\rho \) via

      $$\begin{aligned} \Phi '_v := \bigcap _{e} \Phi _{s(e)}, \end{aligned}$$

      where e ranges over \(E_\text {in}(v)\). For each such e, the composition \(\mathbf {A}_{e} \circ \phi _{s(e)}\) restricts to a linear map \(\Phi '_v \rightarrow \mathbf {A}_v\). The flow space at v is the equaliser

      $$\begin{aligned} \Phi _v := \text {Eq}{\left\{ {\mathbf {A}_{e} \circ \phi _{s(e)}:\Phi '_v \rightarrow \mathbf {A}_v \mid e \in E_\text {in}(v)}\right\} }. \end{aligned}$$

      The flow map \(\phi _v:\Phi _v \rightarrow \mathbf {A}_v\) is given by \(\mathbf {A}_{e} \circ \phi _{s(e)}\) for any e in \(E_\text {in}(v)\).

By construction, the flow space \(\Phi _v\) at a vertex \(v \ne \rho \) forms a subspace of the intersection \(\bigcap _{u}\Phi _{u}\) of flow spaces ranging over all preceding vertices \(u < v\). Thus, whenever \(u \le v\), the flow map \(\phi _{u}\) at u may be applied to any vector in the flow space \(\Phi _v\). Our affinity for flow spaces and maps stems mainly from the following result.

Proposition 4.4

For each vertex \(v \in V^+\), let \(Q^+_{\le v}\) be the subquiver of \(Q^+\) generated by all vertices \(u \le v\) and the edges between them. Then, \(\gamma \) is a section in \(\Gamma (Q^+_{\le v};\mathbf {A}_\bullet )\) if and only if the vector \(\gamma _\rho \in \mathbf {A}_\rho \) lies in the flow space \(\Phi _v\).

Proof

For \(v = \rho \) the result holds because in this case the spaces below are all equal:

$$\begin{aligned} \Phi _\rho = \Gamma (Q^+_{\le \rho };\mathbf {A}_\bullet ) = \mathbf {A}_\rho , \end{aligned}$$

with the flow map \(\phi _\rho :\Phi _\rho \rightarrow \mathbf {A}_\rho \) being the identity. Proceeding inductively over the partial order \(\le \), consider any \(v \ne \rho \) and assume that the desired result holds for all preceding vertices \(u < v\). We must show that any \(x \in \Phi _v\) generates a section in \(\Gamma (Q^+_{\le v};\mathbf {A}_\bullet )\) via the assignment \(u \mapsto \phi _u(x)\) for every \(u \le v\). Compatibility for all edges e with \(t(e) \ne v\) follows from the inductive hypothesis, so it suffices to examine all edges \(e \in E_\text {in}(v)\). For any such edge, Definition 4.3 yields

$$\begin{aligned} \mathbf {A}_e \circ \phi _{s(e)}(x) = \phi _v(x), \end{aligned}$$

hence establishing the desired compatibility. Conversely, if \(\gamma \) is a section in \(\Gamma (Q^+_{\le v};\mathbf {A}_\bullet )\) then we must show that the vector \(\gamma _\rho \in \mathbf {A}_\rho \) lies in the subspace \(\Phi _v\). By the inductive hypothesis, we have

$$\begin{aligned} \gamma _\rho \in \Phi '_v = \bigcap _e \Phi _{s(e)}, \end{aligned}$$

where e ranges over the edges in \(E_\text {in}(v)\). By compatibility of \(\gamma \) across any such e, we have

$$\begin{aligned} \mathbf {A}_e\circ \phi _{s(e)}(\gamma _{\rho }) = \gamma _v, \end{aligned}$$

so \(\gamma _\rho \) lies in the equaliser \(\Phi _v = \text {Eq}{\left\{ {\mathbf {A}_e \circ \phi _{s(e)} \mid e \in E_\text {in}(v)}\right\} }\) as desired. \(\square \)

Using the fact that the quiver \(Q^+\) is the union of the subquivers \({\left\{ {Q^+_{\le v} \mid v \in V^+}\right\} }\), we are able to describe the sections of \(\mathbf {A}_\bullet \) as intersections of its flow spaces. We write \(V_\text {max} \subset V\) for the \(\le \)-maximal vertices (i.e. the vertices which do not serve as sources of edges in \(E^+\)).

Proposition 4.5

Let \(\mathbf {A}_\bullet \) be a representation of an acyclic quiver Q, and \(Q^+\) the augmented quiver (as in Definition 4.1). We have an isomorphism

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) \simeq \bigcap _{v \in V_\text {max}} \Phi _v \end{aligned}$$

between the sections of \(\mathbf {A}_\bullet \) over Q and the intersection of the flow spaces of \(\mathbf {A}_\bullet \) at the maximal vertices.

Proof

Combining (6) with Proposition 4.4 and the fact that \(Q^+ = \bigcup _{v \in V} Q^+_{\le v}\) gives

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) \simeq \bigcap _{v \in V} \Phi _v. \end{aligned}$$

Since maximal vertices have the smallest flow spaces, by Definition 4.3, the desired result follows. \(\square \)
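
The recursion of Definition 4.3 translates directly into code. A minimal sketch over the reals: subspaces of \(\mathbf {A}_\rho \) are stored as matrices with linearly independent columns, flow maps as matrices defined on all of \(\mathbf {A}_\rho \), and all container names are our own.

```python
import numpy as np
from scipy.linalg import null_space

def intersect(B1, B2):
    """Basis for the intersection of the column spans of B1 and B2."""
    N = null_space(np.hstack([B1, -B2]))
    return B1 @ N[:B1.shape[1], :]

def flow_spaces(order, edges_in, dim_rho):
    """order: topological order on the vertices of Q+, starting at the root;
    edges_in[v]: list of (u, A_e) pairs over the incoming edges of v."""
    rho = order[0]
    Phi = {rho: np.eye(dim_rho)}   # flow space at the root is all of A_rho
    phi = {rho: np.eye(dim_rho)}   # flow map at the root is the identity
    for v in order[1:]:
        incoming = edges_in[v]
        B = Phi[incoming[0][0]]
        for u, _ in incoming[1:]:  # Phi'_v: intersect preceding flow spaces
            B = intersect(B, Phi[u])
        comps = [A_e @ phi[u] for u, A_e in incoming]
        if len(comps) > 1 and B.shape[1] > 0:   # equalise on Phi'_v
            diffs = np.vstack([c1 - c2 for c1, c2 in zip(comps, comps[1:])])
            B = B @ null_space(diffs @ B)
        Phi[v], phi[v] = B, comps[0]
    return Phi, phi
```

Intersecting the spaces Phi[v] over the maximal vertices then yields the intersection appearing in Proposition 4.5.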

For brevity, we write \(\Phi (\mathbf {A}_\bullet )\) to indicate the intersection \(\bigcap _v \Phi _v\) of flow spaces ranging over \(V_\text {max}\) (or, equivalently, over V). By employing breadth-first search [6, Chapter 3.3] on \(Q^+\) starting at \(\rho \), one can construct a spanning arborescence \(T^+ \subset Q^+\) with root \(\rho \). This arborescence \(T^+\) must necessarily contain all the vertices in \(V^+\) and all the edges in \((E^+ - E)\), but in general it is not uniquely determined otherwise. One possible spanning arborescence for the augmented quiver drawn above is obtained by removing the light-shaded edges below:

figure f

Definition 4.6

Let \(T^+ \subset Q^+\) be any spanning arborescence with root \(\rho \). An arboreal replacement of \(\mathbf {A}_\bullet \) is the representation \(\mathbf {A}^+_\bullet \) of \(T^+\) that assigns

$$\begin{aligned} \mathbf {A}^+_v := {\left\{ \begin{array}{ll} \mathbf {A}_v &{} v \ne \rho ,\\ \Phi (\mathbf {A}_\bullet ) &{} v = \rho ; \end{array}\right. } \quad \text { and } \quad \mathbf {A}^+_e := {\left\{ \begin{array}{ll} \mathbf {A}_e &{} s(e) \ne \rho ,\\ \mathbf {A}_e\big |_{\Phi (\mathbf {A}_\bullet )} &{} s(e)=\rho . \end{array}\right. } \end{aligned}$$

The following result is obtained by combining Proposition 4.5 with Proposition 1.6.

Corollary 4.7

Let \(\mathbf {A}_\bullet \) be a representation of an acyclic quiver Q and \(T^+\) a spanning arborescence of the augmented quiver \(Q^+\). There is an isomorphism

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) \simeq \Gamma (T^+;\mathbf {A}^+_\bullet ) \end{aligned}$$

between the sections of \(\mathbf {A}_\bullet \) and those of its arboreal replacement \(\mathbf {A}_\bullet ^+\) defined on \(T^+\).

5 The Space of Sections

We are now ready to establish Theorem (A) from the Introduction.

Theorem 5.1

For any representation \(\mathbf {A}_\bullet \) of a quiver \(Q = (s,t:E \rightarrow V)\), the following spaces are all isomorphic:

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) \simeq \Gamma (Q^*;\mathbf {A}^*_\bullet ) \simeq \Gamma (T^+;\mathbf {A}^+_\bullet ) \simeq \mathbf {A}^+_\rho . \end{aligned}$$
(7)

Here, \(Q^*\) is the acyclic reduction of Q with \(\mathbf {A}_\bullet ^*\) the acyclification of \(\mathbf {A}_\bullet \). Similarly, writing \(Q^+\) for the augmented quiver associated to \(Q^*\) with root \(\rho \), the representation \(\mathbf {A}_\bullet ^+\) is the arboreal replacement of \(\mathbf {A}_\bullet ^*\) defined on any spanning arborescence \(T^+ \subset Q^+\).

Proof

The first isomorphism follows from Proposition 3.5, the second from Corollary 4.7, and the third from Proposition 1.6. \(\square \)

Theorem (A) realises the isomorphism \(\mathbf {A}^+_\rho \simeq \Gamma (Q;\mathbf {A}_\bullet )\) through an explicit map F, which we now describe. Assuming the hypotheses and notation of Theorem 5.1, there are containments

$$\begin{aligned} \mathbf {A}^+_v \subset \mathbf {A}^*_v \subset \mathbf {A}_v, \end{aligned}$$

for each vertex v in V, by Definitions 3.4 and 4.6. Since \(T^+\) is an arborescence, it admits a unique path p[v] from its root \(\rho \) to any such v. This path carries a linear map \(\mathbf {A}^+_{p[v]}:\mathbf {A}^+_\rho \rightarrow \mathbf {A}^+_v\), and the collection of all such linear maps (indexed over \(v \in V\)) assembles to furnish a single map to the direct product:

$$\begin{aligned} \mathbf {A}^+_\rho \rightarrow \prod _v \mathbf {A}^+_v. \end{aligned}$$

In light of the containments \(\mathbf {A}^+_v \subset \mathbf {A}_v\) described above, the codomain is a subspace of \(\text {Tot}(\mathbf {A}_\bullet )\). Thus, we obtain a linear map

$$\begin{aligned} F:\mathbf {A}^+_\rho \rightarrow \text {Tot}(\mathbf {A}_\bullet ), \end{aligned}$$
(8)

whose image inside \(\text {Tot}(\mathbf {A}_\bullet )\) is an isomorphically embedded copy of \(\Gamma (Q;\mathbf {A}_\bullet )\). Although the various choices (of ear decompositions and spanning arborescences) made above are liable to produce different F's, the image of F remains invariant.
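
Assembling the map F from (8) takes a single pass over \(T^+\) in root-first order. A minimal sketch, assuming the tree is stored via parent pointers and \(\mathbf {A}^+_\rho \) via a basis matrix; all names are illustrative.

```python
import numpy as np

def section_matrix(order, parent, basis_rho):
    """order: vertices of T+ listed root-first; parent[v] = (u, A_e) records
    the unique tree edge u -> v; basis_rho: columns spanning A+_rho.
    Returns F, stacking one block per non-root vertex of T+."""
    blocks = {order[0]: basis_rho}
    for v in order[1:]:
        u, A_e = parent[v]
        blocks[v] = A_e @ blocks[u]     # compose along the unique tree path
    return np.vstack([blocks[v] for v in order[1:]])   # the root is auxiliary
```

The resulting matrix has \(d = \dim \mathbf {A}^+_\rho \) columns, and its column span inside \(\text {Tot}(\mathbf {A}_\bullet )\) is \(\Gamma (Q;\mathbf {A}_\bullet )\).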

5.1 Lower Bounds on the Dimension

As stated in the Introduction, we will define principal components along \(\mathbf {A}_\bullet \) as solutions to an optimisation problem over \(\Gamma (Q;\mathbf {A}_\bullet )\). In order for this to be a nontrivial problem, one requires the dimension \(d := \dim \Gamma (Q;\mathbf {A}_\bullet )\) to exceed zero. We therefore take a brief detour here in order to highlight some sufficient conditions (on Q and \(\mathbf {A}_\bullet \)) which give lower bounds on d. Among the simplest cases to analyse in terms of the topology of Q are the extreme ones, as recorded in the following observation.

Proposition 5.2

Let \(\mathbf {A}_\bullet \) be a representation of a quiver Q.

  (1) if Q is an arborescence with root \(\rho \), then \(d = \dim \mathbf {A}_\rho \); and

  (2) if Q is strongly-connected, then \(d = 0\) for all sufficiently generic \(\mathbf {A}_\bullet \).

Proof

The first assertion follows directly from Proposition 1.6, so we concentrate on the second assertion. Let \(Q_\bullet \) be an ear decomposition of Q (see Definition 2.1) and \(\rho \) a vertex in \(Q_1\). By strong connectedness, there exists a path p in Q from \(\rho \) to itself, which carries an endomorphism \(\mathbf {A}_p:\mathbf {A}_\rho \rightarrow \mathbf {A}_\rho \). Now any section \(\gamma \) of Q must satisfy \(\mathbf {A}_p(\gamma _\rho ) = \gamma _\rho \). For generic \(\mathbf {A}_\bullet \), this endomorphism \(\mathbf {A}_p\) will not have 1 as an eigenvalue, so \(\gamma _\rho \) must be zero. The result now follows from applying Proposition 1.6 to the arborescence induced by \(Q_\bullet \). \(\square \)

Although the result in part (2) of Proposition 5.2 might appear disappointing at first glance, we note that there are several interesting nongeneric families of linear maps which do admit 1 as an eigenvalue, such as those arising from row-stochastic matrices. Moreover, general quivers are neither strongly-connected nor arboreal but lie somewhere in between. Using Proposition 3.5, any representation of an arbitrary quiver can be reduced to a representation of an acyclic quiver while preserving d, so it remains to provide lower bounds on d for representations of acyclic quivers.

Proposition 5.3

Let Q be an acyclic quiver with minimal vertices \(V_\text {min}\) and maximal vertices \(V_\text {max}\). For any representation \(\mathbf {A}_\bullet \) of Q, we have

$$\begin{aligned} \dim \Gamma (Q; \mathbf {A}_\bullet ) \ge \sum _{u \in V_\mathrm{min}} \dim \mathbf {A}_u - \sum _{v \in V_\mathrm{max}} (n_v-1) \dim \mathbf {A}_v, \end{aligned}$$

where \(n_v\) is the total number of paths in the augmented quiver \(Q^+\) from the root \(\rho \) to the vertex v.

Proof

By Definition 4.3, the flow space \(\Phi _\rho = \mathbf {A}_\rho \) has dimension \(\sum _{u \in V_\text {min}} \dim \mathbf {A}_u\). We claim that the flow space \(\Phi _v\) at a vertex \(v \in V_\text {max}\) has codimension at most \((n_v - 1) \dim \mathbf {A}_v\) in \(\Phi _\rho \). To establish this claim, let \({\left\{ {f_k:\mathbf {A}_\rho \rightarrow \mathbf {A}_v \mid 1 \le k \le n_v}\right\} }\) be the linear maps carried by paths from \(\rho \) to v, and examine the \((n_v-1)\) kernels of the differences \(\Delta _k = (f_k - f_{k+1})\). Since each \(\ker (\Delta _k)\) has codimension at most \(\dim \mathbf {A}_v\) in \(\Phi _\rho \), and since the codimension of their intersection is at most the sum of these codimensions, we have \(\text {codim } \Phi _v \le (n_v - 1) \dim \mathbf {A}_v\) as claimed. The inequality in the statement now follows from Proposition 4.5. \(\square \)

Remark 5.4

The space of sections might be trivial for several interesting representations of acyclic quivers, Proposition 5.3 notwithstanding. For example, this occurs frequently in two-parameter persistence modules [9], which arise from homology groups of bifiltered simplicial complexes. Such a module is a representation \(\mathbf {A}_\bullet \) of the grid quiver Q whose vertices are identified with integer points (i, j) with \(1 \le i,j \le \ell \) for some integer \(\ell > 0\); there are two edges from each (i, j), one to \((i+1,j)\) and another to \((i,j+1)\).

Since homology is functorial, each square of the form

$$\begin{aligned} \begin{array}{ccc} \mathbf {A}_{(i,j+1)} &{} \longrightarrow &{} \mathbf {A}_{(i+1,j+1)} \\ \big \uparrow &{} &{} \big \uparrow \\ \mathbf {A}_{(i,j)} &{} \longrightarrow &{} \mathbf {A}_{(i+1,j)} \end{array} \end{aligned}$$

commutes. It follows that the space of sections of such a quiver representation is isomorphic to \(\mathbf {A}_{(1,1)}\): a section is freely and uniquely determined by its value at the unique minimal vertex (1, 1), since any two paths from (1, 1) to a given vertex induce the same linear map. This space might be trivial even though the other \(\mathbf {A}_{(i,j)}\) and the linear maps between them contain relevant information. As a partial remedy, one can fix a vertex \((i_0,j_0)\) of interest and restrict to the largest subquiver \(Q_{\ge (i_0,j_0)} \subset Q\) containing all vertices (i, j) with \(i \ge i_0\) and \(j \ge j_0\). This allows us to extract features from representations of Q (as in Sect. 6) even when the space of sections is trivial.

5.2 Algorithms

We describe algorithms to compute the space of sections by combining graph theoretic operations on the quiver with linear algebraic operations on the representation. That is, we give algorithms arising from Corollary 2.6, Proposition 3.5, and Corollary 4.7. Quiver representations \(\mathbf {A}_\bullet \) may be stored on computers as directed graphs whose vertices v have non-negative integer weights \(\dim \mathbf {A}_v\) and whose edges e have matrix-valued weights \(\mathbf {A}_e\).

The first subroutine implements the constructions from Sect. 2: it ear-decomposes a given strongly-connected quiver, produces an arborescence by removing all terminal edges, and updates the overlaid representation \(\mathbf {A}_\bullet \) at the root vertex in accordance with Corollary 2.6. We recall that an efficient algorithm for performing ear decomposition may be found in [6, Section 5.3].

figure g

The second subroutine implements the constructions from Sect. 3 by computing the acyclification of a given representation \(\mathbf {A}_\bullet \) of a quiver Q. It uses Tarjan’s efficient algorithm for computing the set \(\mathbf{MSC} (Q)\) of maximal strongly-connected components [6, Section 5.2]. The BFSEqualise function invoked in line 5 is an enhancement of the standard breadth-first search algorithm to do the following computation. Starting from the root \(\rho \) of a given \(R \in \mathbf{MSC} (Q)\), it finds all edges e with target \(\rho \), and replaces each vector space \(\mathbf {A}_{s(e)}\) with the R-constrained subspace \(\Lambda _{s(e),R}\) from Definition 3.2. It then recursively repeats this operation, starting from s(e) rather than \(\rho \), until all vertices that admit paths to \(\rho \) have been processed.

figure h

Our final subroutine is based on Sect. 4. It takes as input a representation of an acyclic quiver (such as one produced by AcycReduce). The algorithm augments the quiver with a new root and inductively builds the arboreal replacement by constructing flow spaces and maps (see Definition 4.3). The subroutine Augment builds \(Q^+\) from Q (as in Definition 4.1) and extends the representation \(\mathbf {A}_\bullet \) to \(\mathbf {A}^+_\bullet \) by letting \(\mathbf {A}^+_\rho \) be the product of \(\mathbf {A}_v\) over initial vertices v. The function TopSort builds a linear ordering of the vertices that respects the path-induced partial order (this is often called a topological sorting in the graph theory literature). Finally, the function SpanArb uses breadth-first search to construct a spanning arborescence \(T^+ \subset Q^+\) with root \(\rho \).

figure i

To compute the space of global sections \(\Gamma (Q;\mathbf {A}_\bullet )\), we invoke

$$\begin{aligned} \mathbf{ArbReplace} \big (\mathbf{AcycReduce} (Q,\mathbf {A}_\bullet )\big ) \end{aligned}$$
(9)

This produces a representation \(\mathbf {A}^+_\bullet \) of an arborescence \(T^+\), so by Proposition 1.6 the vector space \(\mathbf {A}^+_\rho \) at the root vertex \(\rho \) yields the space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\). At each nonroot vertex v of \(T^+\), the output vector space \(\mathbf {A}^+_v\) is a subspace of the original \(\mathbf {A}_v\). Thus, we can compute an embedding \(\Gamma (Q;\mathbf {A}_\bullet ) \hookrightarrow \text {Tot}(\mathbf {A}_\bullet )\): the component \(\Gamma (Q;\mathbf {A}_\bullet ) \hookrightarrow \mathbf {A}_v\) at a vertex v is the map \(\mathbf {A}^+_{p[v]}\) carried by the unique path p[v] in \(T^+\) from \(\rho \) to v.

5.3 Computational Complexity

Let \(Q = (s,t:E \rightarrow V)\) be a quiver with \(n_V\) vertices and \(n_E\) edges. Fix a representation \(\mathbf {A}_\bullet \) of Q with \(n_\mathbf {A}:= \max _v{\left\{ {\dim \mathbf {A}_v}\right\} }\). We assume throughout that scalar operations in the underlying field take O(1) time.

Remark 5.5

Fix a basis for each vector space \(\mathbf {A}_v\), so the linear maps \(\mathbf {A}_e:\mathbf {A}_{s(e)} \rightarrow \mathbf {A}_{t(e)}\) can be expressed as matrices. Ordering the vertices and edges of Q arbitrarily, let \(M = M(\mathbf {A}_\bullet )\) be the block matrix whose column blocks are indexed by vertices \(v \in V\), row blocks are indexed by edges \(e \in E\), and whose (ev)-block is

$$\begin{aligned} M_{e,v} := {\left\{ \begin{array}{ll} -\mathbf {A}_e &{} \text {if } v = s(e) \\ \text {Id}_{\mathbf {A}_v} &{} \text {if } v = t(e) \\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

The subspace \(\Gamma (Q;\mathbf {A}_\bullet )\) is the kernel of M, cf. Remark 1.7. Thus, computing a basis for this space naïvely requires Gaussian elimination on the augmented matrix

$$\begin{aligned} M' := \left[ {\text {Id}}_{{\text {Tot}}({\mathbf {A}}_\bullet )} \, \mid \, M^{\mathsf {T}}\right] . \end{aligned}$$

In the worst case, \(M'\) has \(n_Vn_\mathbf {A}\) rows and \(2n_En_\mathbf {A}\) columns. Since we may need up to \(O(n_V^2n_\mathbf {A}^2)\) row operations, and since each such operation incurs a cost of \(O(n_En_\mathbf {A})\), the time complexity is \(O(n_V^2n_En_\mathbf {A}^3)\).
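
For comparison with the algorithms above, here is a hedged numpy sketch of this naïve baseline; it assembles the block matrix M of Remark 5.5 from the QuiverRep container sketched in Sect. 5.2 and reads off \(\ker M\) directly.

```python
import numpy as np
from scipy.linalg import null_space

def sections_naive(rep):
    """Basis for the space of sections, computed as ker M (cf. Remark 5.5)."""
    offset, n = {}, 0
    for v in rep.dims:                          # column offsets into Tot(A)
        offset[v], n = n, n + rep.dims[v]
    rows = sum(rep.dims[v] for _, v, _ in rep.edges)
    M = np.zeros((rows, n))
    r = 0
    for u, v, A in rep.edges:                   # one block row per edge e : u -> v
        M[r:r + rep.dims[v], offset[u]:offset[u] + rep.dims[u]] += -A                   # -A_e at s(e)
        M[r:r + rep.dims[v], offset[v]:offset[v] + rep.dims[v]] += np.eye(rep.dims[v])  # Id at t(e)
        r += rep.dims[v]
    return null_space(M)    # columns form an orthonormal basis of Gamma(Q; A)
```

The blocks are accumulated with `+=` so that a self-loop at v contributes the single block \(\text {Id} - \mathbf {A}_e\), matching the section condition \(\mathbf {A}_e\gamma _v = \gamma _v\).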

Here we establish the following.

Corollary 5.6

The algorithms from Sect. 5.2 invoked using (9) extract a basis for \(\Gamma (Q;\mathbf {A}_\bullet )\) in time

$$\begin{aligned} O\left( n_V(n_V+n_E)n_\mathbf {A}^3\right) . \end{aligned}$$

We prove Corollary 5.6 in three parts. The first step gives the complexity of SCReduce.

Lemma 5.7

If Q is strongly-connected and admits an ear decomposition with \(\ell \) ears, then the subroutine SCReduce has time complexity \(O(n_V+n_E+n_\mathbf {A}^3\ell )\) when called with input \((Q,\mathbf {A}_\bullet )\).

Proof

As described in [6, Exercise 5.18], the cost of building an ear decomposition of Q is \(O(n_V+n_E)\). The for loop spanning lines 3–8 runs \(\ell \) times and, in each iteration of this loop, the computational cost is dominated by the kernel intersection in line 7. Computing this intersection requires Gaussian elimination on a matrix of size at most \(n_\mathbf {A}\times 2n_\mathbf {A}\), which costs \(O(n_\mathbf {A}^3)\). Hence we obtain the complexity bound \(O(n_V+n_E+n_\mathbf {A}^3\ell )\). \(\square \)
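
In code, one convenient (if not the most economical) organisation of such a kernel intersection stacks the maps and computes a single kernel, using \(\ker (f) \cap \ker (g) = \ker \big [{\begin{matrix} f \\ g \end{matrix}}\big ]\) in place of the augmented-matrix bookkeeping of the proof; a hedged numpy sketch:

```python
import numpy as np
from scipy.linalg import null_space

def intersect_kernels(mats):
    """Common kernel of the matrices in `mats`, all with the same column count."""
    return null_space(np.vstack(mats))   # kernel of the stacked matrix

# For instance, the kernels of the differences from the proof of Proposition 5.3:
#   intersect_kernels([f[k] - f[k + 1] for k in range(len(f) - 1)])
```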

We now estimate the complexity of calling AcycReduce with input \((Q,\mathbf {A}_\bullet )\).

Lemma 5.8

The computational complexity of AcycReduce\((Q,\mathbf {A}_\bullet )\) is

$$\begin{aligned} O\left( (n_V^2+n_E)n_\mathbf {A}^3\right) . \end{aligned}$$

Proof

The set of maximal strongly-connected subquivers \(\mathbf{MSC} (Q)\) can be computed in time \(O(n_V+n_E)\) [6, Section 5.2]. Enumerating its elements as \({\left\{ {Q_1, \ldots , Q_s}\right\} }\), for \(s \ge 0\), we note that the for loop spanning lines 2–5 runs s times. For each j in \({\left\{ {1,\ldots ,s}\right\} }\), we let \(n_{V,j}\) and \(n_{E,j}\) denote the number of vertices and edges of \(Q_j\), and let \(\ell _j\) be the number of ears produced when \(Q_j\) is ear-decomposed. We know from Lemma 5.7 that the call to SCReduce in line 3 incurs a cost of \(O (n_{V,j}+n_{E,j}+\ell _jn_\mathbf {A}^3)\). The call to BFSEqualise in line 5 has a worst-case cost of \(O (n_En_\mathbf {A}^3)\), since we must traverse every edge of Q and perform Gaussian elimination on an augmented \(n_\mathbf {A}\times 2n_\mathbf {A}\) matrix to compute the constrained subspace (as in Definition 3.2) at its source vertex. Thus, the j-th iteration of the for loop costs

$$\begin{aligned} O\left( n_{V,j}+n_{E,j}+(n_E+\ell _j)n_\mathbf {A}^3\right) . \end{aligned}$$

Since the subquivers \(Q_j\) are mutually disjoint, we sum the above expression over j in \({\left\{ {1,\ldots ,s}\right\} }\) to obtain the total cost incurred by the for loop

$$\begin{aligned} O\left( n_V+n_E + \left( sn_E + \sum _{j=1}^s \ell _j\right) n_\mathbf {A}^3\right) . \end{aligned}$$

The conclusion follows by discarding small terms and using the fact that s and \(\sum _j\ell _j\) are bounded from above by \(n_V\) and \(n_E\), respectively. \(\square \)

In practice, the runtime of AcycReduce can be improved in the presence of parallel processing, since SCReduce may be called concurrently on the strongly-connected subquivers of Q. It remains to estimate the complexity of invoking ArbReplace on the output \((Q^*,\mathbf {A}^*_\bullet )\) of AcycReduce\((Q,\mathbf {A}_\bullet )\). We know from Definition 3.1 that the vertex set of \(Q^*\) coincides with V, though the edge set \(E^*\) may be strictly contained in E. Moreover, we have \(\dim \mathbf {A}^*_v \le n_\mathbf {A}\) for each vertex v.

Lemma 5.9

The computational complexity of ArbReplace\((Q^*,\mathbf {A}^*_\bullet )\) is \(O\left( n_Vn_En_\mathbf {A}^3\right) \).

Proof

Augmentation, topological sorting, and the construction of a spanning arborescence (from lines 1, 2 and 10, respectively) are all \(O(n_V+n_E)\) operations, so we restrict our focus to the for loop spanning lines 4–9. For each integer \(j \ge 0\), let \(V_j \subset V\) be the (possibly empty) subset of vertices which admit exactly j incoming edges in \(E^*\), and write \(n_{V_j}\) for the cardinality of \(V_j\). Thus, we have

$$\begin{aligned} n_V = \sum _{j \ge 0} n_{V_j} \quad \text { and } \quad n_E \ge \sum _{j \ge 0} j \cdot n_{V_j}, \end{aligned}$$
(10)

where the inequality follows from the fact that the sum of \(j \cdot n_{V_j}\) over \(j \ge 0\) is the cardinality of \(E^* \subset E\). The for loop runs once per vertex of V, and the cost of each iteration is dominated by the equaliser computation in line 6. In the worst case, each equaliser computation requires Gaussian elimination on a matrix with \(n_\mathbf {A}\) rows (for \(\mathbf {A}_v\)) and \(2n_{V_0}n_\mathbf {A}\) columns (for \(\Phi '_v\)). For each vertex v in \(V_j\), there are \((j-1)\) such Gaussian eliminations to perform, so executing the for loop for \(v \in V_j\) incurs a cost of \(O((j-1)~n_{V_0}~n_\mathbf {A}^3)\). Since each \(V_j\) contains \(n_{V_j}\) vertices, the total cost of processing all vertices is given by

$$\begin{aligned} O\left( \sum _{j \ge 0} n_{V_j}~(j-1)~n_{V_0}~n_\mathbf {A}^3\right) . \end{aligned}$$

From (10), we obtain \(n_V \ge n_{V_0}\) and \(n_E \ge \sum _j (j-1)~n_{V_j}\), which concludes the argument. \(\square \)

Proof of Corollary 5.6

Summing the estimates from Lemmas 5.8 and 5.9 gives a total complexity of \(O((n_V^2+n_E+n_Vn_E)n_\mathbf {A}^3)\). The term \(n_En_\mathbf {A}^3\) is dominated by \(n_Vn_En_\mathbf {A}^3\) and may be omitted, which yields the stated bound \(O(n_V(n_V+n_E)n_\mathbf {A}^3)\). \(\square \)

5.4 Examples

We describe two instances where the space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\) arises naturally. The first example is in the representation theory of finite groups.

Example 5.10

The ability to compute sections of quiver representations allows us to recover fixed spaces of group representations. Let G be a finite group and V a finite-dimensional vector space. A representation of G valued in V is a group homomorphism \(\phi : G \rightarrow \text {GL}(V)\). Consider a quiver Q with a single vertex v and one edge e(g) from v to itself for each g in G. The data of our group representation produces a representation \(\mathbf {A}_\bullet \) of Q whose vector space \(\mathbf {A}_v\) is V and whose linear maps \(\mathbf {A}_{e(g)}\) are \(\phi (g):V \rightarrow V\). As in Example 1.3, the space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\) is the fixed space of \(\phi \):

$$\begin{aligned} \Gamma (Q;\mathbf {A}_\bullet ) = {\left\{ {v \in V \mid \phi (g)(v) = v \text { for all } g \in G}\right\} }. \end{aligned}$$
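
Equivalently, the fixed space is the common kernel of the maps \(\phi (g) - \text {id}_V\), which makes it a one-line computation; a hedged sketch for the symmetric group \(S_3\) permuting the coordinates of \(\mathbb {R}^3\), where the fixed space is the line spanned by \((1,1,1)\):

```python
import numpy as np
from itertools import permutations
from scipy.linalg import null_space

# phi(g) for g in S_3, acting on R^3 by permutation matrices
mats = [np.eye(3)[list(p)] for p in permutations(range(3))]

# Gamma(Q; A) = fixed space = common kernel of the maps phi(g) - id
fixed = null_space(np.vstack([m - np.eye(3) for m in mats]))
print(fixed.round(3))   # a single column proportional to (1, 1, 1)
```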

Every right Kan extension problem [37, Chapter X] for functors valued in the category of vector spaces can be solved by computing sections of an appropriate representation of a (possibly infinite) quiver. Our second example involves one such extension problem.

Example 5.11

Computing spaces of sections of quiver representations allows us to construct pushforwards of sheaves on posets. Given an order-preserving map \(f:X \rightarrow Y\) between finite partially ordered sets, one can construct a pair of adjoint functors

$$\begin{aligned} f_*:\mathbf{Sh}(X) \rightarrow \mathbf{Sh}(Y) \quad \text { and } \quad f^*:\mathbf{Sh}(Y) \rightarrow \mathbf{Sh}(X) \end{aligned}$$

between the categories of sheaves (valued in finite-dimensional vector spaces, with respect to the Alexandrov topology) on X and Y, see [10, Sec 5] for details. Whereas the pullback \(f^*\) admits a straightforward definition, describing the pushforward \(f_*\mathscr {S} \in \mathbf{Sh}(Y)\) of a sheaf \(\mathscr {S} \in \mathbf{Sh}(X)\) is more delicate, since it requires computing the categorical limit

$$\begin{aligned} f_*\mathscr {S}(y) = \lim _{f(x) \ge y} \mathscr {S}(x). \end{aligned}$$

Let Q be the quiver with vertex set X and edges \(x \rightarrow x'\) whenever \(x \le x'\). The sheaf \(\mathscr {S}\) induces a representation \(\mathbf {A}_\bullet \) of Q, where \(\mathbf {A}_x\) is the stalk \(\mathscr {S}(x)\) and the edge map \(\mathbf {A}_x \rightarrow \mathbf {A}_{x'}\) is the restriction map \(\mathscr {S}(x \le x')\). Given \(y \in Y\), let \(Q_{\ge y}\) be the restriction of Q to the vertices \({\left\{ {x \in X \mid f(x) \ge y}\right\} }\), and let \(\mathbf {A}^y_\bullet \) denote the restriction of the representation \(\mathbf {A}_\bullet \) to these vertices. The stalks of the desired pushforward coincide with the space of sections

$$\begin{aligned} f_*\mathscr {S}(y) = \Gamma \left( Q_{\ge y};\mathbf {A}^y_\bullet \right) . \end{aligned}$$

6 Principal Components via Optimisation

Here we will define principal components with respect to a quiver representation as solutions to an optimisation problem over the space of sections. To this end, let us first recall the starting point, ordinary principal component analysis (PCA).

Definition 6.1

Let \(D := {\left\{ {y_1, \ldots , y_m}\right\} }\) be a finite collection of mean-centred vectors in \(\mathbb {R}^n\); the sample covariance of D is the \(n \times n\) symmetric matrix

$$\begin{aligned} S:= \frac{1}{m} \sum _{i=1}^m y_i y_i^{\mathsf {T}}, \end{aligned}$$

where \(^{\mathsf {T}}\) indicates transpose. Assuming that the top r eigenvalues \(\lambda _1> \cdots > \lambda _r\) of S are distinct, the r-th principal component \(\mathbf{PC} _r(D)\) of D is the \(\lambda _r\)-eigenspace of S.

Since the r-th principal component is a one-dimensional subspace of \(\mathbb {R}^n\), it is standard practice to represent it by any constituent nonzero vector in \(\mathbf{PC} _r(D)\). Treating the sample covariance matrix as a bilinear form on \(\mathbb {R}^n\) allows us to interpret principal components in terms of the following variance maximisation problem:

$$\begin{aligned} \max _{X} \mathrm{tr}( X^{\mathsf {T}}S X ) \text { subject to } X^{\mathsf {T}}X = \text {id}_r. \end{aligned}$$
(11)

Here \(\mathrm{tr}\) indicates trace and \(\text {id}_r\) is the \(r \times r\) identity matrix. The columns of an optimal \(n\times r\) matrix X form an orthonormal basis for the space \(\mathbf{PC} _{\le r}(D)\) spanned by the top r principal components, and solving (11) for increasing r gives the individual principal components in descending order.
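
In code, (11) reduces to a symmetric eigendecomposition of S; a minimal numpy sketch (the function name is ours):

```python
import numpy as np

def pca_top_r(D, r):
    """Solve (11): the columns of the returned X span PC_{<= r}(D)."""
    Y = np.asarray(D)                  # rows are the mean-centred samples y_i
    S = Y.T @ Y / len(Y)               # sample covariance
    _, vecs = np.linalg.eigh(S)        # eigenvalues in ascending order
    return vecs[:, ::-1][:, :r]        # top-r eigenvectors, eigenvalues descending
```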

6.1 Principal Components Along Quiver Representations

Consider a quiver Q and fix a representation \(\mathbf {A}_\bullet \) of Q valued in real vector spaces. Henceforth we will fix an isomorphism \(\mathbb {R}^{\dim \mathbf {A}_v} {\mathop {\longrightarrow }\limits ^{\simeq }} \mathbf {A}_v\) for each vertex v in V, which allows us to impose (once and for all) an inner product structure on each \(\mathbf {A}_v\). Writing n for the dimension of \(\text {Tot}(\mathbf {A}_\bullet )\),

$$\begin{aligned} n = \sum _{v \in V} \dim \mathbf {A}_v, \end{aligned}$$

we inherit an isomorphism \(\mathbb {R}^n {\mathop {\longrightarrow }\limits ^{\simeq }} \text {Tot}(\mathbf {A}_\bullet )\) and a concomitant inner product structure on the total space of \(\mathbf {A}_\bullet \). Making choices of ear decompositions and spanning arborescences for Q produces a map \(F:\mathbb {R}^d \rightarrow \text {Tot}(\mathbf {A}_\bullet )\), described in (8), where \(d = \dim \Gamma (Q;\mathbf {A}_\bullet )\). Expressed in terms of the chosen isomorphisms, F becomes a full-rank \(n \times d\) matrix whose image is an embedded copy of \(\Gamma (Q;\mathbf {A}_\bullet )\) inside \(\mathbb {R}^n\). We are therefore able to define principal components relative to this embedding F.

Definition 6.2

Given any mean-centred finite subset D of \(\mathbb {R}^n \simeq \text {Tot}(\mathbf {A}_\bullet )\), let S be the sample covariance (as in Definition 6.1). For each \(r \le d\), consider the optimisation problem over all \(n \times r\) matrices \(X = [x_1 ~ x_2 ~ \cdots ~ x_r]\) prescribed by

$$\begin{aligned} \max _{X} \mathrm{tr}( X^{\mathsf {T}}S X ) \quad \text { subject to } \quad \begin{array}{cc} X^{\mathsf {T}}X = \text {id}_r \text { and } x_1,\ldots ,x_r \in \Gamma (Q;\mathbf {A}_\bullet ). \end{array} \end{aligned}$$
(12)

The space of top r principal components of D along \(\mathbf {A}_\bullet \) is the subspace \(\mathbf{PC} _{\le r}(D;\mathbf {A}_\bullet )\) of \(\mathbb {R}^n\) determined by the column span

$$\begin{aligned} \mathbf{PC} _{\le r}(D;\mathbf {A}_\bullet ) = \text {span}{\left\{ {x_1, \ldots , x_r}\right\} } \end{aligned}$$

of an optimal matrix X.

It is possible to construct a unique optimal solution \(X_*\) to (12) by proceeding one column at a time and requiring each column to be orthogonal to all of the preceding columns. The r-th principal component of D along \(\mathbf {A}_\bullet \) is the subspace \(\mathbf{PC} _r(D;\mathbf {A}_\bullet )\) spanned by the r-th column of \(X_*\). In sharp contrast to the ordinary principal components from Definition 6.1, these principal components along \(\mathbf {A}_\bullet \) need not be eigenvectors of the covariance matrix S. There are, however, two special cases where ordinary principal components coincide with their quiver-compatible avatars.

Proposition 6.3

Assume that one of the two conditions below holds:

  1. (1)

    either \(D \subset \mathbb {R}^n\) lies entirely in the subspace \(\Gamma (Q; \mathbf {A}_\bullet )\), or

  2. (2)

    the edge set of Q is empty.

Then \(\mathbf{PC}_r(D) = \mathbf{PC}_r(D;\mathbf {A}_\bullet )\) for every \(r \le d\).

Proof

If \(D \subset \Gamma = \Gamma (Q;\mathbf {A}_\bullet )\), then the sample covariance S restricts to an endomorphism of \(\Gamma \) and vanishes on the orthogonal complement of \(\Gamma \). Consequently, for all \(r \le d\), the columns of any matrix X that maximises (11) must also lie in \(\Gamma \), so such an X also maximises (12). Finally, if there are no edges in Q then \(\Gamma \) equals all of \(\mathbb {R}^n\), so (12) reduces to (11). \(\square \)

In its most general form, linearly constrained PCA can be described as follows. The space of top r principal components of \(D \subset \mathbb {R}^n\) (with sample covariance S), constrained by some \(n \times c\) matrix W, is the span of the columns of an optimal \(n \times r\) matrix X in

$$\begin{aligned} \max _{X} \mathrm{tr}( X^{\mathsf {T}}S X ) \quad \text { subject to } \quad \begin{array}{ll} X^{\mathsf {T}}X = \text {id}_r \text { and } W^{\mathsf {T}}X = 0 . \end{array} \end{aligned}$$

This formulation follows from [15, Equation 7.4], and it is usually assumed that \(W^{\mathsf {T}}W = \text {id}_c\). Evidently, finding principal components along a quiver representation is a special instance of constrained PCA, provided we have access to an orthonormal basis for the orthogonal complement of \(\Gamma (Q;\mathbf {A}_\bullet )\) in \(\text {Tot}(\mathbf {A}_\bullet )\).
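
Concretely, a valid choice of W is an orthonormal basis of that orthogonal complement, which the embedding F hands to us as a kernel computation; a hedged sketch:

```python
import numpy as np
from scipy.linalg import null_space

def constraint_matrix(F):
    """Orthonormal W with W^T X = 0 cutting out exactly the column span of F.

    The columns of F span Gamma(Q; A) inside Tot(A), so the orthogonal
    complement is ker(F^T); null_space returns an orthonormal basis, hence
    W^T W = id_c holds automatically.
    """
    return null_space(F.T)    # an n x (n - d) matrix when F has full rank d
```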

6.2 Alternate Perspectives

Here we define two more optimisation problems related to (12); as before, both will require a fixed choice of embedding \(F : \mathbb {R}^d \rightarrow \text {Tot}(\mathbf {A}_\bullet )\) of \(\Gamma (Q;\mathbf {A}_\bullet )\), where the map F is viewed as an \(n \times d\) matrix. Here is the first one, which is defined over the space of \(d \times r\) matrices Y:

$$\begin{aligned} \max _{Y} \mathrm{tr}( Y^{\mathsf {T}}F^{\mathsf {T}}S F Y) \quad \text {subject to} \quad Y^{\mathsf {T}}(F^{\mathsf {T}}F) Y = \text {id}_r. \end{aligned}$$
(13)

The \(n \times n\) matrix \(B:= FF^{\mathsf {T}}\) serves as a (not necessarily orthogonal) projection onto the image of F. Now, we set \(S_B:= BSB\) and consider another optimisation problem defined over \(n \times r\) matrices Z:

$$\begin{aligned} \max _{Z} \mathrm{tr}( Z^{\mathsf {T}}S_B Z) \quad \text {subject to} \quad Z^{\mathsf {T}}(B^2) Z = \text {id}_r. \end{aligned}$$
(14)

Although the r columns of Z may be arbitrary \(B^2\)-orthonormal vectors in \(\text {Tot}(\mathbf {A}_\bullet )\), for any \(r \le d\) the optimal directions lie in \(\Gamma (Q;\mathbf {A}_\bullet )\), since \(S_B\) restricts to an endomorphism of \(\Gamma (Q;\mathbf {A}_\bullet )\). Our next result establishes the equivalence of these two alternate perspectives with the original one from Definition 6.2.

Proposition 6.4

The maximum values of the three optimisation problems (12), (13), and (14) are all the same. Moreover, a matrix X maximises (12) if and only if matrix Y maximises (13) if and only if matrix Z maximises (14), where

$$\begin{aligned} X = FY = BZ. \end{aligned}$$

Proof

We first show that Z maximises (14) if and only if BZ maximises (12). Since \(\Gamma = \Gamma (Q;\mathbf {A}_\bullet )\) is the image of B, the columns of BZ all lie in \(\Gamma \). Since B is symmetric, we also have \((BZ)^{\mathsf {T}}(BZ) = Z^{\mathsf {T}}(B^2) Z = \text {id}_r\), so BZ has orthonormal columns and satisfies all the constraints of (12). Moreover, we have

$$\begin{aligned} \mathrm{tr}( Z^{\mathsf {T}}S_B Z)&= \mathrm{tr}(Z^{\mathsf {T}}\cdot BSB \cdot Z)\\&= \mathrm{tr}\left( (BZ)^{\mathsf {T}}S (BZ) \right) . \end{aligned}$$

Conversely, given some X maximising (12), its columns \(x_i\) are orthonormal vectors in \(\Gamma \), hence \(x_i = Bz_i\) for some \(z_i \in \text {Tot}(\mathbf {A}_\bullet )\). Letting Z be the matrix with columns \(z_i\) gives a feasible point of (14) attaining the same objective value (as confirmed by the trace calculation above). This gives the desired equivalence of (12) and (14). Turning now to (13), assume again that Z maximises (14) and let \(Y = F^{\mathsf {T}}Z\), so

$$\begin{aligned} (FY)^{\mathsf {T}}(FY) = (BZ)^{\mathsf {T}}(BZ) = \text {id}_r. \end{aligned}$$

Computing the relevant trace for (13) gives

$$\begin{aligned} \mathrm{tr}( Y^{\mathsf {T}}F^{\mathsf {T}}S F Y)&= \mathrm{tr}( Z^{\mathsf {T}}F F^{\mathsf {T}}S F F^{\mathsf {T}}Z) \\&= \mathrm{tr}( Z^{\mathsf {T}}B SB Z) \\&= \mathrm{tr}(Z^{\mathsf {T}}S_B Z). \end{aligned}$$

Thus, the value of the objective function of (14) at Z equals the value of the objective function of (13) at \(Y = F^{\mathsf {T}}Z\). Conversely, given some Y maximising (13), the matrix FY has orthonormal columns lying in \(\Gamma \), hence belongs to the feasible set of (12), and the objective value of \(X = FY\) in (12) equals that of Y in (13). \(\square \)

We consider (12) an implicit version of the optimisation problem to determine principal components along quiver representations, while (13) and (14) are its parametrised and projected variants. Thanks to the preceding result, it becomes possible to freely translate between these three perspectives. In practice, the dimension d of \(\Gamma (Q;\mathbf {A}_\bullet )\) is much smaller than the ambient dimension n of \(\text {Tot}(\mathbf {A}_\bullet )\), so one might wish to work with the optimisation problem (13) in this smaller space. An algorithmic approach to (14) that similarly reduces to a smaller space has been studied in [21].

Remark 6.5

The argument invoked in the proof of Proposition 6.4 simplifies considerably if the \(n \times d\) matrix F has orthonormal columns. In this case, the matrices Y in (13) satisfy \(Y^{\mathsf {T}}Y = \text {id}_r\). Moreover, the matrix Z that maximises (14) satisfies \(Z^{\mathsf {T}}Z = \text {id}_r\). This is because \(B = FF^{\mathsf {T}}\) is an orthogonal projection onto \(\Gamma \), so \(v \in \Gamma \) if and only if \(Bv =v\). Since the columns of Z are in \(\Gamma \) at the optimum, we have \(BZ = Z\) and hence \(\text {id}_r = Z^{\mathsf {T}}B^{\mathsf {T}}B Z = Z^{\mathsf {T}}Z\), as claimed.

6.3 Examples

We conclude this section with some examples to illustrate principal components along quiver representations. As for usual principal components, they give a low-dimensional projection of the data, with interpretable coordinates, in which features may be found. We first consider a statistically motivated example.

Example 6.6

Consider the quiver representation \(\mathbb {R}^2 \leftarrow \mathbb {R}^4 \rightarrow \mathbb {R}^2\) with arrow maps

$$\begin{aligned} A = \begin{bmatrix} 1 &{}\quad 1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 1 &{}\quad 1 \end{bmatrix} \quad \text {and} \quad B = \begin{bmatrix} 1 &{} \quad 0 &{} \quad 1 &{} \quad 0 \\ 0 &{}\quad 1 &{}\quad 0 &{} \quad 1 \end{bmatrix} , \end{aligned}$$

cf. Example 1.5. The space of sections is the image of \(\mathbb {R}^4\) under the flow map, i.e. the points

$$\begin{aligned} x = \begin{bmatrix} x_{11}&x_{12}&x_{21}&x_{22}&x_{1+}&x_{2+}&x_{+1}&x_{+2} \end{bmatrix} \in \text {Tot}(\mathbf {A}_\bullet ) \simeq \mathbb {R}^8, \end{aligned}$$

where \(+\) denotes summing over an index, e.g. \(x_{1+} = x_{11} + x_{12}\). The first four coordinates \(x_{ij}\) are joint observations, and the last four coordinates \(x_{i+}\) and \(x_{+j}\) are two pairs of marginal observations. For example, if \(x_{ij}\) is a gene expression measurement for gene i in cell type j, then \(x_{i+}\) sums over the two cell types, while \(x_{+j}\) sums over the two genes.

The principal components along the quiver representation are directions in \(\mathbb {R}^8\) that maximise the variance in the data, subject to taking the form of a joint observation and its two marginals. Note that one could also consider the principal components of the joint observations in \(\mathbb {R}^4\), and take their image under the flow map F to give directions in \(\mathbb {R}^8\). These directions will in general not coincide with the principal components along the quiver representation, cf. the last paragraph of Example 7.5.
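
A hedged numerical sketch of this example: the flow map F stacks the identity on the joint observations above the two marginalisation matrices, and its columns span the space of sections inside \(\mathbb {R}^8\).

```python
import numpy as np

A = np.array([[1., 1., 0., 0.],     # row sums:    x -> (x_{1+}, x_{2+})
              [0., 0., 1., 1.]])
B = np.array([[1., 0., 1., 0.],     # column sums: x -> (x_{+1}, x_{+2})
              [0., 1., 0., 1.]])

# flow map F : R^4 -> Tot(A) ~ R^8 in the coordinate order
# (x_{11}, x_{12}, x_{21}, x_{22}, x_{1+}, x_{2+}, x_{+1}, x_{+2})
F = np.vstack([np.eye(4), A, B])
assert np.linalg.matrix_rank(F) == 4   # sections form a 4-dimensional subspace
```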

We next consider a biological setting, involving gene expression measurements.

Example 6.7

The development of methods that relate bulk and single-cell transcriptomics data is an active area of study, see e.g. [31, Figure 1]. Bulk RNA-seq gives an average gene expression across the cells in a sample. Single-cell RNA-seq gives measurements for each cell. The cells can then be clustered to give a gene expression value for each cell type.

Fixing g genes and c cell types, we consider the quiver representation \( \mathbb {R}^{g c} \longrightarrow \mathbb {R}^g \). The linear map on the arrow is \(I \otimes a \in \mathbb {R}^{g \times gc}\), where the c cell types are assumed to be present in proportions \(a = (a_1, \ldots , a_c) \in \mathbb {R}^{1 \times c}\). The sample data lies in \(\text {Tot}(\mathbf {A}_\bullet ) = \mathbb {R}^{g c} \times \mathbb {R}^g\). For a sample \((v, w) \in \mathbb {R}^{gc} \times \mathbb {R}^g\), the entry \(v_{ij}\) is the gene expression of gene i in cell type j, while \(w_k\) is the bulk measurement for gene k. The data will in general not lie in the space of sections, on account of the different measurement techniques, as well as variation in the proportions of cell types.

The principal components along the quiver representation are directions in \(\mathbb {R}^{gc} \times \mathbb {R}^g\) that exhibit high variance in the data while being consistent between the single-cell and bulk measurements. Given a coarser assignment of cells into types, this idea extends to the quiver representation \(\mathbb {R}^{gc} \rightarrow \mathbb {R}^{gt} \rightarrow \mathbb {R}^g\), where \(t < c\) is the number of cell types in the coarser clustering, see e.g. [51, Figures 3 and 4]. If the two assignments of cells into types are not compatible, we instead consider the quiver representation \(\mathbb {R}^{gc} \rightarrow \mathbb {R}^{g} \leftarrow \mathbb {R}^{gt}\).
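
The edge map here is easy to assemble explicitly; a hedged sketch with made-up proportions a:

```python
import numpy as np

g, c = 5, 3                               # genes and cell types
a = np.array([[0.5, 0.3, 0.2]])           # assumed cell-type proportions, shape (1, c)
edge_map = np.kron(np.eye(g), a)          # I (x) a, of shape (g, g*c)

# Applied to v with entries v_{ij} stacked gene-by-gene, row i returns the
# proportion-weighted mixture sum_j a_j v_{ij}: the predicted bulk value for gene i.
assert edge_map.shape == (g, g * c)
```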

7 Principal Components as Generalised Eigenvectors

We have already noted that—aside from some very special cases as in Proposition 6.3—the principal components \(\mathbf{PC} _r(D;\mathbf {A}_\bullet )\) of Definition 6.2 are not eigenvectors of the sample covariance S. Here we remedy this defect by providing a spectral interpretation for \(\mathbf{PC} _r(D;\mathbf {A}_\bullet )\). All scalars, vectors and matrices described below live over the field of real numbers.

Definition 7.1

Fix two identically sized square matrices A and B. The generalised eigenvalues of the matrix pencil \(A - \lambda B\) are the solutions \(\lambda \) to \(\det (A - \lambda B) = 0\). We call a nonzero vector x with \(Ax = \lambda Bx\) a generalised eigenvector, with \(\lambda \) its generalised eigenvalue.

Our main tool in the quest to interpret quiver principal components as generalised eigenvectors is the generalised singular value decomposition (GSVD) [49, Theorem 2].

Theorem 7.2

[GSVD] Given positive integers \(a \ge b \ge c\), fix an \((a \times c)\) matrix A and a \((b \times c)\) matrix B. There exist

  1. (1)

    orthogonal matrices \(W_A\) and \(W_B\) of size \(a \times a\) and \(b \times b\), respectively,

  2. (2)

    (rectangular) diagonal matrices \(\Delta \) and \(\Sigma \) of size \(a \times c\) and \(b \times c\), respectively, and

  3. (3)

    a \(c \times c\) invertible matrix G,

satisfying both

$$\begin{aligned} A = W_A \Delta G \qquad \text {and} \qquad B = W_B \Sigma G. \end{aligned}$$

The matrices \(W_A, W_B\) and G are not uniquely determined, but the ratios \(\delta _i^2/\sigma _i^2\) of the squares of the diagonal entries of \(\Delta \) and \(\Sigma \) are completely specified (up to reordering) by A and B. We note en passant that a different generalisation of the singular value decomposition [49, Theorem 3] also appears in the context of constrained PCA, and that a discussion of GSVD naming conventions can be found in [45, Section 5.5]. Returning to the setting of interest, we fix a representation \(\mathbf {A}_\bullet \) of a quiver Q and select a full-rank \(n \times d\) matrix \(F:\mathbb {R}^d \rightarrow \text {Tot}(\mathbf {A}_\bullet )\) whose image is \(\Gamma (Q;\mathbf {A}_\bullet )\). The following result is Theorem (B) from the Introduction.

Theorem 7.3

Let S be the sample covariance of a sufficiently generic mean-centred subset \(D = {\left\{ {y_1,\ldots ,y_m}\right\} } \subset \mathbb {R}^n\) of cardinality \(m \ge n\). For each \(r \le d\), the r-th principal component \(\mathbf{PC} _r(D;\mathbf {A}_\bullet )\) is spanned by \(Fu_r\), where \(u_r\) is a generalised eigenvector of the matrix pencil \(F^{\mathsf {T}}S F - \lambda (F^{\mathsf {T}}F)\) corresponding to its r-th largest generalised eigenvalue.

Proof

Let M denote the \(m \times n\) matrix whose i-th row is the normalised vector \(y_i/\sqrt{m}\), so that the sample covariance satisfies \(S = M^{\mathsf {T}}M\). Noting that \(m \ge n \ge d\), we apply the GSVD from Theorem 7.2 to the \(m \times d\) matrix \(A = MF\) and the \(n \times d\) matrix \(B = F\). This produces factorisations

$$\begin{aligned} MF = W_A \Delta G \qquad \text {and} \qquad F = W_B \Sigma G \end{aligned}$$

with orthogonal \(W_A, W_B\), invertible G, and diagonal \(\Delta ,\Sigma \). Since D is generic and F has full rank, we may safely assume that the diagonal entries of \(\Delta \) and \(\Sigma \) are nonzero. By orthogonality of the two W-matrices, we obtain two new identities

$$\begin{aligned} (MF)^{\mathsf {T}}(MF) = G^{\mathsf {T}}\Delta ^2G \qquad \text {and} \qquad F^{\mathsf {T}}F = G^{\mathsf {T}}\Sigma ^2G. \end{aligned}$$
(15)

Since \(S = M^{\mathsf {T}}M\) by design, the first identity reduces to \(F^{\mathsf {T}}S F = G^{\mathsf {T}}\Delta ^2G\). Let us write \({\left\{ {\delta _1,\ldots ,\delta _d}\right\} }\) and \({\left\{ {\sigma _1,\ldots ,\sigma _d}\right\} }\) for the (necessarily nonzero) diagonal entries of \(\Delta \) and \(\Sigma \), respectively, and denote by \(g_i\) the i-th column of \(G^{-1}\). It follows from (15) that \(g_i\) is a generalised eigenvector for the \(d \times d\) matrix pencil \((F^{\mathsf {T}}S F) - \lambda \cdot (F^{\mathsf {T}}F)\), corresponding to the generalised eigenvalue \(\lambda _i := \nicefrac {\delta _i^2}{\sigma _i^2}\). In other words, we have

$$\begin{aligned} (F^{\mathsf {T}}S F) g_i = \lambda _i \cdot (F^{\mathsf {T}}F) g_i. \end{aligned}$$
(16)

The top \(d \times d\) block \(\Sigma _d\) of \(\Sigma \) is invertible because its diagonal has nonzero entries. Since G is also invertible, the product \(\Sigma _dG\) permutes the set of \(d \times r\) matrices via \(Y \mapsto Y_\circ = \Sigma _dGY\), which allows us to re-express the optimisation (13) in a particularly convenient form. To this end, we calculate:

$$\begin{aligned} Y^{\mathsf {T}}(F^{\mathsf {T}}S F) Y&= Y^{\mathsf {T}}(G^{\mathsf {T}}\Delta ^2G) Y&\text {by}~ (15) \\&= (G^{-1}\Sigma _d^{-1}Y_\circ )^{\mathsf {T}}(G^{\mathsf {T}}\Delta ^2G) (G^{-1}\Sigma _d^{-1}Y_\circ )&\text {since }Y_\circ = \Sigma _dGY \\&= Y_\circ ^{\mathsf {T}}~\Sigma _d^{-1}\Delta ^2\Sigma _d^{-1}~Y_\circ&\text {after two cancellations}. \end{aligned}$$

Now the intermediate product \(\nabla := \Sigma _d^{-1}\Delta ^2\Sigma _d^{-1}\) is a \(d \times d\) diagonal matrix whose i-th diagonal entry is \(\lambda _i = \nicefrac {\delta ^2_i}{\sigma ^2_i}\). Reordering basis vectors if necessary, we can assume without loss of generality that \(\lambda _1> \cdots > \lambda _d\). The change of variables \(Y \mapsto Y_\circ \) transforms the optimisation problem from (13) into

$$\begin{aligned} \max _{Y_\circ } \mathrm{tr}( Y_\circ ^{\mathsf {T}}\nabla Y_\circ ) \quad \text {subject to} \quad Y_\circ ^{\mathsf {T}}Y_\circ = \text {id}_r. \end{aligned}$$

This is the ordinary PCA optimisation (11), which generically admits a unique solution \(Y_*\) obtained by successively increasing r. Since \(\nabla \) is diagonal, the i-th column of \(Y_*\) is the i-th elementary basis vector. Thus, the columns \({\left\{ {u_1, \ldots , u_r}\right\} }\) of \(U = G^{-1}\Sigma _d^{-1}Y_*\) lie in the directions of the corresponding columns of \(G^{-1}\). By (16), these columns are generalised eigenvectors associated to the r largest generalised eigenvalues of our matrix pencil. Finally, applying F to U gives the principal components along the quiver representation as in Proposition 6.4. \(\square \)

It follows that the top principal component is Fu, where u maximises the Rayleigh quotient

$$\begin{aligned} \frac{u^{\mathsf {T}}\! (F^{\mathsf {T}}\! S F) u}{u^{\mathsf {T}}\! (F^{\mathsf {T}}F)u}, \end{aligned}$$
(17)

but in general for \(r>1\) the optimisation (13) is not equivalent to a single trace ratio problem (see [40]).

Remark 7.4

Since the embedding \(F:\mathbb {R}^d \rightarrow \text {Tot}(\mathbf {A}_\bullet )\) has rank d, the \(d \times d\) matrix \(F^{\mathsf {T}}F\) is invertible. We can therefore convert the generalised eigenproblem of Theorem 7.3 into the usual eigenvector problem \(F^+ SF x = \lambda x\), where \(F^+ = (F^{\mathsf {T}}F)^{-1} F^{\mathsf {T}}\) is the pseudo-inverse. However, as explained in [21, Section 4], it is often preferable to work with the generalised eigenvalue problem, as the matrix \(F^+ SF\) may not be symmetric. Moreover, depending on the condition number of \(F^{\mathsf {T}}F\), the conversion may also be numerically less stable.
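
In practice, Theorem 7.3 is a few lines of scipy, since scipy.linalg.eigh accepts a symmetric-definite matrix pencil directly; a hedged sketch (the function name is ours):

```python
import numpy as np
from scipy.linalg import eigh

def quiver_pca(D, F, r):
    """Top-r principal components along the representation embedded by F."""
    Y = np.asarray(D)                    # rows are the mean-centred samples
    S = Y.T @ Y / len(Y)                 # sample covariance on Tot(A)
    # Solve (F^T S F) u = lambda (F^T F) u; F^T F is positive definite because
    # F has full rank, and eigh returns the eigenvalues in ascending order.
    _, U = eigh(F.T @ S @ F, F.T @ F)
    U = U[:, ::-1][:, :r]                # eigenvectors of the r largest eigenvalues
    return F @ U                         # columns span PC_{<= r}(D; A)
```

Since eigh normalises its eigenvectors so that \(U^{\mathsf {T}}(F^{\mathsf {T}}F)U = \text {id}_r\), the returned columns automatically satisfy the orthonormality constraint of (12).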

Example 7.5

Consider the quiver

figure j

with representation:

figure k

Writing \(n=p+q\) for the dimension of the total space, the \(n \times n\) sample covariance S of some \(D \subset \mathbb {R}^n\) and the embedding \(F:\mathbb {R}^{p} \rightarrow \mathbb {R}^n\) can be written as

$$\begin{aligned} S = \begin{bmatrix} S_{uu} &{}\quad S_{uv} \\ S_{vu} &{}\quad S_{vv} \end{bmatrix}, \qquad F = \begin{bmatrix} \text {id}_p \\ J \end{bmatrix}, \end{aligned}$$

where \(S_{vu} = S_{uv}^{\mathsf {T}}\). Theorem 7.3 shows that the principal components are obtained by applying F to the generalised eigenvectors of the matrix pencil \(A-\lambda B\) spanned by

$$\begin{aligned} A = S_{uu} + J^{\mathsf {T}}S_{vu} + S_{uv}J + J^{\mathsf {T}}S_{vv}J, \qquad B = \text {id}_p + J^{\mathsf {T}}J . \end{aligned}$$

In the special case where D lies in the image of F, we have

$$\begin{aligned} S = \begin{bmatrix} \text {id}_p \\ J \end{bmatrix} S_{uu} \begin{bmatrix} \text {id}_p&J^{\mathsf {T}}\end{bmatrix} = \begin{bmatrix} S_{uu} &{} \quad S_{uu} J^{\mathsf {T}}\\ J S_{uu} &{}\quad J S_{uu} J^{\mathsf {T}}\end{bmatrix}, \end{aligned}$$

and the matrix pencil is spanned by

$$\begin{aligned} A = S_{uu} + J^{\mathsf {T}}J S_{uu} + S_{uu} J^{\mathsf {T}}J + J^{\mathsf {T}}J S_{uu} J^{\mathsf {T}}J, \qquad B = \text {id}_p + J^{\mathsf {T}}J. \end{aligned}$$

If, in addition, \(J^{\mathsf {T}}J\) equals \(\eta \text {id}_p\) for some scalar \(\eta \), then this specialises further to give the matrix pencil spanned by \(A = (1 + 2 \eta + \eta ^2)S_{uu}\) and \(B = (1 + \eta )\text {id}_p\). Now the principal components along the quiver representation are given by \(F\xi \), where \(\xi \) are the usual principal components of D restricted to the vector space \(\mathbb {R}^p\) on the first vertex of the quiver.

8 Learning Quiver Representations

We conclude this paper with a discussion focused on the problem of learning quiver representations from observed data. Fix a quiver \(Q = (s,t:E \rightarrow V)\), and assume that we have full knowledge of the real vector spaces \({\left\{ {\mathbf {A}_v \mid v \in V}\right\} }\) assigned by some Q-representation \(\mathbf {A}_\bullet \) to all the vertices. However, none of the linear maps \(\mathbf {A}_e : \mathbf {A}_{s(e)} \rightarrow \mathbf {A}_{t(e)}\) are known. Instead, we are given access to mean-centred data \({\left\{ {y_1, \ldots , y_m}\right\} }\), where each \(y_i\) is a vector in the total space \(\text {Tot}(\mathbf {A}_\bullet ) \simeq \mathbb {R}^n\). Our task is to determine the \(\mathbf {A}_e\) maps that best fit the available data; here we will show how in special cases this task reduces to well-studied problems. It will be convenient to define, for each vertex v, the \(m \times \dim \mathbf {A}_v\) matrix \(Y_v\) whose i-th row is the part of \(y_i\) that lies in \(\mathbf {A}_v\).

Example 8.1

Consider the quiver

figure l

with representation

figure m

with matrix \(\mathbf {A}_e\) unknown. Given data \(y_i = (y_{i,u}, y_{i,v}) \in \mathbf {A}_u \times \mathbf {A}_v\) for \(i \in {\left\{ {1, \ldots , m}\right\} }\), minimising the sum over i of the squared Euclidean distances between \(y_{i,v}\) and \(\mathbf {A}_e y_{i,u}\) gives the least squares optimisation problem

$$\begin{aligned} \min _{\mathbf {A}_e} \Vert Y_v - Y_u \mathbf {A}_e^{\mathsf {T}}\Vert . \end{aligned}$$

Thus, the optimal estimate for \(\mathbf {A}_e^{\mathsf {T}}\) is \((Y_u)^+ Y_v\), where \((Y_u)^+\) indicates the Moore–Penrose inverse of \(Y_u\).
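
In numpy, with the matrices \(Y_u\) and \(Y_v\) holding the paired samples as rows, the estimate is a one-liner; a hedged sketch:

```python
import numpy as np

def learn_edge_map(Y_u, Y_v):
    """Least-squares estimate of A_e from paired rows of Y_u and Y_v."""
    # optimal A_e^T = (Y_u)^+ Y_v; np.linalg.lstsq(Y_u, Y_v)[0] gives the same
    # minimum-norm solution via SVD
    return (np.linalg.pinv(Y_u) @ Y_v).T
```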

The preceding example can equivalently be viewed as training (that is, learning the \(\dim (\mathbf {A}_v) \times \dim (\mathbf {A}_u)\) matrix of weights in) a linear neural network with full bipartite connections between a single input layer and a single output layer:

figure n

The principal components along the quiver representation are then pairs of vectors in the input space \(\mathbf {A}_u\) and the output space \(\mathbf {A}_v\) that are compatible with the learned weights on the edges and along which the data exhibits high variance.

Remark 8.2

More generally, a linear neural network with k layers corresponds to learning a quiver representation of the path quiver with k edges \(\bullet \longrightarrow \bullet \longrightarrow \cdots \longrightarrow \bullet \).

Each vertex v is replaced by \(\dim (\mathbf {A}_v)\) scalar nodes, with full bipartite connections between nodes in adjacent layers. A more general architecture could involve other quivers. For example, loops arise from lateral interactions [5, Figure 5]. The setting of learning parameters in linear neural networks with two layers is itself closely connected to principal component analysis [4].

One way to extend the above to more general quivers is to learn the map on each edge e independently, which amounts to minimising the objective function:

$$\begin{aligned} \sum _{e \in E} \left( \Vert Y_{t(e)} - Y_{s(e)} \mathbf {A}_e^{\mathsf {T}}\Vert ^2 \right) . \end{aligned}$$
(18)

Now the estimate for each edge map \(\mathbf {A}_e\) is given by Example 8.1. If the quiver Q is an arborescence, then the optimisation (18) falls into the setting of a Gaussian graphical model [36, 44] associated to a certain directed acyclic graph with \(\dim \text {Tot}(\mathbf {A}_\bullet )\) vertices, as we now describe.

Definition 8.3

Let \(\delta :V \rightarrow \mathbb {Z}_{\ge 0}\) be a function from the vertices of an arborescence Q to the non-negative integers. The \(\delta \)-blowup of Q is the quiver \(Q_\delta \) where each \(v \in V\) is replaced by \(\delta (v)\) vertices, and each edge \(e \in E\) is replaced by a complete directed bipartite graph whose edges go from the \(\delta (s(e))\) vertices replacing s(e) to the \(\delta (t(e))\) vertices replacing t(e).

The directed acyclic graph of interest to us here is the \(\delta \)-blowup of the arborescence Q where \(\delta (v) = \dim \mathbf {A}_v\). We denote this blowup by \(Q_{\dim (\mathbf {A}_\bullet )}\). For instance, if Q is the arborescence on the left and \(\mathbf {A}_\bullet \) is the representation (known only on the vertices) depicted in the middle, then the blowup \(Q_{\dim (\mathbf {A}_\bullet )}\) is shown to the right.

figure o

The entries of the unknown matrices \(\mathbf {A}_e\) become unknown scalar weights on the edges of \(Q_{\dim (\mathbf {A}_\bullet )}\). Maximum likelihood estimation in the Gaussian graphical model learns the weights on these edges by minimising least squares error. Since this is equivalent to (18), it gives an identical estimate for the unknown maps in the quiver representation.
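
A hedged sketch of Definition 8.3 as a plain edge-list computation:

```python
def blowup(edges, delta):
    """The delta-blowup: vertex v becomes nodes (v, 0), ..., (v, delta[v] - 1),
    and each edge u -> v becomes a complete directed bipartite graph."""
    return [((u, i), (v, j))
            for u, v in edges
            for i in range(delta[u])
            for j in range(delta[v])]

# e.g. blowup([('u', 'v')], {'u': 2, 'v': 3}) yields the 6 = 2 * 3 scalar edges
# whose unknown weights are the entries of the matrix A_e.
```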

Although Definition 8.3 extends verbatim to the case where Q is not an arborescence, the maximum likelihood estimation strategy described above is restricted to the setting of an arborescence. This is because the weights of incoming edges at a vertex of the directed acyclic graph are summed over in a graphical model [44, Equation (13.2.3)]. By comparison, in the quiver setting (18) does not sum contributions of incoming edges from different vertices of the quiver. Thus the above strategy only works when each vertex in the quiver has at most one incoming edge.

The local assumption governing the choice of objective function in (18) is that the maps \(\mathbf {A}_e\) can be learned independently of one another; this does not take into account the goodness of fit of data along longer paths in the quiver. Given such a path p, one may wish to minimise the distance between \(y_{i,t(p)}\) and \(\mathbf {A}_p y_{i,s(p)}\). This yields an immediate generalisation of the objective function in (18), where one sums the contributions of each path in Q, rather than just over each edge. For acyclic quivers, such an optimisation can be approached by using a suitable partial order on edges, but it is more complicated for quivers with cycles. We defer a more general study of learning maps in quiver representations to future work.

Our final example is an illustration of finding principal components along a learned quiver representation. This combines parameter estimation with principal component analysis, as is also seen in [38, 47, 50].

Example 8.4

Consider once again the quiver with one edge \(e: u \rightarrow v\) and representation \(\mathbb {R}^p \rightarrow \mathbb {R}^q\) with unknown \(\mathbf {A}_e\). The best estimate is given by \((Y_u^+ Y_v)^{\mathsf {T}}\), as described in Example 8.1. Thus, a parameterisation of the space of sections \(\Gamma (Q;\mathbf {A}_\bullet )\) is given by

$$\begin{aligned} F = \begin{bmatrix} I \\ (Y_u^+ Y_v)^{\mathsf {T}}\end{bmatrix}. \end{aligned}$$

The top principal component along the quiver representation is the direction in the image of F along which there is maximum variance in the data. This can be computed using Theorem 7.3 via the matrix pencil from Example 7.5, provided that we set \(J = (Y_u^+Y_v)^{\mathsf {T}}\).
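
Putting the pieces together, a hedged end-to-end sketch of this example (reusing the conventions of the earlier snippets, with synthetic data):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
p, q, m = 4, 3, 100
Y = rng.standard_normal((m, p + q))       # synthetic samples in Tot(A) = R^{p+q}
Y -= Y.mean(axis=0)                       # mean-centre
Y_u, Y_v = Y[:, :p], Y[:, p:]             # per-vertex sample matrices

J = (np.linalg.pinv(Y_u) @ Y_v).T         # learned edge map, as in Example 8.1
F = np.vstack([np.eye(p), J])             # parameterisation of the section space

S = Y.T @ Y / m                           # sample covariance
_, U = eigh(F.T @ S @ F, F.T @ F)         # the matrix pencil from Example 7.5
top_pc = F @ U[:, -1]                     # top principal component along A
```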