1 Introduction

Given a pair of matrices \(E, F\in \mathbb {C}^{d\times d}\), the associated matrix pencil is defined by

$$\begin{aligned} P(s):= sE-F. \end{aligned}$$
(1.1)

The theory of matrix pencils occupies an increasingly important place in linear algebra, due to its numerous applications. For instance, matrix pencils appear in a natural way in the study of differential-algebraic equations of the form:

$$\begin{aligned} E{\dot{x}}=Fx, \ \ \ x(0)=x_0, \end{aligned}$$
(1.2)

which are a generalization of the abstract Cauchy problem, see e.g. [20, Chapter 12, §7]. Substituting \(x(t)=x_0e^{st}\) into (1.2) leads to

$$\begin{aligned} (sE - F)x_0 = 0. \end{aligned}$$

Hence, solutions of the above eigenvalue equation for the matrix pencil (1.1) correspond to solutions of the Cauchy problem (1.2).
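This correspondence is easy to verify symbolically. The following sketch is our own illustration (not part of the paper), with a hypothetical \(2\times 2\) pencil: regularity is checked via \(\det(sE-F)\), and a null vector of \(P(s)\) at an eigenvalue yields a solution \(x(t)=x_0e^{st}\) of (1.2).

```python
import sympy as sp

s = sp.symbols('s')
# Hypothetical 2x2 example: E = I, F the flip matrix.
E = sp.eye(2)
F = sp.Matrix([[0, 1], [1, 0]])
P = s * E - F                 # the pencil P(s) = sE - F as in (1.1)

charpoly = sp.det(P)          # det(sE - F) = s**2 - 1, not identically zero,
                              # so this pencil is regular
eigs = sp.solve(charpoly, s)  # eigenvalues of the pencil: -1 and 1

# At the eigenvalue s = 1, a vector x0 with (sE - F) x0 = 0 gives the
# solution x(t) = x0 * exp(t) of the Cauchy problem (1.2).
x0 = P.subs(s, 1).nullspace()[0]
print(charpoly, eigs, list(x0))
```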

The matrix pencil P is called regular if \(\det (sE - F)\) is not identically zero, and it is called singular otherwise. Perturbation theory for regular matrix pencils \(P(s):= sE-F\) is a well-developed field; we mention here only [14, 21, 36, 45] as a short list of the papers devoted to this subject. As an example, we describe a well-known result. Recall that for a matrix pencil P as in (1.1), an ordered family of vectors \((x_n, \ldots ,x_{0})\) is a Jordan chain of length \(n+1\) at \(\lambda \in {\mathbb {C}}\) if \(x_{0}\ne 0\) and

$$\begin{aligned} (F -\lambda E)x_0=0,\quad (F-\lambda E)x_1 = E x_0,\quad \ldots , \quad (F-\lambda E)x_n = E x_{n-1}. \end{aligned}$$

Denote by \({\mathcal {L}}_{\lambda }^l(P)\) the subspace spanned by the elements of all Jordan chains up to length l at the eigenvalue \(\lambda \in {\mathbb {C}}\). If \(l=0\) or if \(\lambda \) is not an eigenvalue of P we define \({\mathcal {L}}_{\lambda }^l(P)=\{0\}\). If P(s) is regular and if Q(s) is a rank-one pencil such that \((P+Q)(s)\) is also regular then for \(n\in {\mathbb {N}}\cup \{0\}\) the following inequality holds:

$$\begin{aligned} \left| \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P+Q)}{\mathcal {L}_{\lambda }^n(P+Q)} -\dim \frac{\mathcal {L}_{\lambda }^{n+1}(P)}{\mathcal {L}_{\lambda }^{n}(P)}\right| \le 1. \end{aligned}$$

In this form it can be found in [21], but it is mainly due to [14, 45]. The proof of this inequality, like many other results concerning perturbation theory for regular matrix pencils, is based on a detailed analysis of the determinant.

Perturbation theory for singular matrix pencils has so far been studied in only a few papers. Roughly speaking, it started with the investigation of the Kronecker canonical form of a fixed singular matrix pencil P under low-rank perturbations in [13]. There, the generic change in the Kronecker canonical form of a singular pencil under low-rank perturbations resulting again in a singular pencil is considered. In this case the term generic refers to the fact that the perturbations are from an open dense subset of the set of pencils with fixed sizes and rank, cf. [13, Theorem 3.1]. In [29, 38] the effect of generic regularizing perturbations was considered, i.e. perturbations whose rank equals the difference between full rank and the rank of the singular pencil. While the focus in [38] is on symmetric rank-one perturbations, [29] contains the general low-rank case. In [37] the rank-one distance to singularity, i.e. the smallest norm of a rank-one perturbation that makes a given pencil singular, is expressed as a constrained quadratic optimization problem.

Finally, we would like to mention that in a recent manuscript [4] the authors characterize the Kronecker structure of a matrix pencil obtained by a rank-one perturbation of another matrix pencil in terms of the homogeneous invariant factors and the row and column minimal indices of the original and the perturbed pencil, by transforming the question into a matrix pencil completion problem.

Here we develop a different approach to treat finite-rank perturbations of singular matrix pencils. This is done by representing matrix pencils via linear relations, see also [6, 7, 11]. The classical philosophy for treating linear multi-valued mappings, or relations, was to concentrate on the operator part and to get rid of the multi-valued part by projection. At this point one has to mention the particular contributions of Henk de Snoo to linear relations, who, together with many coauthors, started seminal work on this subject. The publications [17,18,19] are among the first where the authors treated linear relations as subspaces in product spaces. Later on, Henk de Snoo was involved in investigations where linear relations arise in a natural way in extension and perturbation theory [16, 26,27,28] for many kinds of linear operators or relations, see also [33, 34]. Concerning his contributions to the structure of linear relations, see [40,41,42]. Of course, this is a non-exhaustive list of Henk de Snoo’s publications on this topic.

Each matrix \(E\in {\mathbb {C}}^{d\times d}\) is considered as a linear relation via its graph, i.e. the subspace of \({\mathbb {C}}^d\times {\mathbb {C}}^d\) consisting of pairs of the form \(\{x, Ex\}\), \(x\in {\mathbb {C}}^d\). Also, the inverse \(E^{-1}\) (in the sense of linear relations) of a not necessarily invertible matrix E is the subspace of \({\mathbb {C}}^d\times {\mathbb {C}}^d\) consisting of pairs of the form \(\{Ex, x\}\), \(x\in {\mathbb {C}}^d\). Multiplication of linear relations is defined in analogy to multiplication of matrices, see Sect. 2 for the details. Then, to a matrix pencil \(P(s)=sE-F\) we associate the linear relation \(E^{-1}F\).
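For singular E the relation \(E^{-1}F\) is genuinely multi-valued. As a minimal numerical sketch (our own illustration, with a hypothetical \(2\times 2\) example): unwinding the product definition gives \(E^{-1}F=\{\{x,z\} : Fx=Ez\}\), so a spanning set of the relation is the null space of the block matrix \([F\mid -E]\).

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical example with singular E, so E^{-1} exists only as a relation.
E = np.array([[1., 0.],
              [0., 0.]])
F = np.array([[0., 1.],
              [1., 0.]])

# {x, z} lies in E^{-1}F iff F x = E z, i.e. the stacked vector (x; z)
# lies in the null space of the block matrix [F | -E].
pairs = null_space(np.hstack([F, -E]))  # columns span E^{-1}F in C^2 x C^2

x, z = pairs[:2], pairs[2:]
print(pairs.shape[1])  # dimension of the relation E^{-1}F
```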

There exists a well developed spectral theory for linear relations, see e.g. [1, 12, 41]. An eigenvector at \(\lambda \in {\mathbb {C}}\) of \(E^{-1}F\) is a tuple of the form \(\{x,\lambda x\} \in E^{-1}F\), \(x\ne 0\). Jordan chains are defined in a similar way, see Sect. 3 below.

In Sect. 7 we show that (point) spectrum and Jordan chains of \(E^{-1}F\) coincide with (point) spectrum and Jordan chains of the matrix pencil P in (1.1), respectively. This is the key to translate spectral properties of a matrix pencil to its associated linear relation and vice versa. The advantage of this approach is that it is applicable not only to regular matrix pencils, but also to singular matrix pencils.

Given a matrix pencil P as in (1.1), we consider one-dimensional perturbations of the form

$$\begin{aligned} Q(s)=w(su^*-v^*), \end{aligned}$$

where \(u,v,w\in {\mathbb {C}}^d\), \((u,v) \ne (0,0)\) and \(w\ne 0\). Then P and \(P+Q\) are rank-one perturbations of each other, which means that they differ by a rank-one matrix polynomial. Recall that the rank of a matrix pencil P is the largest \(r\in \mathbb {N}\) such that P, viewed as a matrix with polynomial entries, has minors of size r that are not identically zero [14, 20]. As described above, to the matrix pencils P and \(P+Q\) there correspond the linear relations \(E^{-1}F\) and \(\left( E+wu^*\right) ^{-1}(F+wv^*)\), respectively, which turn out to be one-dimensional perturbations of each other, see Sect. 4. Then, the main result of this paper is Theorem 7.8 below. It consists of the following perturbation estimates for singular (and regular) matrix pencils:

  1. (i)

    If P is regular but \(P+Q\) is singular, then

    $$\begin{aligned} -1-n \le \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P+Q)}{\mathcal {L}_{\lambda }^n(P+Q)}- \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P)}{\mathcal {L}_{\lambda }^{n}(P)} \le 1. \end{aligned}$$
  2. (ii)

    If P is singular and \(P+Q\) is regular, then

    $$\begin{aligned} -1 \le \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P+Q)}{\mathcal {L}_{\lambda }^n(P+Q)}- \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P)}{\mathcal {L}_{\lambda }^{n}(P)} \le n+1. \end{aligned}$$
  3. (iii)

    If both P and \(P+Q\) are singular, then

    $$\begin{aligned} \left| \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P+Q)}{\mathcal {L}_{\lambda }^n(P+Q)}- \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P)}{\mathcal {L}_{\lambda }^{n}(P)}\right| \le n+1. \end{aligned}$$

Later, in Sect. 8, we explain how to interpret this result in terms of the Kronecker invariants associated to the Kronecker canonical forms of the matrix pencils P and \(P+Q\).

Theorem 7.8 follows from the corresponding result for one-dimensional perturbations of linear relations, which is the second main result of this paper. It is the content of Sects. 3 and 4 and is of independent interest. More precisely, given linear relations A and B in a linear space X which are one-dimensional perturbations of each other, we show that \(N(A^{n+1})/N(A^n)\) is finite-dimensional if and only if \(N(B^{n+1})/N(B^n)\) is finite-dimensional and, in this case,

$$\begin{aligned} \left| \dim \frac{N(B^{n+1})}{N(B^n)} - \dim \frac{N(A^{n+1})}{N(A^n)}\right| \,\le \,n+1. \end{aligned}$$
(1.3)

Here N(A) denotes the kernel of the linear relation A, that is, the set of all \(x\in X\) such that \(\{x,0\}\in A\). If, in addition, \(A\subset B\) or \(B\subset A\), we show that the left-hand side in (1.3) is bounded by n. However, in Sect. 5 we show that the bound in (1.3) is sharp. It is worth mentioning that if A and B are linear operators in X the left-hand side in (1.3) is bounded by 1, see [5].

In Sect. 6 we extend the above result to p-dimensional perturbations. In this case, we show that the left-hand side in (1.3) is bounded by \((n+1)p\). Again, this estimate improves to np if \(A\subset B\) or \(B\subset A\), and to p if A and B are operators, cf. [5].

2 Preliminaries

Throughout this paper X denotes a vector space over \({\mathbb {K}}\), where \({\mathbb {K}}\) stands for the real field \({\mathbb {R}}\) or the complex field \({\mathbb {C}}\). Sometimes \(\overline{\mathbb {C}}\) is used as a short form of \(\mathbb C \cup \{\infty \}\). Each subspace W of X determines an equivalence relation in X: we say that \(x\in X\) is congruent to \(y\in X\) if \(x-y\in W\). Then, we denote by X/W or \(\tfrac{X}{W}\) the set of all equivalence classes of X with respect to this equivalence relation. X/W is also a vector space over \({\mathbb {K}}\), called the quotient space of X over W, see e.g. [39].

Elements (pairs) from \(X\times X\) will be denoted by \(\{x,y\}\), where \(x,y\in X\). A linear relation in X is a linear subspace of \(X\times X\). Linear operators can be treated as linear relations via their graphs: each linear operator \(T:D(T)\rightarrow X\) in X, where \(D(T)\) stands for the domain of T, is identified with its graph

$$\begin{aligned} \Gamma (T):=\left\{ \{x,Tx\}:\ x\in D(T)\right\} . \end{aligned}$$

For the basic notions and properties of linear relations we refer to [1, 12, 25]. However, we follow here the above mentioned approach proposed in [17,18,19].

We denote the domain and the range of a linear relation A in X by \(D(A)\) and \(R(A)\), respectively,

$$\begin{aligned} D(A) = \left\{ x\in X \;:\; \exists \, y:\ \{x,y\} \in A \right\} \quad \text{ and } \quad R(A)=\left\{ y\in X \;:\; \exists \, x:\ \{x,y\} \in A \right\} . \end{aligned}$$

Furthermore, N(A) and \(M(A)\) denote the kernel and the multivalued part of A,

$$\begin{aligned} N(A)=\left\{ x\in X \;:\; \{x,0\} \in A \right\} \quad \text{ and } \quad M(A)=\left\{ y\in X \;:\; \{0,y\}\in A \right\} . \end{aligned}$$

Obviously, a linear relation A is the graph of an operator if and only if \(M(A)=\{0\}\). The inverse \(A^{-1}\) of a linear relation A always exists and is given by

$$\begin{aligned} A^{-1} =\left\{ \{y,x\}\in X \times X \;:\; \{x,y\} \in A \right\} . \end{aligned}$$
(2.1)
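When a relation in \({\mathbb {K}}^d\) is given by a spanning matrix of pairs, these objects are straightforward to compute. The following sketch is our own illustration (the helper name is ours): if A is the column span of the pairs \(\{Xc, Yc\}\), then \(N(A)=\{Xc : Yc=0\}\), \(M(A)=\{Yc : Xc=0\}\), and \(A^{-1}\) is obtained by swapping X and Y.

```python
import numpy as np
from scipy.linalg import null_space

def kernel_and_mult(X, Y):
    """For the relation A spanned by the pairs {X c, Y c}, return spanning
    matrices for the kernel N(A) = {Xc : Yc = 0} and the multivalued part
    M(A) = {Yc : Xc = 0}."""
    return X @ null_space(Y), Y @ null_space(X)

# The graph of the nilpotent operator T below is a relation with trivial
# multivalued part, so kernel_and_mult recovers N(T) and M(A) = {0}.
T = np.array([[0., 1.],
              [0., 0.]])
N, M = kernel_and_mult(np.eye(2), T)   # A = graph of T: pairs {c, Tc}
print(N.shape[1], M.shape[1])          # dim N(A) = 1, dim M(A) = 0
```

Note that, consistent with (2.1), applying the helper with X and Y swapped yields \(N(A^{-1})=M(A)\) and \(M(A^{-1})=N(A)\).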

We recall that the product of two linear relations A and B in X is defined as

$$\begin{aligned} AB = \left\{ \{x, z\}\;:\; \{ y, z\} \in A \text{ and } \{ x,y\} \in B \; \text{ for } \text{ some } y \in X\right\} . \end{aligned}$$

As for operators, the product of linear relations is an associative operation. We set \(A^0 := I\), where I is the identity operator in X, and for \(n=1,2,\ldots \) the n-th power of A is defined recursively by

$$\begin{aligned} A^n := AA^{n-1}. \end{aligned}$$

Thus, we have \(\{x_n, x_0\}\in A^n\) if and only if there exist \(x_1,\ldots ,x_{n-1}\in X\) such that

$$\begin{aligned} \{x_n, x_{n-1}\}, \{x_{n-1}, x_{n-2}\}, \ldots , \{x_1, x_0\}\in A. \end{aligned}$$
(2.2)

In this case, (2.2) is called a chain of A. We also use the shorter notation \((x_n,\ldots ,x_0)\).
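The product rule above can be sketched numerically (our own illustration, hypothetical helper names): with relations given as column spans of pairs \(\{X_Bc, Y_Bc\}\) and \(\{X_Ac', Y_Ac'\}\), a pair \(\{x,z\}\) lies in AB exactly when \(x=X_Bc\), \(z=Y_Ac'\) and \(Y_Bc=X_Ac'\) for some c, c'.

```python
import numpy as np
from scipy.linalg import null_space

def compose(XA, YA, XB, YB):
    """Spanning pairs for the product AB of two relations given as column
    spans of pairs: solve Y_B c = X_A c' and return (X_B c, Y_A c')."""
    k = XB.shape[1]
    cc = null_space(np.hstack([YB, -XA]))  # columns are stacked (c; c')
    return XB @ cc[:k, :], YA @ cc[k:, :]

# For graphs of operators the product reduces to composition: with the
# nilpotent T below, A^2 = AA is the graph of T^2 = 0.
T = np.array([[0., 1.],
              [0., 0.]])
X2, Y2 = compose(np.eye(2), T, np.eye(2), T)   # A = graph of T
print(np.allclose(Y2, 0))                      # every pair of A^2 is {x, 0}
```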

For a linear relation T in X and \(m\in {\mathbb {N}}\), consider the vector space of m-tuples of elements in T:

$$\begin{aligned} T^{(m)} := \underbrace{T\times T\times \dots \times T}_{m\,\mathrm{times}}\ , \end{aligned}$$

and also the space of m-tuples of elements in T which are chains of T:

$$\begin{aligned} {\mathcal {S}}_m^T := \big \{\left( \{x_{m},x_{m-1}\},\ldots ,\{x_1,x_0\}\right) : (x_{m},x_{m-1},\ldots ,x_0)\hbox { is a chain of}\ T\big \}. \end{aligned}$$
(2.3)

Clearly, \({\mathcal {S}}_m^T\) is a subspace of \(T^{(m)}\).

Lemma 2.1

Let A and C be linear relations in X such that \(C\subset A\) and \(\dim (A/C) = 1\). Then for each \(m\in {\mathbb {N}}\) the following inequality holds:

$$\begin{aligned} \dim ({\mathcal {S}}_m^A/{\mathcal {S}}_m^C)\,\le \,m. \end{aligned}$$
(2.4)

Proof

We make use of Lemma 2.2 in [3] which states that whenever \(M_0,N_0,M_1,N_1\) are subspaces of a linear space \({\mathcal {X}}\) such that \(M_0\subset M_1\) and \(N_0\subset N_1\), then

$$\begin{aligned} \dim \frac{M_1\cap N_1}{M_0\cap N_0}\,\le \,\dim \frac{M_1}{M_0} + \dim \frac{N_1}{N_0}. \end{aligned}$$

With this lemma the proof of (2.4) is straightforward. Indeed, since \({\mathcal {S}}_m^C = {\mathcal {S}}_m^A\cap C^{(m)}\), we obtain from the lemma and from \(\dim (A/C) = 1\) that

$$\begin{aligned} \dim ({\mathcal {S}}_m^A/{\mathcal {S}}_m^C) = \dim \,\frac{{\mathcal {S}}_m^A\cap A^{(m)}}{{\mathcal {S}}_m^A\cap C^{(m)}}\,\le \,\dim (A^{(m)}/C^{(m)}) = m, \end{aligned}$$

which is (2.4). \(\square \)

For relations A and B in X the operator-like sum \(A+B\) is the relation defined by

$$\begin{aligned} A+B = \left\{ \{ x,y+z\} \;:\; \{x,y\} \in A, \{x,z\} \in B \right\} . \end{aligned}$$

The notions of eigenvalue, root manifolds and point spectrum also apply to linear relations. Given \(\lambda \in {\mathbb {C}}\), \(A-\lambda \) stands for the linear relation \(A-\lambda I\):

$$\begin{aligned} A-\lambda = \left\{ \{x,y-\lambda x\} \ :\ \{x,y\}\in A \right\} . \end{aligned}$$

Then, \(\lambda \in {\mathbb {C}}\) is an eigenvalue of A if \(N(A-\lambda )\ne \{0\}\). On the other hand, we say that A has an eigenvalue at \(\infty \) if \(M(A)\ne \{0\}\). The point spectrum of A is the set \(\sigma _p(A)\) consisting of the eigenvalues \(\lambda \in {\mathbb {C}}\cup \{\infty \}\) of A.

A chain \((x_n,\ldots ,x_0)\) of A is called a quasi-Jordan chain of A at zero (or simply a quasi-Jordan chain of A) if \(x_0\in N(A)\). If \((x_n,\ldots ,x_0)\) is a quasi-Jordan chain of A, then \(x_j \in N(A^{j+1})\) for \(j=0, \ldots , n\). If, in addition, \(x_n\in M(A)\) and \((x_n,\ldots ,x_0)\ne (0,\ldots ,0)\), then the chain is called a singular chain of A. The tuple \((x_n,\ldots ,x_0)\) is called a quasi-Jordan chain of A at \(\lambda \in {\mathbb {C}}\) if it is a quasi-Jordan chain of the linear relation \(A-\lambda \), and a quasi-Jordan chain of A at \(\infty \) if it is a quasi-Jordan chain at zero of \(A^{-1}\). Note that we admit linear dependence (and even zeros) among the elements of a quasi-Jordan chain.

We reserve the notion of a Jordan chain of a linear relation for a particular situation which is discussed in the next section.

3 Linear Independence of Quasi-Jordan Chains

In what follows only quasi-Jordan chains at zero are considered, so we call them simply quasi-Jordan chains. Assume that T is a linear operator in X and consider \(x_0,\ldots ,x_n\in D(T)\) such that

$$\begin{aligned} Tx_0 =0 \quad \text{ and } \quad Tx_j=x_{j-1}, \text{ for } \text{ all } 1 \le j\le n. \end{aligned}$$

Then \(\{x_n, x_{n-1}\}, \{x_{n-1}, x_{n-2}\}, \ldots , \{x_0, 0\}\in \Gamma (T)\). So, if we consider T also as a linear relation via its graph, \((x_n,\ldots ,x_0)\) is a quasi-Jordan chain of T.
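As a small numerical sketch of this operator situation (our own illustration, with a hypothetical nilpotent shift), the standard basis vectors of \({\mathbb {K}}^{n+1}\) form such a chain:

```python
import numpy as np

n = 3
# Nilpotent shift: T e_0 = 0 and T e_j = e_{j-1}, so (e_n, ..., e_0) is a
# Jordan chain of length n+1 of T at the eigenvalue 0.
T = np.diag(np.ones(n), k=1)   # ones on the superdiagonal, size (n+1)x(n+1)
e = np.eye(n + 1)

assert np.allclose(T @ e[:, 0], 0)            # T x_0 = 0
for j in range(1, n + 1):
    assert np.allclose(T @ e[:, j], e[:, j - 1])  # T x_j = x_{j-1}
print("chain of length", n + 1, "verified")
```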

As T is a linear operator, it is well-known that the following facts are equivalent:

  1. (i)

    \(x_0 \ne 0\).

  2. (ii)

    The set of vectors \(\{x_n,\ldots ,x_0\}\) is linearly independent in X.

  3. (iii)

    \([x_n]\ne 0\), where \([x_n]\) is the equivalence class in \(N(T^{n+1})/N(T^n)\).

  4. (iv)

    \([x_j]\ne 0\) for all \(1\le j \le n\), where \([x_j]\) is the equivalence class in \(N(T^{j+1})/N(T^j)\).

Therefore, if T is a linear operator and \(x_0\ne 0\), \((x_n,\ldots ,x_0)\) is a quasi-Jordan chain of the linear relation \(\Gamma (T)\) if and only if it is a Jordan chain at zero of the linear operator T in the usual sense.

However, the four statements above are no longer equivalent for linear relations which contain singular chains; see the following example.

Example 3.1

Let \(x_0\) and \(x_1\) be two linearly independent elements of X and let

$$\begin{aligned} A:= \text{ span }\,\left\{ \{0,x_0\}, \{x_0,0\}, \{x_1,x_0\}\right\} . \end{aligned}$$

Then \(x_0\ne 0\) but \((0,x_0)\) is a quasi-Jordan chain with linearly dependent entries, hence the equivalence of (i) and (ii) from above does not hold.

Moreover, \((x_1,x_0)\) is a quasi-Jordan chain with linearly independent entries. But, as \(\{x_1,x_0\}\) and \(\{0,x_0\}\) are both elements of A, by linearity also \(\{x_1,0\}\) is an element of A and, hence, \([x_1]=0\) in \(N(A^2)/N(A)\), i.e. (iii) is not satisfied. Therefore, conditions (ii) and (iii) are not equivalent for linear relations either.
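A quick numerical check of this example (our own sketch, taking \(x_0=e_1\), \(x_1=e_2\) in \({\mathbb {K}}^2\)) confirms that the pair \(\{x_1,0\}\) indeed lies in A:

```python
import numpy as np

x0 = np.array([1., 0.])
x1 = np.array([0., 1.])

# A = span{ {0,x0}, {x0,0}, {x1,x0} }, pairs stacked as vectors in K^2 x K^2.
A = np.column_stack([np.concatenate([0 * x0, x0]),
                     np.concatenate([x0, 0 * x0]),
                     np.concatenate([x1, x0])])
pair = np.concatenate([x1, np.zeros(2)])       # the pair {x1, 0}

# Membership test: {x1,0} is in A iff appending it does not raise the rank.
in_A = np.linalg.matrix_rank(np.column_stack([A, pair])) == np.linalg.matrix_rank(A)
print(in_A)  # True, since {x1,0} = {x1,x0} - {0,x0}
```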

As mentioned before, the situation shown in the example is a consequence of the existence of singular chains in the relation A, or equivalently, of the presence of vectors in the intersection of the kernel of A and the multivalued part of \(A^n\) for some \(n\in {\mathbb {N}}\). For arbitrary linear relations we have the following equivalence.

Proposition 3.2

Let A be a linear relation in X and \((x_n,\ldots ,x_0)\) be a quasi-Jordan chain of A. Then the following statements are equivalent:

  1. (i)

    \(x_0 \notin M(A^n)\).

  2. (ii)

    \([x_n]\ne 0\), where \([x_n]\) is the equivalence class in \(N(A^{n+1})/N(A^n)\).

  3. (iii)

    \([x_j]\ne 0\) for all \(1\le j \le n\), where \([x_j]\) is the equivalence class in \(N(A^{j+1})/N(A^j)\).

In particular, if any of the three equivalent statements holds, then the vectors \(x_0, \ldots , x_n\) are linearly independent in X.

Proof

Since \((x_n,\ldots ,x_0)\) is a quasi-Jordan chain of A, we have that

$$\begin{aligned} \{x_n, x_{n-1}\},\ldots ,\{x_1,x_0\},\{x_0,0\}\in A. \end{aligned}$$
(3.1)

We show that (i) and (ii) are equivalent. If \(x_0\in M(A^n)\), then there exist \(y_1,\ldots , y_{n-1}\in X\) such that

$$\begin{aligned} \{0, y_{n-1}\}, \ldots , \{y_2,y_1\}, \{y_1, x_0\}\in A. \end{aligned}$$

Subtracting this chain from the one in (3.1) we end with

$$\begin{aligned} \{x_n, x_{n-1}-y_{n-1}\}, \ldots , \{x_2-y_2, x_1-y_1\}, \{x_1-y_1, 0\} \in A. \end{aligned}$$

Thus, \(x_n\in N(A^n)\), or equivalently, \([x_n]=0\). Conversely, if \([x_n]=0\) then \(x_n\in N(A^n)\). Hence, there exist \(u_1,\ldots , u_{n-1}\in X\) such that

$$\begin{aligned} \{x_n, u_{n-1}\}, \ldots , \{u_2,u_1\}, \{u_1,0\}\in A. \end{aligned}$$

Taking the difference of (3.1) and the chain above we obtain

$$\begin{aligned} \{0, x_{n-1}-u_{n-1}\}, \ldots , \{x_2-u_2,x_1-u_1\}, \{x_1-u_1, x_0\}\in A, \end{aligned}$$

i.e. \(x_0\in M(A^n)\).

Now we show that (ii) and (iii) are equivalent. Obviously (iii) implies (ii). Hence, assume \([x_n]\ne 0\). Then, by the equivalence of (i) and (ii), \(x_0 \notin M(A^n)\). But as \(M(A^j) \subset M(A^n)\) for all \(1\le j \le n\), we have \(x_0\notin M(A^j)\) for all \(1\le j \le n\). Applying the equivalence of (i) and (ii) to each truncated chain \((x_j,\ldots ,x_0)\) we obtain (iii).

It remains to show the additional statement concerning the linear independence of the vectors \(x_0, \ldots , x_n\). This is the case if the equation \(\sum _{j=0}^n\alpha _jx_j=0\) implies that all \(\alpha _j\), \(j=0, \ldots , n\), are equal to 0. By (iii) we see that all \(x_j\) are non-zero. If not all \(\alpha _j\) are equal to 0, let \(n_0\) be the largest index j with \(\alpha _{j} \ne 0\). It follows that

$$\begin{aligned} x_{n_0} = - \alpha _{n_0}^{-1} \sum _{j=0}^{n_0 -1}\alpha _jx_j \in N(A^{n_0}), \end{aligned}$$

hence \([x_{n_0}]=0\), in contradiction to (iii). \(\square \)

The above considerations lead to the following definition of a Jordan chain for a linear relation.

Definition 3.3

Let \((x_{n},\ldots ,x_{0})\) be a quasi-Jordan chain of a linear relation A in X. We call it a Jordan chain at zero of length \(n+1\) in A if

$$\begin{aligned} {}[x_n] \ne 0 \text{ in } N(A^{n+1})/N(A^n). \end{aligned}$$

Moreover, \((x_{n},\ldots ,x_{0})\) is called a Jordan chain at \(\lambda \in {\mathbb {C}}\) of length \(n+1\) in A if it is a Jordan chain at zero of \(A-\lambda \) and a Jordan chain at \(\infty \) of length \(n+1\) in A if it is a Jordan chain at zero of \(A^{-1}\).

We remark that our Definition 3.3 is equivalent to the definition formulated in [41] but different from the one used in [11], where the term Jordan chain was used for an object which is here called quasi-Jordan chain together with the assumption that all elements of the quasi-Jordan chain are linearly independent.

In the sequel we will make use of the following lemma.

Lemma 3.4

Let A be a linear relation in X and let \((x_{k,n},\ldots ,x_{k,0})\), \(k=1,\ldots ,m\), be m quasi-Jordan chains of A. Then

$$\begin{aligned} \dim {\text {span}}\{[x_{1,n}],\ldots ,[x_{m,n}]\} = \dim \frac{{\mathcal {L}}}{{\mathcal {L}}\cap M(A^n)}, \end{aligned}$$

where \({\mathcal {L}}:= {\text {span}}\{x_{1,0},\ldots ,x_{m,0}\}\).

Proof

Given m quasi-Jordan chains of A as in the statement, consider the following linear transformations

$$\begin{aligned} T : {\mathbb {K}}^m\rightarrow \frac{N(A^{n+1})}{N(A^n)},\qquad&Tu := \sum _{k=1}^mu_k[x_{k,n}],\quad u=(u_1,\ldots ,u_m)\in {\mathbb {K}}^m, \ \ \text {and}\\ S : {\mathbb {K}}^m\rightarrow N(A),\qquad&Su := \sum _{k=1}^mu_kx_{k,0},\quad u=(u_1,\ldots ,u_m)\in {\mathbb {K}}^m. \end{aligned}$$

On one hand, observe that \(R(T)={\text {span}}\{[x_{1,n}],\ldots ,[x_{m,n}]\}\) and \(R(S)={\mathcal {L}}\). On the other hand, we have that

$$\begin{aligned} N(T) = \left\{ u\in {\mathbb {K}}^m \;:\; Su \in M(A^n) \right\} \end{aligned}$$

Indeed, \(u=(u_1,\ldots ,u_m)\in N(T)\) if and only if \(\left[ \sum _{k=1}^m u_k x_{k,n}\right] = 0\) which, by Proposition 3.2, is equivalent to \(Su=\sum _{k=1}^mu_kx_{k,0}\in M(A^n)\).

In particular,

$$\begin{aligned} \dim N(T)&= \dim \left\{ u\in {\mathbb {K}}^m \;:\; Su \in M(A^n) \right\} = \dim N(S) + \dim {\mathcal {L}}\cap M(A^n), \end{aligned}$$

and the rank-nullity theorem yields

$$\begin{aligned} \dim {\text {span}}\{[x_{1,n}],\ldots ,[x_{m,n}]\}&= \dim R(T)= m- \dim N(T) \\&= m-(\dim N(S) + \dim {\mathcal {L}}\cap M(A^n)) \\&= \dim {\mathcal {L}}- \dim {\mathcal {L}}\cap M(A^n)= \dim \frac{{\mathcal {L}}}{{\mathcal {L}}\cap M(A^n)}, \end{aligned}$$

where we have used that \(R(S)={\mathcal {L}}\). \(\square \)

In the following we will study linear independence of quasi-Jordan chains.

Lemma 3.5

Let \((x_{k,n},\ldots ,x_{k,0})\), \(k=1,\ldots ,m\), be m quasi-Jordan chains of a linear relation A in X. Consider the following statements:

  1. (i)

    The set \(\{[x_{1,n}],\ldots ,[x_{m,n}]\}\) is linearly independent in \(N(A^{n+1})/N(A^n)\).

  2. (ii)

    The set \(\{x_{k,j} : k=1,\ldots ,m, j=0,\ldots ,n\}\) is linearly independent in X.

  3. (iii)

    The set of pairs

    $$\begin{aligned} \{\{x_{k,j},x_{k,j-1}\} : k=1,\ldots ,m, j=1,\ldots ,n\}\cup \{\{x_{k,0},0\} : k=1,\ldots ,m\} \end{aligned}$$

    is linearly independent in A.

Then the following implications hold: \(\mathrm{(i)}\;\Longrightarrow \;\mathrm{(ii)}\;\Longrightarrow \;\mathrm{(iii)}\). If, in addition,

$$\begin{aligned} {\text {span}}\{x_{1,0},\ldots , x_{m,0}\}\cap M(A^n)=\{0\}, \end{aligned}$$

holds, then the three conditions \(\mathrm{(i)}\), \(\mathrm{(ii)}\), and \(\mathrm{(iii)}\) are equivalent.

Proof

The implication (ii)\(\Rightarrow \)(iii) is straightforward by use of the linear independence of the first components of the pairs in (iii). Let us prove the implication (i)\(\Rightarrow \)(ii). Assume that \(\{[x_{1,n}],\ldots ,[x_{m,n}]\}\) is linearly independent. Let \(\alpha _{k,j}\in {\mathbb {K}}\), \(j=0,\ldots ,n, k=1,\ldots ,m\), such that

$$\begin{aligned} \sum \limits _{j=0}^n \sum \limits _{k=1}^m \alpha _{k,j} x_{k,j} = 0. \end{aligned}$$
(3.2)

It is easily seen that the following tuple is a quasi-Jordan chain of A:

$$\begin{aligned} \left( \sum \limits _{j=0}^n\sum \limits _{k=1}^m \alpha _{k,j} x_{k,j}, \sum \limits _{j=1}^n\sum \limits _{k=1}^m \alpha _{k,j} x_{k,j-1},\ldots , \sum \limits _{j=n-1}^n\sum \limits _{k=1}^m \alpha _{k,j} x_{k,j-n+1}, \sum \limits _{k=1}^m \alpha _{k,n} x_{k,0} \right) . \end{aligned}$$

From this and (3.2) it follows that \(\sum _{k=1}^m \alpha _{k,n} x_{k,0}\in M(A^n)\), which, by Proposition 3.2, implies for equivalence classes in \(N(A^{n+1})/N(A^n)\)

$$\begin{aligned} \left[ \sum \limits _{j=0}^n\sum \limits _{k=1}^m \alpha _{k,j} x_{k,j}\right] =\sum \limits _{k=1}^m \alpha _{k,n} [x_{k,n}]=0. \end{aligned}$$

Hence, \(\alpha _{k,n} = 0\) for \(k=1,\ldots ,m\) and (3.2) reads as

$$\begin{aligned} \sum \limits _{j=0}^{n-1} \sum \limits _{k=1}^m \alpha _{k,j} x_{k,j} = 0. \end{aligned}$$
(3.3)

Now one can construct a quasi-Jordan chain as above starting with the sum in (3.3). Repeating the above argument shows \(\alpha _{k,n-1}=0\) for \(k=1,\ldots ,m\). Proceeding further in this manner yields (ii), since all \(\alpha _{k,j}\) in (3.2) are equal to zero.

Now assume that \({\text {span}}\{x_{1,0},\ldots , x_{m,0}\}\cap M(A^n)=\{0\}\). By Lemma 3.4,

$$\begin{aligned} \dim {\text {span}}\{[x_{1,n}],\ldots ,[x_{m,n}]\}=\dim {\text {span}}\{x_{1,0},\ldots ,x_{m,0}\}. \end{aligned}$$

We have to show that in this case (iii) implies (i). But if we assume (iii), in particular we have that \(\{x_{1,0},\ldots ,x_{m,0}\}\) is linearly independent. Therefore, \(\{[x_{1,n}],\ldots ,[x_{m,n}]\}\) is also linearly independent, completing the proof. \(\square \)

4 One-Dimensional Perturbations

The following definition, taken from [2], specifies the idea of a one-dimensional perturbation for linear relations.

Definition 4.1

Let A and B be linear relations in X. Then B is called a one-dimensional perturbation of A (and vice versa) if

$$\begin{aligned} \max \left\{ \dim \frac{A}{A\cap B},\,\dim \frac{B}{A\cap B}\right\} = 1. \end{aligned}$$

In particular, A is called a one-dimensional extension of B if \(B\subset A\) and \(\dim (A/B) = 1\).

The next lemma describes in which way (quasi-)Jordan chains of a one-dimensional extension A of a linear relation C can be linearly combined to become (quasi-)Jordan chains of C. The proof is based on the following simple principle: If M is a subspace of N and \(\dim (N/M) = 1\), then whenever \(x,y\in N\), \(y\notin M\), there exists some \(\lambda \in {\mathbb {K}}\) such that \(x - \lambda y\in M\).

Lemma 4.2

Let A and C be linear relations in X such that \(C\subset A\) and \(\dim (A/C) = 1\). If \((x_{k,n},\ldots ,x_{k,0})\), \(k=1,\ldots ,m\), are m quasi-Jordan chains of A, then after a possible reordering, there exist \(m-1\) quasi-Jordan chains \((y_{k,n},\ldots ,y_{k,0})\), \(k=1,\ldots ,m-1\), of C such that

$$\begin{aligned} y_{k,j}\in x_{k,j} + {\text {span}}\{x_{m,\ell } : \ell =0,\ldots ,j\},\qquad k=1,\ldots ,m-1,\,j=0,\ldots ,n. \end{aligned}$$

Moreover, if \(\{[x_{1,n}],\ldots , [x_{m,n}]\}\) is linearly independent in \(N(A^{n+1})/N(A^n)\) then the set \(\{[y_{1,n}],\ldots , [y_{m-1,n}]\}\) is linearly independent in \(N(C^{n+1})/N(C^n)\).

On the other hand, if the set \(\{x_{k,j} : k=1,\ldots ,m, j=0,\ldots ,n\}\) is linearly independent in X then the set \(\{y_{k,j} : k=1,\ldots ,m-1, j=0,\ldots ,n\}\) is linearly independent in X.

Proof

For any quasi-Jordan chain \((z_n,z_{n-1},\ldots ,z_0)\) of A we agree to write \({{\hat{z}}}_j = \{z_j,z_{j-1}\}\) for \(j=1,\ldots ,n\) and \({{\hat{z}}}_0 = \{z_0,0\}\). Consider the set

$$\begin{aligned} J := \{(k,j)\in \{1,\ldots ,m\}\times \{0,\ldots ,n\} : {\hat{x}}_{k,j}\notin C\}. \end{aligned}$$

If \(J=\emptyset \) then all m quasi-Jordan chains are in C and the proof is complete. Therefore, assume \(J\ne \emptyset \). Set

$$\begin{aligned} h:= \min \bigl \{j\in \{0,\ldots ,n\} : (k,j)\in J\text { for some }k\in \{1,\ldots ,m\}\bigr \}. \end{aligned}$$

Choose some \(\kappa \in \{1,\ldots ,m\}\) such that \((\kappa ,h)\in J\). After a reordering of the indices we can assume that \(\kappa = m\).

Since \({\hat{x}}_{m,h}\notin C\), there exist \(\alpha _{k,h}\in {\mathbb {K}}\), \(k=1,\ldots ,m-1\), such that

$$\begin{aligned} {\hat{x}}_{k,h} - \alpha _{k,h}{\hat{x}}_{m,h}\in C \end{aligned}$$

for \(k=1,\ldots ,m-1\). If \(h=n\), we stop here. Otherwise, there exist \(\alpha _{k,h+1}\in {\mathbb {K}}\), \(k=1,\ldots ,m-1\), such that

$$\begin{aligned} {\hat{x}}_{k,h+1} - \alpha _{k,h}{\hat{x}}_{m,h+1} - \alpha _{k,h+1}{\hat{x}}_{m,h}\in C \end{aligned}$$

for \(k=1,\ldots ,m-1\). If \(h=n-1\), the process terminates. Otherwise, there exist \(\alpha _{k,h+2}\in {\mathbb {K}}\) such that

$$\begin{aligned} {\hat{x}}_{k,h+2} - \alpha _{k,h}{\hat{x}}_{m,h+2} - \alpha _{k,h+1}{\hat{x}}_{m,h+1} - \alpha _{k,h+2}{\hat{x}}_{m,h}\in C \end{aligned}$$

for \(k=1,\ldots ,m-1\). We continue with this procedure up to n, where in the last step we find \(\alpha _{k,n}\in {\mathbb {K}}\) such that

$$\begin{aligned} {\hat{x}}_{k,n} - \alpha _{k,h}{\hat{x}}_{m,n} - \alpha _{k,h+1}{\hat{x}}_{m,n-1} - \ldots - \alpha _{k,n-1}{\hat{x}}_{m,h+1} - \alpha _{k,n}{\hat{x}}_{m,h}\in C \end{aligned}$$

for \(k=1,\ldots ,m-1\). Summarizing, we obtain numbers \(\alpha _{k,j}\in {\mathbb {K}}\), \(k=1,\ldots ,m-1\), \(j=h,\ldots ,n\), such that

$$\begin{aligned} {\hat{u}}_{k,j} := {\hat{x}}_{k,j} - \sum _{i=h}^{j}\alpha _{k,i}\,{\hat{x}}_{m,j+h-i}\;\in \;C \end{aligned}$$

for all \(k=1,\ldots ,m-1\), \(j=h,\ldots ,n\). We now define

$$\begin{aligned} y_{k,j} := x_{k,j} - \sum _{i=h}^{\min \{j+h,n\}}\alpha _{k,i}\,x_{m,j+h-i}, \end{aligned}$$

for \(k=1,\ldots ,m-1\) and \(j=0,\ldots ,n\). For \(0\le j < h\) (which is possible only if \(h>0\)),

$$\begin{aligned} {\hat{y}}_{k,j} = {\hat{x}}_{k,j} - \sum _{i=h}^{\min \{j+h,n\}}\alpha _{k,i}\,{\hat{x}}_{m,j+h-i}\,\in \,C \end{aligned}$$

is a consequence of the definition of h, whereas for \(j\ge h\) we also have

$$\begin{aligned} {\hat{y}}_{k,j} = {\hat{u}}_{k,j} - \sum _{i=j+1}^{\min \{j+h,n\}}\alpha _{k,i}\,{\hat{x}}_{m,j+h-i}\,\in \,C. \end{aligned}$$

This shows that \((y_{k,n},\ldots ,y_{k,0})\) is a quasi-Jordan chain of C for each \(k=1,\ldots ,m-1\). From the definition of \(y_{k,j}\) we also see that \(y_{k,j}\in x_{k,j} + {\text {span}}\{x_{m,j},\ldots ,x_{m,0}\}\) for all \(j=0,\ldots , n\) and \(k=1,\ldots ,m-1\).

Now, assuming the linear independence of \(\{[x_{1,n}],\ldots , [x_{m,n}]\}\) in \(N(A^{n+1})/N(A^n)\), we prove the linear independence of \(\{[y_{1,n}],\ldots , [y_{m-1,n}]\}\) in \(N(C^{n+1})/N(C^n)\). Since \(y_{k,0} = x_{k,0} - \alpha _{k,h}x_{m,0}\) for \(k=1,\ldots ,m-1\), the linear independence of \(\{y_{1,0},\ldots ,y_{m-1,0}\}\) in X easily follows from that of \(\{x_{1,0},\ldots ,x_{m,0}\}\). Furthermore,

$$\begin{aligned} {\text {span}}\{y_{1,0},\ldots ,y_{m-1,0}\}\cap M(C^n)\,\subset \,{\text {span}}\{x_{1,0},\ldots ,x_{m,0}\}\cap M(A^n), \end{aligned}$$

and the claim follows from Lemma 3.4.

Finally, assume that the set \(\{x_{k,j} : k=1,\ldots ,m, j=0,\ldots ,n\}\) is linearly independent. Also, let \(\beta _{k,j}\in {\mathbb {K}}\), \(k=1,\ldots ,m-1\), \(j=0,\ldots ,n\), such that \(\sum _{k=1}^{m-1}\sum _{j=0}^n\beta _{k,j}y_{k,j} = 0\). Then

$$\begin{aligned} 0&= \sum _{k=1}^{m-1}\sum _{j=0}^n\beta _{k,j}\left( x_{k,j} - \sum _{i=h}^{\min \{j+h,n\}}\alpha _{k,i}\,x_{m,j+h-i}\right) \\&= \sum _{k=1}^{m-1}\sum _{j=0}^n\beta _{k,j}x_{k,j} - \sum _{j=0}^n\sum _{i=h}^{\min \{j+h,n\}}\left( \sum _{k=1}^{m-1} \beta _{k,j}\alpha _{k,i}\right) x_{m,j+h-i}. \end{aligned}$$

Since the full set \(\{x_{k,j} : k=1,\ldots ,m, j=0,\ldots ,n\}\) is linearly independent, comparing the coefficients of \(x_{k,j}\) for \(k=1,\ldots ,m-1\) shows that \(\beta _{k,j} = 0\) for \(k=1,\ldots ,m-1\) and \(j=0,\ldots ,n\). Therefore, the set \(\{y_{k,j} : k=1,\ldots ,m-1, j=0,\ldots ,n\}\) is linearly independent in X. \(\square \)

In the main result of this section, Theorem 4.5 below, we will compare the dimensions of \(N(A^{n+1})/N(A^n)\) and \(N(B^{n+1})/N(B^n)\) for two linear relations A and B that are one-dimensional perturbations of each other. To formulate it, we define the following value for two linear relations A and B in X and \(n\in {\mathbb {N}}\cup \{0\}\):

$$\begin{aligned} s_n(A,B) :=&\max \big \{\dim ({\mathcal {L}}\cap M(A^n)) : \;{\mathcal {L}}\text { is a subspace of } N(A\cap B)\cap R((A\cap B)^n),\nonumber \\&{\mathcal {L}}\cap M((A\cap B)^n) = \{0\}\big \}. \end{aligned}$$
(4.1)

The quantity \(s_n(A,B)\) can be interpreted as the number of (linearly independent) singular chains of A of length n which are not singular chains of \(A\cap B\). To justify this statement, assume that \(s_n(A,B)=r\). Then, denoting \(C=A\cap B\), there exists a subspace \({\mathcal {L}}\) of \(N(C)\cap R(C^n)\) such that \(\dim ({\mathcal {L}}\cap M(A^n))=r\) and \({\mathcal {L}}\cap M(C^n) = \{0\}\). On the one hand, if \(\{x_{1,0},\ldots , x_{r,0}\}\) is a basis of \({\mathcal {L}}\cap M(A^n)\), then each \(x_{k,0}\), \(k=1,\ldots , r\), determines a quasi-Jordan chain \((x_{k,n},\ldots , x_{k,1},x_{k,0})\) of C, because \({\mathcal {L}}\subseteq N(C)\cap R(C^n)\). Also, since \({\mathcal {L}}\cap M(C^n) = \{0\}\), Lemma 3.5 implies that \(\{[x_{1,n}],\ldots ,[x_{r,n}]\}\) is linearly independent in \(N(C^{n+1})/N(C^n)\). In particular, the quasi-Jordan chains \((x_{k,n},\ldots , x_{k,1},x_{k,0})\) are not singular chains of C. On the other hand, each \(x_{k,0}\), \(k=1,\ldots , r\), determines a singular chain of A of length n because \(x_{k,0}\in M(A^n)\cap N(A)\).

Note that we always have \(s_0(A,B) = s_0(B,A) = 0\). On the other hand, for \(n\in {\mathbb {N}}\) we usually have \(s_n(A,B)\ne s_n(B,A)\). For example, if \(B\subset A\), then \(s_n(B,A) = 0\), while \(s_n(A,B)\) might be positive. Therefore, we also introduce the number

$$\begin{aligned} s_n[A,B] := \max \{s_n(A,B),s_n(B,A)\}. \end{aligned}$$

The next proposition shows that this number is bounded by n.

Proposition 4.3

Let A and B be linear relations in X such that B is a one-dimensional perturbation of A. Then for \(n\in {\mathbb {N}}\cup \{0\}\) we have

$$\begin{aligned} s_n[A,B]\,\le \, n. \end{aligned}$$

Proof

The claim is clear for \(n=0\). Let \(n\ge 1\). It obviously suffices to prove that \(s_n(A,B)\le n\). If \(A\subset B\) then \(s_n(A,B) = 0\) and the desired inequality holds. Hence, let us assume that \(\dim (A/A\cap B) = 1\) and set \(C := A\cap B\).

Let \({\mathcal {L}}\) be a subspace of \(N(C)\cap R(C^n)\) such that \({\mathcal {L}}\cap M(C^n) = \{0\}\). Towards a contradiction, suppose that \(\dim ({\mathcal {L}}\cap M(A^n)) > n\). So, there exist linearly independent vectors \(x_{1,0},\ldots ,x_{n+1,0}\in {\mathcal {L}}\cap M(A^n)\). Then there exist \(n+1\) singular chains of A of the form

$$\begin{aligned} X_k=(0,x_{k,n-1},\ldots ,x_{k,0}), \quad k=1,\ldots ,n+1, \end{aligned}$$

and \(\{X_1,\ldots ,X_{n+1}\}\) is linearly independent in \({\mathcal {S}}_n^A\), cf. (2.3).

By Lemma 2.1, \(\dim ({\mathcal {S}}_n^A / {\mathcal {S}}_n^C)\le n\). Since the \(n+1\) chains \(X_1,\ldots ,X_{n+1}\) are linearly independent, some non-trivial linear combination of them must lie in \({\mathcal {S}}_n^C\), i.e. there exist \(\alpha _1,\ldots ,\alpha _{n+1}\in \mathbb {K}\), not all zero, such that \(Y:=\sum _{k=1}^{n+1} \alpha _k X_k\in {\mathcal {S}}_n^C\).

So, Y is a non-trivial singular chain of C of the form \(Y=(0,y_{n-1},\ldots ,y_0)\), where

$$\begin{aligned} y_j=\sum _{k=1}^{n+1} \alpha _k x_{k,j}, \quad j=0,1,\ldots ,n-1. \end{aligned}$$

In particular, \(y_0=\sum _{k=1}^{n+1} \alpha _k x_{k,0}\ne 0\) because \(\{x_{1,0},\ldots ,x_{n+1,0}\}\) is linearly independent and not all \(\alpha _k\) vanish. Now, since \(x_{1,0},\ldots ,x_{n+1,0}\in {\mathcal {L}}\), also \(y_0\in {\mathcal {L}}\). Moreover, as Y is a singular chain of C, we have \(y_0\in M(C^n)\). Hence \(y_0\) is a non-zero element of \({\mathcal {L}}\cap M(C^n)\), which is the desired contradiction. \(\square \)

We now present our first generalization of Theorem 2.2 in [5]. In this case we assume that one of the two relations is a one-dimensional extension of the other.

Theorem 4.4

Let A and B be linear relations in X such that \(A\subset B\) and \(\dim (B/A) = 1\) and let \(n\in {\mathbb {N}}\cup \{0\}\). Then the following holds:

  1. (i)

    \(N(A^{n+1})/N(A^n)\) is finite-dimensional if and only if \(N(B^{n+1})/N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} -s_{n}(B,A)\,\le \,\dim \frac{N(B^{n+1})}{N(B^n)} -\dim \frac{N(A^{n+1})}{N(A^n)}\,\le \,1. \end{aligned}$$

    In particular, for \(n\ge 1\) we have

    $$\begin{aligned} \left| \dim \frac{N(B^{n+1})}{N(B^n)} - \dim \frac{N(A^{n+1})}{N(A^n)}\right| \,\le \,\max \{1,s_n(B,A)\}\,\le \,n. \end{aligned}$$
    (4.2)
  2. (ii)

    \(N(A^n)\) is finite-dimensional if and only if \(N(B^n)\) is finite-dimensional. Moreover, for \(n\ge 1\),

    $$\begin{aligned} \left| \dim N(B^n) - \dim N(A^n)\right| \,\le \,\sum _{k=0}^{n-1} \max \left\{ 1, s_k(B,A)\right\} \,\le \,\frac{(n-1) n}{2} +1. \end{aligned}$$

Proof

To prove the lower bound in item (i), suppose, towards a contradiction, that there are

$$\begin{aligned} m: = \dim \frac{N(B^{n+1})}{N(B^n)} + s_n(B,A) + 1 \end{aligned}$$

linearly independent vectors \([x_{1,n}],\ldots ,[x_{m,n}]\) in \(N(A^{n+1})/N(A^n)\) and consider corresponding Jordan chains \((x_{k,n},\ldots ,x_{k,0})\) of length \(n+1\) of A, \(k=1,\ldots ,m\). By Lemma 3.4, the vectors \(x_{1,0},\ldots , x_{m,0}\) are linearly independent and, if \({\mathcal {L}}_0 := {\text {span}}\{x_{1,0},\ldots ,x_{m,0}\}\) then

$$\begin{aligned} {\mathcal {L}}_0\cap M(A^n) = \{0\}. \end{aligned}$$

Denote the cosets of the vectors \(x_{k,n}\) in \(N(B^{n+1})/N(B^n)\) by \([x_{k,n}]_B\), \(k=1,\ldots ,m\). Since

$$\begin{aligned} s_n(B,A) = \max \left\{ \dim ({\mathcal {L}}\cap M(B^n)) : {\mathcal {L}}\subset N(A)\cap R(A^n)\text { subspace},\,{\mathcal {L}}\cap M(A^n) = \{0\}\right\} , \end{aligned}$$

Lemma 3.4 implies that

$$\begin{aligned} \dim {\text {span}}\{[x_{1,n}]_B,\ldots ,[x_{m,n}]_B\}&= m - \dim ({\mathcal {L}}_0\cap M(B^n))\\&\ge m - s_n(B,A) = \dim \frac{N(B^{n+1})}{N(B^n)} + 1, \end{aligned}$$

which is a contradiction.

On the other hand, assume that there are

$$\begin{aligned} p:= \dim \frac{N(A^{n+1})}{N(A^n)} +2 \end{aligned}$$

linearly independent vectors \([y_{1,n}]_B,\ldots , [y_{p,n}]_B\) in \(\frac{N(B^{n+1})}{N(B^n)}\) and consider corresponding Jordan chains \((y_{k,n}, \ldots , y_{k,0})\) of length \(n+1\) of B, for \(k=1,\ldots ,p\). By Lemma 3.4, the vectors \(y_{1,0},\ldots , y_{p,0}\) are linearly independent and, if \({\mathcal {L}}_Y := {\text {span}}\{y_{1,0},\ldots ,y_{p,0}\}\), then

$$\begin{aligned} {\mathcal {L}}_Y\cap M(B^n) = \{0\}. \end{aligned}$$

Now, applying Lemma 4.2, we obtain \(p-1\) Jordan chains \((z_{k,n}, \ldots , z_{k,0})\) of length \(n+1\) of A, \(k=1,\ldots ,p-1\), such that (after a possible reordering)

$$\begin{aligned} z_{k,j}\in y_{k,j} + {\text {span}}\{y_{p,l}: l=0,\ldots ,j\} \quad \text {for} \ k=1,\ldots ,p-1,\ j=0,\ldots ,n. \end{aligned}$$

In particular, for each \(k=1,\ldots ,p-1\) there exists \(\alpha _k\in \mathbb {K}\) such that \(z_{k,0}=y_{k,0}+ \alpha _k y_{p,0}\).

Hence, if \({\mathcal {L}}_Z:={\text {span}}\{z_{1,0},\ldots , z_{p-1,0}\}\), then it is easy to see that

$$\begin{aligned} {\mathcal {L}}_Z\cap M(A^n)=\{0\}, \end{aligned}$$

because \({\mathcal {L}}_Z\subseteq {\mathcal {L}}_Y\), \(M(A^n)\subseteq M(B^n)\) and \({\mathcal {L}}_Y\cap M(B^n)=\{0\}\). Thus, by Lemma 3.4,

$$\begin{aligned} \dim {\text {span}}\{[z_{1,n}],\ldots ,[z_{p-1,n}]\}= \dim {\mathcal {L}}_Z=p-1=\dim \frac{N(A^{n+1})}{N(A^n)} + 1, \end{aligned}$$

which is a contradiction.

In order to prove item (ii), note that for a linear relation T we have

$$\begin{aligned} N(T^n)= N(T)\oplus W_1\oplus \dots \oplus W_{n-1}, \end{aligned}$$

where \(W_j\) is a subspace of \(N(T^n)\) isomorphic to \(\frac{N(T^{j+1})}{N(T^{j})}\) for \(j=1,\ldots , n-1\). This fact follows easily by induction on n. Hence, from item (i) we infer that \(\dim N(A^n) < \infty \) if and only if \(\dim N(B^n) < \infty \). Also, as a consequence of (4.2) and Proposition 4.3,

$$\begin{aligned} \left| \dim N(B^n) - \dim N(A^n)\right|&= \left| \sum _{k=0}^{n-1}\dim \frac{N(B^{k+1})}{N(B^k)} - \sum _{k=0}^{n-1}\dim \frac{N(A^{k+1})}{N(A^k)}\right| \\&\le \sum _{k=0}^{n-1}\left| \dim \frac{N(B^{k+1})}{N(B^k)} - \dim \frac{N(A^{k+1})}{N(A^k)}\right| \\&\le \sum _{k=0}^{n-1} \max \{1,s_k(B,A)\} \,\le \, 1 + \sum _{k=1}^{n-1} k \\&= \, 1+\frac{(n-1) n}{2}. \end{aligned}$$

This concludes the proof of the theorem. \(\square \)

The next theorem is the main result of this section. It states that the estimate obtained in [5, Theorem 2.2] for operators has to be adjusted when considering arbitrary linear relations. Note that \(s_n[A,B] = 0\) for operators A and B.

Theorem 4.5

Let A and B be linear relations in X such that B is a one-dimensional perturbation of A and \(n\in {\mathbb {N}}\cup \{0\}\). Then the following hold:

  1. (i)

    \(N(A^{n+1})/N(A^n)\) is finite-dimensional if and only if \(N(B^{n+1})/N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} -1-s_{n}(B,A)\,\le \,\dim \frac{N(B^{n+1})}{N(B^n)} - \dim \frac{N(A^{n+1})}{N(A^n)}\,\le \,1 + s_n(A,B). \end{aligned}$$

    In particular,

    $$\begin{aligned} \left| \dim \frac{N(B^{n+1})}{N(B^n)} - \dim \frac{N(A^{n+1})}{N(A^n)}\right| \,\le \,1 + s_n[A,B]\,\le \,n+1. \end{aligned}$$
    (4.3)
  2. (ii)

    \(N(A^n)\) is finite-dimensional if and only if \(N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim N(B^n) - \dim N(A^n)\right| \,\le \,n + \sum _{k=0}^{n-1}s_k[A,B]\,\le \,\frac{n(n+1)}{2}. \end{aligned}$$

Proof

Define \(C := A\cap B\). Then \(C\subset A\) and \(C\subset B\) as well as \(\dim (A/C)\le 1\) and \(\dim (B/C)\le 1\). Moreover, note that

$$\begin{aligned} s_n(A,B) = s_n(A,C) \quad \text {and}\quad s_n(B,A) = s_n(B,C). \end{aligned}$$

Therefore, using the notation \(D_n(T) = \dim \tfrac{N(T^{n+1})}{N(T^n)}\) for a relation T in X, from Theorem 4.4 we obtain

$$\begin{aligned} D_n(B) - D_n(A) = (D_n(B)-D_n(C)) - (D_n(A) - D_n(C))\,\le \,1 + s_n(A,B). \end{aligned}$$

Exchanging the roles of A and B leads to \(D_n(A)-D_n(B)\le 1+s_n(B,A)\). This proves (i).

The proof of statement (ii) is analogous to the proof of its counterpart in Theorem 4.4. In this case, as a consequence of (4.3),

$$\begin{aligned} \left| \dim N(B^n) - \dim N(A^n)\right| \le \sum _{k=0}^{n-1}\left| D_k(A) - D_k(B)\right| \,\le \, \sum _{k=0}^{n-1}(1+s_k[A,B])\,\le \,\frac{n(n+1)}{2}, \end{aligned}$$

and the theorem is proved. \(\square \)

In Sect. 5 below we prove that the bound \(n+1\) in (4.3) of Theorem 4.5 is in fact sharp, meaning that there are examples of linear relations A and B which are one-dimensional perturbations of each other and for which the quantity on the left-hand side of (4.3) equals \(n+1\).

The following corollary deals with linear relations without singular chains. If neither A nor B has singular chains then we recover the bounds from the operator case, see Theorem 2.2 in [5].

Corollary 4.6

Let A and B be linear relations in X without singular chains such that B is a one-dimensional perturbation of A. Then the following statements hold:

  1. (i)

    \(N(A^{n+1})/N(A^n)\) is finite-dimensional if and only if \(N(B^{n+1})/N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim \frac{N(A^{n+1})}{N(A^n)} - \dim \frac{N(B^{n+1})}{N(B^n)}\right| \,\le \,1. \end{aligned}$$
  2. (ii)

    \(N(A^n)\) is finite-dimensional if and only if \(N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim N (A^n) - \dim N(B^n)\right| \,\le \,n. \end{aligned}$$
  3. (iii)

    \(N(A)\cap R(A^n)\) is finite-dimensional if and only if \(N(B)\cap R(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim (N(A)\cap R(A^n)) - \dim (N(B)\cap R(B^n))\right| \,\le \,1. \end{aligned}$$

Proof

If A and B are linear relations in X without singular chains, then \(s_n[A,B]=0\) for each \(n\in {\mathbb {N}}\). Therefore, items (i) and (ii) follow directly from items (i) and (ii) in Theorem 4.5. Finally, recall that for a linear relation T in X without singular chains we have \(N(T^{n+1})/N(T^n)\cong N(T)\cap R(T^n)\), cf. [42, Lemma 4.4]. Hence, (iii) follows from (i). \(\square \)
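For matrices, i.e. graphs of operators, which never have singular chains, Corollary 4.6 can be illustrated numerically. Below is a small sketch (the concrete matrices are our own choice, not taken from the text): A is the nilpotent Jordan block \(J_4(0)\) and B deletes one superdiagonal entry of A, a rank-one perturbation that splits \(J_4(0)\) into \(J_2(0)\oplus J_2(0)\); the quotient dimensions \(\dim N(T^{n+1})/N(T^n)\) then differ by at most 1 for every n, as item (i) predicts:

```python
import sympy as sp

# A: nilpotent 4x4 Jordan block J_4(0); B: rank-one perturbation J_2(0) + J_2(0)
d = 4
A = sp.zeros(d, d)
for i in range(d - 1):
    A[i, i + 1] = 1          # upper shift matrix, one Jordan block
B = A.copy()
B[1, 2] = 0                  # delete one superdiagonal entry: B = A - e2 e3^T

def ker_dim(M, k):
    # dim N(M^k) = d - rank(M^k) for a matrix (graph of an operator)
    return M.rows - (M**k).rank()

diffs = []
for n in range(d + 1):
    dA = ker_dim(A, n + 1) - ker_dim(A, n)
    dB = ker_dim(B, n + 1) - ker_dim(B, n)
    diffs.append(abs(dA - dB))
assert all(t <= 1 for t in diffs)   # the bound of Corollary 4.6(i)
```

In this example the bound 1 is attained for \(n=0,\ldots ,3\), so the estimate in item (i) is sharp already for operators.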

5 Sharpness of the Bound in Theorem 4.5

In this section we present an example which shows that the bound \(n+1\) in Theorem 4.5 can indeed be achieved and is therefore sharp. This is easy to see in the cases \(n=0\) and \(n=1\).

Example 5.1

(a) Let \(n=2\), and let \(x_0,x_1,x_2,z_0,z_1,z_2,y_1,y_2,y_3\) be linearly independent vectors in X. Define the linear relations

$$\begin{aligned} A = {\text {span}}\big \{&\{x_{2},x_{1}\},\{x_{1},x_{0}\},\{x_{0},0\},\\&\{z_{2},z_{1}\},\{z_{1},z_{0}\},\{z_{0},0\},\\&\varvec{\{}\varvec{y_3}\varvec{,}\varvec{x_{2}} \varvec{-}\varvec{y_2} \varvec{\}},\{x_{2}-y_2,y_1\},\{y_1,0\},\\&\{z_{2},y_2\}\big \} \end{aligned}$$

and

$$\begin{aligned} B = {\text {span}}\big \{&\{x_{2},x_{1}\},\{x_{1},x_{0}\},\{x_{0},0\},\\&\{z_{2},z_{1}\},\{z_{1},z_{0}\},\{z_{0},0\},\\&\{x_{2}-y_2,y_1\},\{y_1,0\},\\&\{z_{2},y_2\},\varvec{\{}\varvec{y_2}\varvec{,}\varvec{0} \varvec{\}}\big \}. \end{aligned}$$

All pairs are contained in both A and B except for the two pairs \(\varvec{\{}\varvec{y_3}\varvec{,}\varvec{x_{2}} \varvec{-}\varvec{y_2} \varvec{\}}\) and \(\varvec{\{}\varvec{y_2}\varvec{,}\varvec{0}\varvec{\}}\) which are printed here in bold face. Therefore, A and B are one-dimensional perturbations of each other. It is easy to see that \(M(A^2) = {\text {span}}\{y_2-z_1, x_1-y_1-z_0\}\) and thus \(M(A^2)\cap {\text {span}}\{x_0,z_0,y_1\}=\{0\}\). By Lemma 3.5, it follows that \([x_{2}]_A,[z_{2}]_A,[y_3]_A\) are linearly independent in \(N(A^3)/N(A^2)\). As \(N(B^2)={\text {span}}\{x_0,x_1,x_2,z_0,z_1,z_2,y_1,y_2\}\) it is clear that \(N(B^3) = N(B^2)\), hence

$$\begin{aligned} \dim \frac{N(A^3)}{N(A^2)} - \dim \frac{N(B^3)}{N(B^2)} = 3 - 0 = 3 = n+1. \end{aligned}$$
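The dimension count in part (a) can also be verified mechanically. The following sketch is our own companion computation, not part of the example: it encodes a linear relation in \(X={\mathbb {K}}^9\) as the column span of an \(18\times m\) matrix of stacked pairs \(\{u,v\}\), composes relations through a nullspace computation, and recovers \(\dim N(A^2)=6\), \(\dim N(A^3)=9\) and \(\dim N(B^2)=\dim N(B^3)=8\), hence the difference \(3=n+1\):

```python
import sympy as sp

d = 9  # basis order: x0, x1, x2, z0, z1, z2, y1, y2, y3

def e(i):
    v = sp.zeros(d, 1); v[i] = 1
    return v

x0, x1, x2, z0, z1, z2, y1, y2, y3 = (e(i) for i in range(9))
zero = sp.zeros(d, 1)

def relation(pairs):
    # span of the pairs {u, v}, stored as columns (u; v) of a 2d x m matrix
    return sp.Matrix.hstack(*[sp.Matrix.vstack(u, v) for u, v in pairs])

def compose(R, S):
    # R o S = {(x, z) : exists y with (x, y) in S and (y, z) in R}
    Xs, Ys, Xr, Yr = S[:d, :], S[d:, :], R[:d, :], R[d:, :]
    sol = sp.Matrix.hstack(Ys, -Xr).nullspace()  # all (a; b) with Ys a = Xr b
    cols = [sp.Matrix.vstack(Xs * w[:S.cols, :], Yr * w[S.cols:, :]) for w in sol]
    return sp.Matrix.hstack(*cols) if cols else sp.zeros(2 * d, 1)

def kernel_dim(R):
    # dim N(R), where N(R) = {x : (x, 0) in R}
    ker = R[d:, :].nullspace()
    return sp.Matrix.hstack(*[R[:d, :] * w for w in ker]).rank() if ker else 0

A = relation([(x2, x1), (x1, x0), (x0, zero), (z2, z1), (z1, z0), (z0, zero),
              (y3, x2 - y2), (x2 - y2, y1), (y1, zero), (z2, y2)])
B = relation([(x2, x1), (x1, x0), (x0, zero), (z2, z1), (z1, z0), (z0, zero),
              (x2 - y2, y1), (y1, zero), (z2, y2), (y2, zero)])

A2, B2 = compose(A, A), compose(B, B)
A3, B3 = compose(A, A2), compose(B, B2)
dims = (kernel_dim(A2), kernel_dim(A3), kernel_dim(B2), kernel_dim(B3))
assert dims == (6, 9, 8, 8)   # quotient dimensions differ by 3 - 0 = n + 1
```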

(b) Let \(n\in {\mathbb {N}}\), \(n>2\). For our example we need \((n+1)^2\) linearly independent vectors in the linear space X, say \(x_{i,j}\) for \(i=1,\ldots ,n\) and \(j=0,\ldots ,n\) as well as \(y_1,\ldots ,y_{n+1}\). Let us consider the linear relation

$$\begin{aligned} A =&{\text {span}}\left[ \left\{ \{x_{k,n},x_{k,n-1}\},\ldots ,\{x_{k,1},x_{k,0}\}, \{x_{k,0},0\} : k=1,\ldots ,n\right\} \,\right. \\&\cup \, \{y_{n+1},x_{1,n}-y_n\}\,\cup \,\big \{\{x_{k,n}-y_{n-k+1},x_{k+1,n}-y_{n-k}\} : k=1,\ldots ,n-2\big \}\\&\left. \cup \,\{x_{n-1, n}-y_2,y_1\}\cup \,\{y_1,0\}\cup \{x_{n,n},y_n\}\cup \big \{\{y_l,y_{l-1}\}: l=3,\ldots ,n\big \} \right] . \end{aligned}$$

Notice that

$$\begin{aligned} N(A)={\text {span}}\{x_{1,0},\ldots ,x_{n,0}, y_1\}. \end{aligned}$$

In the following we compute the multivalued part of \(A^k\) for \(k=1,\ldots , n\). Assume that \(x\in M(A)\subset R(A)\). Then \(\{0, x\}\in A\) and there exist scalars \(\alpha _{i,j}, \beta _k, \gamma _l\in {\mathbb {K}}\) such that

$$\begin{aligned} x=\sum _{i=1}^n\sum _{j=1}^n \alpha _{i,j}x_{i,j-1} + \sum _{k=1}^{n-2}\gamma _k(x_{k+1,n}-y_{n-k}) + \gamma _{n-1} y_1 + \gamma _n y_n + \beta _n (x_{1,n}-y_n) + \sum _{l=2}^{n-1}\beta _l y_l \end{aligned}$$

and

$$\begin{aligned} 0&= \sum _{i=1}^n\sum _{j=1}^n \alpha _{i,j}x_{i,j}+ \sum _{k=1}^{n-2}\gamma _k(x_{k,n}-y_{n-k+1}) + \gamma _{n-1}(x_{n-1,n}-y_2) + \gamma _n x_{n,n} + \sum _{l=2}^{n}\beta _l y_{l+1} \\&= \sum _{i=1}^n (\alpha _{i,n} + \gamma _i)x_{i,n} + \sum _{i=1}^n\sum _{j=1}^{n-1} \alpha _{i,j}x_{i,j} + \beta _n y_{n+1} + \sum _{k=1}^{n-2}(\beta _{n-k}-\gamma _k) y_{n-k+1} - \gamma _{n-1}y_2. \end{aligned}$$

Therefore,

$$\begin{aligned} \left\{ \begin{array}{ll} \alpha _{i,n} + \gamma _i=0 &{} \text {for}\ i=1,\ldots ,n,\\ \alpha _{i,j}=0&{} \text {for}\ i=1,\ldots ,n, \ j=1,\ldots , n-1, \\ \beta _n=0, &{} \\ \gamma _k-\beta _{n-k}=0 &{} \text {for}\ k=1,\ldots , n-2,\\ \gamma _{n-1}=0. &{} \end{array} \right. \end{aligned}$$

Hence, we can rewrite the vector x as

$$\begin{aligned} x&= \sum _{i=1}^{n-2}\alpha _{i,n}x_{i,n-1} + \alpha _{n,n}x_{n,n-1} + \sum _{k=1}^{n-2}\gamma _k(x_{k+1,n}-y_{n-k}) + \gamma _n y_n + \sum _{l=2}^{n-1}\beta _l y_l \\&=\sum _{k=1}^{n-2}\gamma _k (x_{k+1,n}-x_{k,n-1}) + \gamma _n (y_n-x_{n,n-1}). \end{aligned}$$

Thus,

$$\begin{aligned} M(A)={\text {span}}\left( \{y_n-x_{n,n-1}\}\cup \big \{x_{k+1,n}-x_{k,n-1}: \ k=1,\ldots , n-2\big \}\right) . \end{aligned}$$

If \(x\in M(A^2)\), then there exists \(y\in M(A)\) such that \(\{y,x\}\in A\). Hence, if \(y=\sum _{k=1}^{n-2}\alpha _k (x_{k+1,n}-x_{k,n-1}) + \alpha _{n-1}(y_n-x_{n,n-1})\) then

$$\begin{aligned} x-\sum _{k=1}^{n-2}\alpha _k (x_{k+1,n-1} - x_{k,n-2}) -\alpha _{n-1}(y_{n-1}-x_{n,n-2}) \in M(A). \end{aligned}$$

Therefore,

$$\begin{aligned} M(A^2)={\text {span}}&\left( \{y_n-x_{n,n-1}\}\cup \big \{x_{k+1,n}-x_{k,n-1}: \ k=1,\ldots , n-2\big \}\right. \\&\cup \left. \{y_{n-1}-x_{n,n-2}\}\cup \big \{x_{k+1,n-1}-x_{k,n-2}: \ k=1,\ldots , n-2\big \}\right) . \end{aligned}$$

Following the same arguments it can be shown that

$$\begin{aligned} M(A^{n-1})={\text {span}}&\left( \big \{x_{k+1,n-j}-x_{k,n-j-1}: \ k=1,\ldots , n-2,\ j=0,\ldots , n-2\big \}\right. \\&\cup \left. \big \{y_{n-j}-x_{n,n-j-1}: \ j=0,\ldots , n-2\big \}\right) \end{aligned}$$

and

$$\begin{aligned} M(A^n)={\text {span}}&\left( \big \{x_{k+1,n-j}-x_{k,n-j-1}: \ k=1,\ldots , n-2,\ j=0, \ldots , n-1\big \} \right. \\&\cup \left. \big \{y_{n-j}-x_{n,n-j-1}: \ j=0,\ldots , n-2\big \}\cup \{x_{n-1,n-1}-y_1-x_{n,0}\}\right) . \end{aligned}$$

From this it follows that

$$\begin{aligned} {\text {span}}\{x_{1,0},\ldots ,x_{n,0}, y_1\}\cap M(A^n)=\{0\}. \end{aligned}$$
(5.1)

Indeed, if x is a vector contained in the subspace on the left-hand side of (5.1), then

$$\begin{aligned} x = \alpha _1x_{1,0} + \dots + \alpha _nx_{n,0} + \alpha _{n+1}y_1 = \sum _{k=1}^{n-2}\beta _k(x_{k+1,1} - x_{k,0}) + \gamma (x_{n-1,n-1}-y_1-x_{n,0}), \end{aligned}$$

where \(\alpha _j,\beta _k,\gamma \in {\mathbb {K}}\) for \(j=1,\ldots ,n+1\) and \(k=1,\ldots ,n-2\). This implies

$$\begin{aligned}&\sum _{k=1}^{n-2}(\alpha _k + \beta _k)x_{k,0} + \alpha _{n-1}x_{n-1,0} + (\alpha _n+\gamma )x_{n,0} + (\alpha _{n+1}+\gamma )y_1\\&\qquad - \sum _{k=1}^{n-2}\beta _kx_{k+1,1} - \gamma x_{n-1,n-1} = 0. \end{aligned}$$

Since all the vectors involved are by assumption linearly independent, it follows that \(\gamma = 0\) and also \(\beta _k = 0\) for \(k=1,\ldots ,n-2\) and thus also \(\alpha _j = 0\) for all \(j=1,\ldots ,n+1\). That is, \(x = 0\).

Now, it follows from (5.1) and Lemma 3.5 that \([x_{1,n}]_A,\ldots ,[x_{n,n}]_A, [y_{n+1}]_A\) are linearly independent in \(N(A^{n+1})/N(A^n)\). On the other hand, if we consider the linear relation

$$\begin{aligned} B&={\text {span}}\left( \big \{ \{x_{k,j}, x_{k,j-1}\} : k=1,\ldots ,n,\,j=1,\ldots ,n \big \} \cup \, \big \{\{x_{k,0},0\} : k=1,\ldots ,n \big \}\right. \\&\cup \, \big \{\{x_{k,n}-y_{n-k+1},x_{k+1,n}-y_{n-k}\} : k=1,\ldots ,n-2\big \}\cup \{x_{n-1, n}-y_2,y_1\}\cup \,\{y_1,0\}\\&\left. \cup \big \{ \{x_{n,n},y_n\}, \{y_n,y_{n-1}\},\ldots ,\{y_3,y_2\}, \{y_2,0\}\big \} \right) , \end{aligned}$$

then A and B are one-dimensional perturbations of each other. Also, it is straightforward to verify that \(D(B) = N(B^n)\). In particular, \(N(B^{n+1}) = N(B^n)\), so that

$$\begin{aligned} \dim \frac{N(A^{n+1})}{N(A^n)} - \dim \frac{N(B^{n+1})}{N(B^n)} = n+1 - 0 = n+1, \end{aligned}$$

which shows that the worst possible bound is indeed achieved in this example.

6 Finite-Dimensional Perturbations

A linear relation B is a finite-dimensional perturbation of another linear relation A if both differ from their intersection by finitely many dimensions. Following [2], we formalize this idea as follows.

Definition 6.1

Let A and B be linear relations in X and \(p\in {\mathbb {N}}\). Then B is called a p-dimensional perturbation of A (and vice versa) if

$$\begin{aligned} \max \left\{ \dim \frac{A}{A\cap B},\,\dim \frac{B}{A\cap B}\right\} = p. \end{aligned}$$

Remark 6.2

Let A and B be linear relations in X which are p-dimensional perturbations of each other, \(p>1\). Then it is possible to construct a sequence of one-dimensional perturbations starting at A and ending at B. Indeed, choose \(\{{\widehat{f}}_1,\ldots , \widehat{f}_p\}\) and \(\{{\widehat{g}}_1,\ldots ,{\widehat{g}}_p\}\) in \(X\times X\) such that

$$\begin{aligned} A = (A\cap B)\dotplus {\text {span}}\{{\widehat{f}}_1,\ldots ,{\widehat{f}}_{p}\}\quad \text{ and } \quad B= (A\cap B)\dotplus {\text {span}}\{\widehat{g}_1,\ldots ,{\widehat{g}}_{p}\}. \end{aligned}$$

Observe that \(\{{\widehat{f}}_1,\ldots ,{\widehat{f}}_p\}\) is linearly independent if and only if \( \dim \frac{A}{A\cap B}=p\). Otherwise, some of the elements of \(\{{\widehat{f}}_1,\ldots ,{\widehat{f}}_p\}\) can be chosen to be zero. An analogous statement holds for \(\{\widehat{g}_1,\ldots ,{\widehat{g}}_p\}\). Define \(C_0:=A\), \(C_p:=B\), and

$$\begin{aligned} C_k:= (A\cap B)\dotplus {\text {span}}\{{\widehat{f}}_1,\ldots ,{\widehat{f}}_{p-k}, {\widehat{g}}_{p-k+1},\ldots ,{\widehat{g}}_p\},\quad k=1,\ldots ,p-1. \end{aligned}$$

Obviously, \(C_{k+1}\) is a one-dimensional perturbation of \(C_k\), \(k=0,\ldots ,p-1\). If, in addition, \(A\subset B\) is satisfied, then \({\widehat{f}}_{j}=0\) for all \(j=1,\ldots ,p\), and we obtain

$$\begin{aligned} A\subset C_j \subset C_{j+1} \subset B \quad \text{ for } j=1,\ldots ,p-1. \end{aligned}$$

Theorem 6.3

Let A and B be linear relations in X such that B is a p-dimensional perturbation of A, \(p\ge 1\), and \(n\in {\mathbb {N}}\cup \{0\}\). Then the following statements hold:

  1. (i)

    \(N(A^{n+1})/N(A^n)\) is finite-dimensional if and only if \(N(B^{n+1})/N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim \frac{N(A^{n+1})}{N(A^n)} - \dim \frac{N(B^{n+1})}{N(B^n)}\right| \,\le \,(n+1)p. \end{aligned}$$
  2. (ii)

    If, in addition to the assumptions in item (i), \(A\subset B\) is satisfied, then for \(n\ge 1\) we have

    $$\begin{aligned} \left| \dim \frac{N(A^{n+1})}{N(A^n)} - \dim \frac{N(B^{n+1})}{N(B^n)}\right| \,\le \,np. \end{aligned}$$
  3. (iii)

    \(N(A^n)\) is finite-dimensional if and only if \(N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim N(A^n) - \dim N(B^n)\right| \,\le \,\frac{n(n+1)}{2}p. \end{aligned}$$
  4. (iv)

    If, in addition to the assumptions in item (iii), \(A\subset B\) is satisfied, then for \(n\ge 1\) we have

    $$\begin{aligned} \left| \dim N(A^n) - \dim N(B^n)\right| \,\le \,\frac{n(n-1)}{2}p+p. \end{aligned}$$

Proof

By Remark 6.2 there exist linear relations \(C_0,\ldots ,C_p\) in X with \(C_0 = A\) and \(C_p = B\) such that \(C_{k+1}\) is a one-dimensional perturbation of \(C_k\), \(k=0,\ldots ,p-1\). Hence, applying item (i) in Theorem 4.5 repeatedly, we obtain

$$\begin{aligned} \left| \dim \frac{N(B^{n+1})}{N(B^n)} - \dim \frac{N(A^{n+1})}{N(A^n)}\right| \le \sum _{k=0}^{p-1}\left| \dim \frac{N(C_{k+1}^{n+1})}{N(C_{k+1}^n)} - \dim \frac{N(C_k^{n+1})}{N(C_k^n)}\right| \,\le \,(n+1)p. \end{aligned}$$

Also, applying item (ii) in Theorem 4.5 repeatedly,

$$\begin{aligned} \left| \dim N(A^n) - \dim N(B^n)\right| \le \sum _{k=0}^{p-1}\left| \dim N(C_{k+1}^n) - \dim N(C_k^n)\right| \,\le \,\frac{n(n+1)}{2}p, \end{aligned}$$

which shows (iii). Statements (ii) and (iv), where \(A\subset B\), follow in the same way from Remark 6.2 and Theorem 4.4. \(\square \)

For linear relations A and B without singular chains we obtain the same (sharp) estimates as for operators, see [5].

Corollary 6.4

Let A and B be linear relations in X without singular chains such that B is a p-dimensional perturbation of A, \(p\ge 1\), and let \(n\in {\mathbb {N}}\cup \{0\}\). Then the following statements hold:

  1. (i)

    \(N(A^{n+1})/N(A^n)\) is finite-dimensional if and only if \(N(B^{n+1})/N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim \frac{N(A^{n+1})}{N(A^n)} - \dim \frac{N(B^{n+1})}{N(B^n)}\right| \,\le \,p. \end{aligned}$$
  2. (ii)

    \(N(A^n)\) is finite-dimensional if and only if \(N(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim N (A^n) - \dim N(B^n)\right| \,\le \,np. \end{aligned}$$
  3. (iii)

    \(N(A)\cap R(A^n)\) is finite-dimensional if and only if \(N(B)\cap R(B^n)\) is finite-dimensional. Moreover,

    $$\begin{aligned} \left| \dim (N(A)\cap R(A^n)) - \dim (N(B)\cap R(B^n))\right| \,\le \,p. \end{aligned}$$

Proof

The claims follow by applying the results of Corollary 4.6 repeatedly to the finite sequence of one-dimensional perturbations \(A=C_0, C_1,\ldots , C_p=B\) from Remark 6.2. \(\square \)
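As with Corollary 4.6, these bounds can be checked on matrices. A sketch with hypothetical data of our own choice: A is the single Jordan block \(J_6(0)\) and B removes two superdiagonal entries of A, a rank-two (\(p=2\)) perturbation producing \(J_2(0)\oplus J_2(0)\oplus J_2(0)\):

```python
import sympy as sp

# A: one Jordan block J_6(0); B = A - e2 e3^T - e4 e5^T, a p = 2 perturbation
d, p = 6, 2
A = sp.zeros(d, d)
for i in range(d - 1):
    A[i, i + 1] = 1
B = A.copy()
B[1, 2] = 0
B[3, 4] = 0                  # B now consists of three blocks J_2(0)

def ker_dim(M, k):
    # dim N(M^k) = d - rank(M^k)
    return M.rows - (M**k).rank()

for n in range(d + 1):
    dA = ker_dim(A, n + 1) - ker_dim(A, n)
    dB = ker_dim(B, n + 1) - ker_dim(B, n)
    assert abs(dA - dB) <= p                             # Corollary 6.4(i)
    assert abs(ker_dim(A, n) - ker_dim(B, n)) <= n * p   # Corollary 6.4(ii)
```

For \(n=0,1\) the difference of the quotient dimensions equals \(p=2\), so the bound in item (i) is attained.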

7 Rank-One Perturbations of Matrix Pencils

In this section we apply our results to matrix pencils P of the form

$$\begin{aligned} P(s):= sE-F, \end{aligned}$$

where \(s\in {\mathbb {C}}\) and E, F are square matrices in \(\mathbb C^{d\times d}\). We estimate the change in the number of Jordan chains of P under perturbation by a rank-one matrix pencil.

We do not assume E to be invertible. Nevertheless, if we identify E with the linear relation given by the graph of E, then we have an inverse \(E^{-1}\) of E in the sense of linear relations, see (2.1). Also, we have that

$$\begin{aligned} E^{-1}F = \left\{ \{x,y\}\in {\mathbb {C}}^d\times {\mathbb {C}}^d : Fx=Ey \right\} = N [F \ -E]. \end{aligned}$$
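Concretely, a basis of the relation \(E^{-1}F\) can be read off from the nullspace of the block matrix \([F \ -E]\). A minimal sketch with a hypothetical \(2\times 2\) pencil (our own data) whose leading coefficient E is singular:

```python
import sympy as sp

# Hypothetical pencil P(s) = sE - F with singular E; the relation E^{-1}F is
# the nullspace of [F  -E] in C^(2d), each basis vector splitting into {x, y}.
E = sp.Matrix([[1, 0], [0, 0]])
F = sp.Matrix([[0, 1], [1, 0]])
d = E.rows
basis = sp.Matrix.hstack(F, -E).nullspace()     # Fx = Ey  <=>  [F -E](x; y) = 0
pairs = [(v[:d, :], v[d:, :]) for v in basis]
for x, y in pairs:
    assert F * x == E * y                       # every pair satisfies Fx = Ey
```

Here \(\dim E^{-1}F = 2d - {\text {rank}}[F \ -E] = 2\); moreover \(\{0,y\}\in E^{-1}F\) exactly when \(Ey=0\), so the singularity of E produces the non-trivial multivalued part \(M(E^{-1}F)=N(E)\).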

Recall that \(\lambda \in \mathbb {C}\) is an eigenvalue of \(P(s)=sE-F\) if zero is an eigenvalue of \(P(\lambda )\), and \(\infty \) is an eigenvalue of P if zero is an eigenvalue of the dual matrix pencil \(({{\,\mathrm{rev}\,}}P)(s)=sF-E\). In the following we recall the notion of Jordan chains for matrix pencils, see e.g. [24, Section 1.4], [30], or [35, §11.2].

Definition 7.1

An ordered set \((x_n, \ldots ,x_{0})\) in \({\mathbb {C}}^d\) is a Jordan chain of length \(n+1\) at \(\lambda \in \overline{{\mathbb {C}}}:={\mathbb {C}}\cup \{\infty \}\) (for the matrix pencil P(s)) if \(x_{0}\ne 0\) and

$$\begin{aligned} \begin{array}{lrrrr} \lambda \in {\mathbb {C}}: &{} (F -\lambda E)x_0=0, &{} (F-\lambda E)x_1 = E x_0, &{} \ldots , &{} (F-\lambda E)x_n = E x_{n-1}, \\[2mm] \lambda =\infty : &{} E x_0 = 0, &{} E x_1 = F x_0, &{} \ldots ,&{} E x_n = F x_{n-1}. \end{array} \end{aligned}$$

Moreover, we denote by \({\mathcal {L}}_{\lambda }^l(P)\) the subspace spanned by the vectors of all Jordan chains up to length \(l\ge 1\) at \(\lambda \in \overline{{\mathbb {C}}}\). If \(l=0\) or if \(\lambda \) is not an eigenvalue of P we define \({\mathcal {L}}_{\lambda }^l(P)=\{0\}\).
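Definition 7.1 translates directly into a finite test. The helper below is a hypothetical sketch (not from the text) that checks the chain conditions, reducing \(\lambda =\infty \) to the dual pencil at zero exactly as in the definition:

```python
import sympy as sp

def is_jordan_chain(E, F, chain, lam):
    """chain = (x_n, ..., x_0) as sympy column vectors; lam in C, or sp.oo."""
    if lam == sp.oo:
        E, F, lam = F, E, 0          # a chain at infinity is a chain of rev P at 0
    xs = list(chain)[::-1]           # reorder to x_0, x_1, ..., x_n
    if xs[0] == sp.zeros(*xs[0].shape):
        return False                 # the definition requires x_0 != 0
    prev = sp.zeros(*xs[0].shape)
    for x in xs:                     # (F - lam E) x_k = E x_{k-1}, with x_{-1} := 0
        if (F - lam * E) * x != E * prev:
            return False
        prev = x
    return True

# P(s) = s*I - N with N the 2x2 nilpotent Jordan block: (e2, e1) is a chain at 0,
# and for the pencil s*N - I the same vectors form a chain at infinity.
N = sp.Matrix([[0, 1], [0, 0]])
e1, e2 = sp.Matrix([1, 0]), sp.Matrix([0, 1])
assert is_jordan_chain(sp.eye(2), N, (e2, e1), 0)
assert is_jordan_chain(N, sp.eye(2), (e2, e1), sp.oo)
```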

Remark 7.2

As mentioned above, Definition 7.1 is inspired by the definition of eigenvalues of pencils introduced in 1951 by M.V. Keldysh, who used this concept in his study of operator pencils, see [30, 35]. This definition fits the definition of eigenvalues of linear relations and thus suits the purpose of this paper.

The authors are aware of the fact that in many recent publications in the matrix pencil community, see for instance [4, 13,14,15, 37, 44], a different definition of eigenvalues of matrix pencils is used, which is based on changes in the rank of P(s). What is called an eigenvalue in our paper is there sometimes called a singular point. However, that concept is not used in the community of linear relations, so we warn the reader of possible misunderstandings.

Given a matrix pencil P(s), the aim of this section is to obtain lower and upper bounds for the difference

$$\begin{aligned} \dim \frac{{\mathcal {L}}_\lambda ^{n+1}(P +Q)}{{\mathcal {L}}_\lambda ^{n}(P+Q)} - \dim \frac{{\mathcal {L}}_\lambda ^{n+1}(P)}{{\mathcal {L}}_\lambda ^{n}(P)}, \end{aligned}$$

where Q is a rank-one matrix pencil, \(n\in {\mathbb {N}}\cup \{0\}\) and \(\lambda \in \overline{\mathbb {C}}\).

We start with a simple lemma, which follows directly from the definitions. It allows us to reduce the study of Jordan chains at some \(\lambda \in \overline{{\mathbb {C}}}\) to Jordan chains at zero.

Lemma 7.3

Given a matrix pencil \(P(s)=sE-F\), the following statements hold:

  1. (i)

    \((x_n, \ldots ,x_{0})\) is a Jordan chain of P at \(\lambda \in {\mathbb {C}}\) if and only if it is a Jordan chain of the matrix pencil \({\tilde{P}}(s):=sE-(F-\lambda E)\) at zero.

  2. (ii)

    \((x_n, \ldots ,x_{0})\) is a Jordan chain of P(s) at \(\infty \) if and only if it is a Jordan chain of the dual matrix pencil \(({{\,\mathrm{rev}\,}}P)(s):= sF- E\) at zero.

The following proposition shows that the Jordan chains of the matrix pencil P(s) coincide with the Jordan chains of the linear relation \(E^{-1}F\). As the proof is simple and straightforward, we omit it.

Proposition 7.4

For \(n\in {\mathbb {N}}\cup \{0\}\) and \(\lambda \in \overline{{\mathbb {C}}}\) the following two statements are equivalent.

  1. (i)

    \((x_n, \ldots ,x_{0})\) is a Jordan chain of P at \(\lambda \).

  2. (ii)

    \((x_n, \ldots ,x_{0})\) is a quasi-Jordan chain of \(E^{-1}F\) at \(\lambda \).

In particular, for \(\lambda \in {\mathbb {C}}\) we have

$$\begin{aligned} {\mathcal {L}}_\lambda ^n(P) = N((E^{-1}F - \lambda )^n). \end{aligned}$$

Note that the quasi-Jordan chains of a linear relation A at \(\infty \) are the same as the quasi-Jordan chains of the inverse linear relation \(A^{-1}\) at zero. Moreover, it is easy to see that \(E^{-1}F=(F^{-1}E)^{-1}\). Therefore, we obtain the following corollary.

Corollary 7.5

\((x_n, \ldots ,x_{0})\) is a Jordan chain of \(P(s)=sE-F\) at \(\infty \) if and only if \((x_n, \ldots ,x_{0})\) is a quasi-Jordan chain of \(F^{-1}E\) at zero. In particular,

$$\begin{aligned} {\mathcal {L}}_\infty ^n(P) = M((E^{-1}F)^n)=N((F^{-1}E)^n)={\mathcal {L}}_0^n({{\,\mathrm{rev}\,}}P). \end{aligned}$$

Due to Proposition 7.4, for \(n\in {\mathbb {N}}\cup \{0\}\) and \(\lambda \in {\mathbb {C}}\) we have

$$\begin{aligned} \dim \frac{{\mathcal {L}}_\lambda ^{n+1}(P)}{{\mathcal {L}}_\lambda ^{n}(P)} = \dim \frac{N((E^{-1}F-\lambda )^{n+1})}{N((E^{-1}F-\lambda )^{n})}. \end{aligned}$$

On the other hand, Corollary 7.5 implies that

$$\begin{aligned} \dim \frac{{\mathcal {L}}_\infty ^{n+1}(P)}{{\mathcal {L}}_\infty ^{n}(P)} = \dim \frac{N((F^{-1}E)^{n+1})}{N((F^{-1}E)^{n})}. \end{aligned}$$

For a given matrix pencil \(P(s)=sE-F\) we now consider perturbations of the form

$$\begin{aligned} Q(s)=w(su^*-v^*), \end{aligned}$$
(7.1)

where \(u,v,w\in \mathbb {C}^d\), \((u,v)\ne (0,0)\) and \(w\ne 0\). These are rank-one matrix pencils. Recall that the rank of a matrix pencil Q is the largest \(r\in {\mathbb {N}}\) such that Q, viewed as a matrix with polynomial entries, has minors of size r that are not identically zero [14, 20]. Then, P and \(P+Q\) are rank-one perturbations of each other, in the sense that they differ by (at most) a rank-one matrix pencil.

Lemma 7.6

Given \(P(s)=sE-F\), let Q be a rank-one matrix pencil as in (7.1). Then, the linear relations

$$\begin{aligned} E^{-1}F \quad \text{ and } \quad \left( E+wu^*\right) ^{-1}(F+wv^*) \end{aligned}$$

either coincide or they are one-dimensional perturbations of each other in the sense of Definition 6.1.

Proof

Obviously, for \({\mathcal {M}}:= E^{-1}F \cap \left( E+wu^*\right) ^{-1}(F+wv^*)\) we have

$$\begin{aligned} {\mathcal {M}} = \left\{ \{x,y\} \in {\mathbb {C}}^d\times {\mathbb {C}}^d : Fx=Ey\; \text{ and } \; (F+wv^*)x=(E+wu^*)y \right\} . \end{aligned}$$

That is,

$$\begin{aligned} {\mathcal {M}} = E^{-1}F \cap \{v, -u\}^\bot = \left( E+wu^*\right) ^{-1}(F+wv^*) \cap \{v, -u\}^\bot . \end{aligned}$$

This implies

$$\begin{aligned} \dim \frac{E^{-1}F}{{\mathcal {M}}}\le 1 \qquad \text{ and }\qquad \dim \frac{\left( E+wu^*\right) ^{-1}(F+wv^*)}{{\mathcal {M}}}\le 1, \end{aligned}$$

which proves the claim. \(\square \)
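The proof identifies the relation \(E^{-1}F\) with the subspace \(\{\{x,y\}: Fx=Ey\}\) of \({\mathbb {C}}^d\times {\mathbb {C}}^d\), i.e. with the null space of the block matrix \([F\ \ {-E}]\). The codimension bounds for \({\mathcal {M}}\) can then be checked numerically; the following sketch uses randomly generated (hence generic) real data:

```python
import numpy as np

def null_space(M, tol=1e-10):
    # orthonormal basis of N(M) via SVD
    _, s, Vh = np.linalg.svd(M)
    return Vh[int(np.sum(s > tol)):].conj().T

rng = np.random.default_rng(0)
d = 4
E, F = rng.standard_normal((d, d)), rng.standard_normal((d, d))
u, v, w = rng.standard_normal(d), rng.standard_normal(d), rng.standard_normal(d)

# E^{-1}F as a subspace of C^d x C^d: the null space of [F  -E]
R  = null_space(np.hstack([F, -E]))
Rp = null_space(np.hstack([F + np.outer(w, v), -(E + np.outer(w, u))]))

# M = E^{-1}F ∩ {v,-u}^⊥: impose the extra linear condition v^T x - u^T y = 0
c  = np.concatenate([v, -u])[None, :]
M  = R  @ null_space(c @ R)
Mp = Rp @ null_space(c @ Rp)
print(R.shape[1] - M.shape[1], Rp.shape[1] - Mp.shape[1])   # both at most 1
```

Cutting out the single linear condition \(v^*x=u^*y\) drops the dimension of each relation by at most one, which is exactly the claim of the lemma.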

Matrix pencils as in (7.1) do not cover the set of all rank-one matrix pencils in \({\mathbb {C}}^d\). The remaining rank-one matrix pencils can be written as

$$\begin{aligned} Q(s)=(su-v)w^*, \end{aligned}$$
(7.2)

where \(u,v,w\in {\mathbb {C}}^d\) are such that \((u,v)\ne (0,0)\) and \(w\ne 0\). Given \(P(s)=sE-F\) and a rank-one pencil Q of the form (7.2), the associated linear relations \(E^{-1}F\) and \((E+uw^*)^{-1}(F+vw^*)\) can be two-dimensional perturbations of each other. Hence, the statements in Lemma 7.6 are not valid for rank-one matrix pencils of the form (7.2).

On the other hand, the linear relations \(FE^{-1}\) and \((F+vw^*)(E+uw^*)^{-1}\) are (at most) one-dimensional perturbations of each other in the sense of Definition 6.1. A deeper analysis of the correspondence between matrix pencils and their representing linear relations will be provided in the forthcoming manuscript [22], where the Segre and Weyr characteristics for linear relations are introduced. Those results will then give rise to sharp estimates for quantities similar to the ones above, valid for all one-dimensional perturbations.

Remark 7.7

Applying Lemma 7.6 to the dual matrix pencils \({{\,\mathrm{rev}\,}}P\) and \({{\,\mathrm{rev}\,}}Q\), it follows that

$$\begin{aligned} F^{-1}E \ \ \ \text {and} \ \ \ (F + wv^*)^{-1}(E+wu^*) \end{aligned}$$

either coincide or they are one-dimensional perturbations of each other in the sense of Definition 6.1.

The following theorem is the second main result of this article. We consider here all possible situations of regular/singular matrix pencils P and \(P+Q\). Recall that a matrix pencil \(P(s)=sE-F\) is called regular if \(\det (sE-F)\) is not identically zero. Otherwise, P is called singular.

Theorem 7.8

Given \(P(s)=sE-F\), let Q be a rank-one matrix pencil as in (7.1). For \(\lambda \in \overline{{\mathbb {C}}}\) and \(n\in {\mathbb {N}}\cup \{0\}\), the following statements hold:

  1. (i)

    If both pencils P and \(P+Q\) are regular, then

    $$\begin{aligned} \left| \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P+Q)}{\mathcal {L}_{\lambda }^n(P+Q)}- \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P)}{\mathcal {L}_{\lambda }^{n}(P)}\right| \le 1. \end{aligned}$$
  2. (ii)

    If P is regular but \(P+Q\) is singular, then

    $$\begin{aligned} -1-n \le \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P+Q)}{\mathcal {L}_{\lambda }^n(P+Q)}- \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P)}{\mathcal {L}_{\lambda }^{n}(P)} \le 1. \end{aligned}$$
  3. (iii)

    If P is singular and \(P+Q\) is regular, then

    $$\begin{aligned} -1 \le \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P+Q)}{\mathcal {L}_{\lambda }^n(P+Q)}- \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P)}{\mathcal {L}_{\lambda }^{n}(P)} \le n+1. \end{aligned}$$
  4. (iv)

    If both P and \(P+Q\) are singular, then

    $$\begin{aligned} \left| \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P+Q)}{\mathcal {L}_{\lambda }^n(P+Q)}- \dim \frac{\mathcal {L}_{\lambda }^{n+1}(P)}{\mathcal {L}_{\lambda }^{n}(P)}\right| \le n+1. \end{aligned}$$

Proof

According to Lemma 7.3, if \(\lambda \in {\mathbb {C}}\) we may assume \(\lambda = 0\). By Proposition 7.4, for \(n\in {\mathbb {N}}\cup \{0\}\) we have that

$$\begin{aligned} \mathcal {L}_{0}^{n}(P) = N\big ((E^{-1}F)^n\big ) \quad \text{ and } \quad \mathcal {L}_{0}^{n}(P+Q) = N(B^n), \end{aligned}$$

where \(B:=(E+wu^*)^{-1}(F+wv^*)\). Due to Lemma 7.6 the linear relations \(E^{-1}F\) and B are (at most) one-dimensional perturbations of each other and, by Theorem 4.5,

$$\begin{aligned} -1-s_{n}(B,E^{-1}F)\le \,\dim \frac{\mathcal {L}_{0}^{n+1}(P+Q)}{\mathcal {L}_{0}^n(P+Q)}- \dim \frac{\mathcal {L}_{0}^{n+1}(P)}{\mathcal {L}_{0}^{n}(P)}\le 1 + s_n(E^{-1}F,B). \end{aligned}$$

Then, Proposition 4.3 implies statement (iv). If the pencil P is regular then, by definition, not every complex number is an eigenvalue of P. Hence, by Proposition 7.4, those numbers are not eigenvalues of \(E^{-1}F\) either. From [41] it follows that, in this case, \(E^{-1}F\) has no singular chains and we conclude that

$$\begin{aligned} s_n(E^{-1}F,B)=0, \end{aligned}$$

see (4.1). Similarly, if \(P+Q\) is regular we obtain \(s_{n}(B,E^{-1}F)=0\), which shows the remaining statements (i)–(iii).

For \(\lambda =\infty \) similar arguments apply, using \(F^{-1}E\) and \(C:=(F+wv^*)^{-1}(E+wu^*)\) instead of \(E^{-1}F\) and B; see Corollary 7.5 and Remark 7.7. \(\square \)

Note that the estimate in item (i) of Theorem 7.8 is already known. It was shown in [14, Lemma 2.1] with the help of a result for polynomials, see also [45, Theorem 1]. The remaining estimates in Theorem 7.8 are new.
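For a concrete regular pair \(P\), \(P+Q\), the bound in item (i) can be checked directly by comparing kernel chains of \(E^{-1}F\) and \((E+wu^*)^{-1}(F+wv^*)\), in the spirit of Proposition 7.4. A sketch with illustrative data (here \(u=0\), so \(Q(s)=-wv^*\) is a constant rank-one pencil, and both pencils are regular since both E-coefficients equal the identity):

```python
import numpy as np

def quotient_dim(M, n):
    # dim N(M^{n+1}) / N(M^n), computed via ranks of matrix powers
    d = M.shape[0]
    kd = lambda k: d - np.linalg.matrix_rank(np.linalg.matrix_power(M, k))
    return kd(n + 1) - kd(n)

# P(s) = sE - F with E = I, so P is regular; eigenvalue lambda = 0
E = np.eye(4)
F = np.zeros((4, 4)); F[0, 1] = 1.0    # Jordan blocks at 0 of sizes 2, 1, 1
u = np.zeros(4)                        # u = 0: Q(s) = -w v^* is constant
v = np.array([0., 0., 1., 0.])
w = np.array([0., 0., 1., 0.])
A = np.linalg.solve(E, F)                                   # E^{-1} F
B = np.linalg.solve(E + np.outer(w, u), F + np.outer(w, v)) # perturbed relation
for n in range(3):
    assert abs(quotient_dim(B, n) - quotient_dim(A, n)) <= 1
print("estimate (i) holds for n = 0, 1, 2")
```

Here the perturbation moves one Jordan block of size 1 away from the eigenvalue 0, and each quotient dimension changes by at most one.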

Example 7.9

In this and the following section we focus on matrix pencils, but most of the statements remain true if we consider operator pencils of the form

$$\begin{aligned} Z(s) := sE-F, \end{aligned}$$

where E and F are linear and bounded operators in some Hilbert space X. If E and F are compact operators, then Z(s) is a Keldysh pencil, see [30].

Assume that E and F are bounded operators. One defines eigenvalues and Jordan chains as in Definition 7.1, and it is easily seen that Lemma 7.3, Proposition 7.4, Corollary 7.5 and Lemma 7.6 also hold, as they are based on algebraic properties only. Here, for vectors \(u,v,w\) in X with \((u,v)\ne (0,0)\) and \(w\ne 0\), the perturbation Q(s) is defined as

$$\begin{aligned} Q(s)x = w(s\langle x,u\rangle -\langle x,v\rangle ), \qquad x\in X. \end{aligned}$$

Here \(\langle \cdot \,,\cdot \rangle \) stands for the Hilbert space scalar product in X. Then a straightforward application of Theorem 4.5 (see also Theorem 7.8) gives for \(\lambda \in {\mathbb {C}}\)

$$\begin{aligned} \left| \dim \frac{\mathcal {L}_{\lambda }^{n+1}(Z+Q)}{\mathcal {L}_{\lambda }^n(Z+Q)}- \dim \frac{\mathcal {L}_{\lambda }^{n+1}(Z)}{\mathcal {L}_{\lambda }^{n}(Z)}\right| \le n+1, \end{aligned}$$
(7.3)

if \(\frac{\mathcal {L}_{\lambda }^{n+1}(Z)}{\mathcal {L}_{\lambda }^{n}(Z)}\) is of finite dimension. The estimate in (7.3) seems to be new for operator pencils. Moreover, in this setting essential spectrum may also be present. We do not go into details here, but the above setting also allows one to treat the essential spectrum; we refer to [23] for related considerations.

Remark 7.10

In the following we present estimates for so-called Wong sequences, which have their origin in [46]. Recently, Wong sequences have been used to derive the Kronecker canonical form, see [8,9,10]. For \(E,F\in {\mathbb {C}}^{d\times d}\) the Wong sequence of the second kind of the pencil \(P(s) := sE-F\) is defined as the sequence of subspaces \(({\mathcal {W}}_i(P))_{i\in {\mathbb {N}}\cup \{0\}}\) given by

$$\begin{aligned} {\mathcal {W}}_0(P) = \{0\},\qquad {\mathcal {W}}_{i+1}(P) = \left\{ x\in {\mathbb {C}}^d : Ex\in F{\mathcal {W}}_i(P)\right\} ,\quad i\in {\mathbb {N}}\cup \{0\}. \end{aligned}$$

It is easily seen by induction that for \(n\in {\mathbb {N}}\) we have

$$\begin{aligned} {\mathcal {W}}_n(P) = N\big ((F^{-1}E)^n\big ). \end{aligned}$$

Theorem 4.5 now yields the following statements on the behavior of the Wong sequences of the second kind under rank-one perturbations of the type (7.1):

  1. (i)

    If both pencils P and \(P+Q\) are regular, then

    $$\begin{aligned} \left| \dim \frac{{\mathcal {W}}_{n+1}(P+Q)}{\mathcal {W}_n(P+Q)}- \dim \frac{\mathcal {W}_{n+1}(P)}{\mathcal {W}_{n}(P)}\right| \le 1. \end{aligned}$$
  2. (ii)

    If P is regular but \(P+Q\) is singular, then

    $$\begin{aligned} -1-n \le \dim \frac{\mathcal {W}_{n+1}(P+Q)}{\mathcal {W}_n(P+Q)}- \dim \frac{\mathcal {W}_{n+1}(P)}{\mathcal {W}_{n}(P)} \le 1. \end{aligned}$$
  3. (iii)

    If P is singular and \(P+Q\) is regular, then

    $$\begin{aligned} -1 \le \dim \frac{\mathcal {W}_{n+1}(P+Q)}{\mathcal {W}_n(P+Q)}- \dim \frac{\mathcal {W}_{n+1}(P)}{\mathcal {W}_{n}(P)} \le n+1. \end{aligned}$$
  4. (iv)

    If both P and \(P+Q\) are singular, then

    $$\begin{aligned} \left| \dim \frac{\mathcal {W}_{n+1}(P+Q)}{\mathcal {W}_n(P+Q)}- \dim \frac{\mathcal {W}_{n+1}(P)}{\mathcal {W}_{n}(P)}\right| \le n+1. \end{aligned}$$

8 Perturbations of the Kronecker Canonical Form

Recall that every pencil \(P(s)=sE-F\) can be transformed into the Kronecker canonical form, see e.g. [9, 10, 20]. To introduce this form, define for \(k\in {\mathbb {N}}\) the matrices

$$\begin{aligned} N_k:=\begin{bmatrix} 0 & & & \\ 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{bmatrix} \in {\mathbb {C}}^{k\times k}, \end{aligned}$$

and for a multi-index \(\alpha =(\alpha _1,\ldots ,\alpha _l)\in {\mathbb {N}}^l\), \(l\ge 1\), with absolute value \(|\alpha |=\sum _{i=1}^l\alpha _i\) let

$$\begin{aligned} N_{\alpha }:={\text {diag}}(N_{\alpha _1},\ldots ,N_{\alpha _l})\in {\mathbb {C}}^{|\alpha |\times |\alpha |}. \end{aligned}$$

If \(k\ge 1\), the following rectangular matrices are defined as

$$\begin{aligned} K_k:=\begin{bmatrix} 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{bmatrix},\quad L_k:=\begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \end{bmatrix} \in {\mathbb {C}}^{k\times (k+1)}, \end{aligned}$$

and, if \(k=0\),

$$\begin{aligned} K_0 = L_0 := 0_{0\times 1}. \end{aligned}$$

If \(E,F\in {\mathbb {C}}^{d\times d}\), the expression \(0_{0\times 1}\) indicates a zero column \((0,\ldots ,0)^\top \in {\mathbb {C}}^{d\times 1}\) in the matrix (8.1) below, while \(0_{0\times 1}^\top \) indicates a zero row \((0,\ldots ,0)\in {\mathbb {C}}^{1\times d}\) at the corresponding block. In other words, \(0_{0\times 1}\) contributes no rows to (8.1), whereas \(0_{0\times 1}^\top \) contributes no columns. For a multi-index \(\varepsilon =(\varepsilon _1,\ldots ,\varepsilon _l)\in ({\mathbb {N}}\cup \{0\})^l\) we define

$$\begin{aligned} K_{\varepsilon }:={\text {diag}}(K_{\varepsilon _1},\ldots ,K_{\varepsilon _l}),~ L_{\varepsilon }:={\text {diag}}(L_{\varepsilon _1},\ldots ,L_{\varepsilon _l})\in {\mathbb {C}}^{|\varepsilon |\times (|\varepsilon |+l)}. \end{aligned}$$
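The blocks \(N_k\), \(K_k\) and \(L_k\) are shift-type matrices and can be generated directly; a brief sketch (the degenerate case \(K_0=L_0=0_{0\times 1}\) would correspond to an empty \(0\times 1\) array):

```python
import numpy as np

def N_block(k):
    # k x k nilpotent block: ones on the first subdiagonal
    return np.eye(k, k=-1)

def K_block(k):
    # k x (k+1) block with ones on the main diagonal, as in K_k
    return np.eye(k, k + 1)

def L_block(k):
    # k x (k+1) block with ones on the first superdiagonal, as in L_k
    return np.eye(k, k + 1, k=1)

print(N_block(3))
print(K_block(2))
print(L_block(2))
```

The multi-index versions \(N_\alpha \), \(K_\varepsilon \), \(L_\varepsilon \) are then block-diagonal assemblies of these pieces, e.g. via `scipy.linalg.block_diag`.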

According to Kronecker [31], there exist invertible matrices \(S,T\in {\mathbb {C}}^{d\times d}\) such that \(S(sE-F)T\) has a block diagonal form

$$\begin{aligned} \begin{bmatrix} sI_{n_0}-A_0 & 0 & 0 & 0 \\ 0 & sN_\alpha -I_{|\alpha |} & 0 & 0 \\ 0 & 0 & sK_{\varepsilon }-L_{\varepsilon } & 0 \\ 0 & 0 & 0 & sK_{\eta }^\top -L_{\eta }^\top \end{bmatrix} \end{aligned}$$
(8.1)

for some \(A_0\in {\mathbb {C}}^{n_0\times n_0}\) in Jordan canonical form, which is unique up to a permutation of its Jordan blocks, and multi-indices \(\alpha \in {\mathbb {N}}^{n_\alpha }\), \(\varepsilon \in ({\mathbb {N}}\cup \{0\})^{n_\varepsilon }\), \(\eta \in ({\mathbb {N}}\cup \{0\})^{n_{\eta }}\) which are unique up to a permutation of their entries, see also [20, Chapter XII] or [32]. Let \((\sigma _1(\lambda ),\ldots ,\sigma _r(\lambda ))\) denote the sizes, in non-increasing order, of the Jordan blocks associated with an eigenvalue \(\lambda \) of \(A_0\). These numbers are also called the Segre characteristic of the eigenvalue \(\lambda \) of \(A_0\). The numbers \(\alpha _i\), \(i=1,\ldots ,n_\alpha \) are called the infinite elementary divisors of P(s), the numbers \(\varepsilon _i\), \(i=1,\ldots ,n_\varepsilon \) are called the column minimal indices of P(s), and the numbers \(\eta _j\), \(j=1,\dots ,n_\eta \) are known as the row minimal indices of P(s), see e.g. [13, 20]. It is assumed that they are indexed in non-increasing order, i.e.

$$\begin{aligned} \alpha _1\ge \cdots \ge \alpha _{n_\alpha }\ge 1, \quad \varepsilon _1\ge \cdots \ge \varepsilon _{n_\varepsilon }\ge 0 \quad \text {and} \quad \eta _1\ge \cdots \ge \eta _{n_\eta }\ge 0. \end{aligned}$$
(8.2)

The sequences of numbers in (8.2) are also called the Segre characteristics of the infinite elementary divisors, the column minimal indices and the row minimal indices of the pencil P(s). Note that the Segre characteristic in [13] was defined in a slightly different way, namely without the numbers stemming from the minimal indices.

For \(\lambda \in {\mathbb {C}}\) the Weyr characteristic of \(A_0\) is defined for each \(j\in {\mathbb {N}}\) as

$$\begin{aligned} w_j(\lambda )=\#\{i: \sigma _i(\lambda )\ge j\},~ j=1,\ldots ,\sigma _1(\lambda ), \quad w_j(\lambda )=0, ~j>\sigma _1(\lambda ), \end{aligned}$$
(8.3)

i.e., \(w_j(\lambda )\) is the number of Jordan blocks of size at least j of the eigenvalue \(\lambda \) of \(A_0\). If \(\lambda \) is not an eigenvalue of \(A_0\) we define \(w_j(\lambda )=0, ~j\in {\mathbb {N}}\). Note that

$$\begin{aligned} w_j(\lambda )=\dim \frac{N((A_0-\lambda )^j)}{N((A_0-\lambda )^{j-1})}. \end{aligned}$$

In the same way, the Weyr characteristics of the infinite elementary divisors, the column minimal indices and the row minimal indices are defined as the conjugate partitions of \(\alpha \), \(\varepsilon \), and \(\eta \), respectively. For example, if \(\varepsilon _1\ge \cdots \ge \varepsilon _{n_\varepsilon }\ge 0\) are the column minimal indices of P(s), then

$$\begin{aligned} \Delta _j:=\#\{i:\ \varepsilon _i\ge j\}, \qquad j=0,\ldots ,\varepsilon _{1}, \quad \Delta _j=0,~j>\varepsilon _1, \end{aligned}$$
(8.4)

is the Weyr characteristic of the column minimal indices of P(s), i.e., \(\Delta _j\) is the number of column minimal indices of P(s) which are greater than or equal to j. The finite sequences \((\Delta _1, \ldots , \Delta _{\varepsilon _1})\) and \((\varepsilon _1,\ldots , \varepsilon _{n_\varepsilon })\) are conjugate partitions of \(|\varepsilon |\). Note that the Segre characteristics can easily be derived from the Weyr characteristics. For a detailed exposition of the Weyr characteristic of matrices we refer to [43].
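The passage from a Segre characteristic to the corresponding Weyr characteristic is exactly the conjugate-partition operation appearing in (8.3) and (8.4); a brief sketch:

```python
def weyr_from_segre(segre):
    # conjugate partition: w_j = #{i : segre[i] >= j} for j = 1, ..., max entry
    m = max(segre, default=0)
    return [sum(1 for s in segre if s >= j) for j in range(1, m + 1)]

# Segre characteristic (3, 2, 2)  ->  Weyr characteristic (3, 3, 1)
print(weyr_from_segre([3, 2, 2]))   # → [3, 3, 1]
```

Applying the operation twice recovers the original sequence, reflecting that conjugation of partitions is an involution.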

If the Kronecker canonical form of P is given by (8.1), then

$$\begin{aligned} {\text {rank}}(P)=d-n_\varepsilon =d-n_\eta , \end{aligned}$$

i.e. the rank of a pencil is related to the number of column and row minimal indices of P. In what follows we investigate the behavior of the Kronecker canonical form under perturbations. In [13] the unperturbed pencil P does not have full rank, and the perturbation Q is a pencil such that

$$\begin{aligned} {\text {rank}}(P+Q)={\text {rank}}(P) + {\text {rank}}(Q). \end{aligned}$$

This set of perturbations is generic in the sense that it is open and dense in the set of pencils with given size and rank. For such perturbations Q it is shown that the number and the dimensions of the Jordan blocks associated with an eigenvalue increase under perturbations of the above form.

Theorem 7.8 can be interpreted in terms of the Kronecker invariants.

Theorem 8.1

Given a matrix pencil \(P(s)=sE-F\) in Kronecker canonical form (8.1), assume that \(\lambda \in {\mathbb {C}}\) is an eigenvalue of P. Then

$$\begin{aligned} {\mathcal {L}}_{\lambda }^j(P) = N\big ((A_0-\lambda )^j\big ) \oplus \{0\} \oplus N\big ((K_\varepsilon ^{-1}L_\varepsilon -\lambda )^{j}\big )\oplus \{0\}, \end{aligned}$$
(8.5)

where the first \(\{0\}\) in (8.5) is in \(\mathbb {C}^{|\alpha |}\) and the last \(\{0\}\) is in \(\mathbb C^{|\eta |}\).

Note that in Theorem 8.1, \(K_\varepsilon ^{-1}L_\varepsilon \) has to be interpreted as a linear relation.

Proof of Theorem 8.1

First, assume that \((x_n, \ldots ,x_{0})\) in \({\mathbb {C}}^d\) is a Jordan chain at \(\lambda \in {\mathbb {C}}\) for the matrix pencil P. According to Lemma 7.3 it is no restriction to assume \(\lambda =0\). Since \(N_\alpha \) is in block diagonal form it is assumed without restriction that \(\alpha \) has only one entry. Let \(\varepsilon \) and \(\eta \) be multi-indices with k and l zeros. We decompose the vectors \(x_n, \ldots ,x_{0}\) according to the decomposition \({\mathbb {C}}^d={\mathbb {C}}^{n_0+\alpha + (|\varepsilon |+n_\varepsilon ) +|\eta |}= {\mathbb {C}}^{n_0}\oplus {\mathbb {C}}^{\alpha }\oplus \mathbb C^{|\varepsilon |+n_\varepsilon }\oplus {\mathbb {C}}^{|\eta |}\) corresponding to the Kronecker canonical form,

$$\begin{aligned} x_j=(x_{j,1},\ x_{j,2},\ x_{j,3},\ x_{j,4})^\top \in \mathbb C^{n_0+\alpha +(|\varepsilon |+n_\varepsilon )+|\eta |} \quad \text{ for } j=0, \ldots , n. \end{aligned}$$
(8.6)

Consider the third entry of (8.6). By (8.2), \(\varepsilon \) has the form

$$\begin{aligned} \varepsilon =(\varepsilon _{1},\ldots , \varepsilon _{n_\varepsilon -k},0, \ldots , 0), \end{aligned}$$

where \(\varepsilon _j\ge 1\) for \(j=1,\ldots , n_\varepsilon -k\). Then \(K_\varepsilon \) and \(L_\varepsilon \) are of the form

$$\begin{aligned} K_\varepsilon =\begin{bmatrix} K_{\varepsilon _{1}} & & & & 0 & \cdots & 0\\ & K_{\varepsilon _{2}} & & & \vdots & & \vdots \\ & & \ddots & & \vdots & & \vdots \\ & & & K_{\varepsilon _{n_\varepsilon -k}} & 0 & \cdots & 0 \end{bmatrix} \end{aligned}$$

and

$$\begin{aligned} L_\varepsilon =\begin{bmatrix} L_{\varepsilon _{1}} & & & & 0 & \cdots & 0\\ & L_{\varepsilon _{2}} & & & \vdots & & \vdots \\ & & \ddots & & \vdots & & \vdots \\ & & & L_{\varepsilon _{n_\varepsilon -k}} & 0 & \cdots & 0 \end{bmatrix} \end{aligned}$$

where the last k columns in \(K_\varepsilon \) and in \(L_\varepsilon \) consist of zeros only. Hence, for \(i\in {\mathbb {N}}\),

$$\begin{aligned} N\big ((K_\varepsilon ^{-1}L_\varepsilon )^{i}\big ) = \left( \bigoplus _{j=1}^{n_\varepsilon -k} N\big ((K_{\varepsilon _j}^{-1}L_{\varepsilon _j})^{i}\big )\right) \oplus \mathbb C^k, \end{aligned}$$

and for the third entry of (8.6) one finds

$$\begin{aligned} x_{j,3} \in N\big ((K_\varepsilon ^{-1}L_\varepsilon )^{j+1}\big ) \qquad \text {for } j=0,\ldots ,n. \end{aligned}$$

This shows that it is sufficient to consider the case that \(n_\varepsilon =1\).

Now the fourth entry of (8.6) is considered. By (8.2), \(\eta \) has the form

$$\begin{aligned} \eta =(\eta _{1},\ldots , \eta _{n_\eta -l},0, \ldots , 0), \end{aligned}$$

where \(\eta _j\ge 1\) for \(j=1,\ldots , n_\eta -l\). Thus \(K_\eta ^\top \) and \(L_\eta ^\top \) are of the form

$$\begin{aligned} K_\eta ^\top =\begin{bmatrix} K_{\eta _{1}}^\top & & \\ & \ddots & \\ & & K_{\eta _{n_\eta -l}}^\top \\ 0 & \cdots & 0\\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{bmatrix},\quad L_\eta ^\top =\begin{bmatrix} L_{\eta _{1}}^\top & & \\ & \ddots & \\ & & L_{\eta _{n_\eta -l}}^\top \\ 0 & \cdots & 0\\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{bmatrix}, \end{aligned}$$

where the last l rows in \(K_\eta ^\top \) and in \(L_\eta ^\top \) consist of zeros only. In order to show that the vectors \(x_{0,4}, \ldots , x_{n,4}\) are zero, it remains to consider the case that \(n_\eta =1\).

The considerations above show that we may restrict ourselves to the case \(n_\alpha =n_\varepsilon =n_\eta =1\). Hence, in (8.1) we have \(\alpha , \varepsilon , \eta \in {\mathbb {N}}\) with

$$\begin{aligned} \alpha \ge 1,\quad \varepsilon \ge 1, \text{ and } \eta \ge 1. \end{aligned}$$

Since \((x_n, \ldots ,x_{0})\) is a Jordan chain of P at \(\lambda =0\), the following equations are satisfied for \(j=1,\ldots ,n\):

$$\begin{aligned} A_0x_{0,1}=0,\qquad A_0x_{j,1}=x_{j-1,1}, \end{aligned}$$
(8.7)
$$\begin{aligned} I_{\alpha }x_{0,2}=0,\qquad I_{\alpha }x_{j,2}= N_\alpha x_{j-1,2}, \end{aligned}$$
(8.8)
$$\begin{aligned} L_\varepsilon x_{0,3}=0,\qquad L_\varepsilon x_{j,3}= K_\varepsilon x_{j-1,3}, \end{aligned}$$
(8.9)
$$\begin{aligned} L_\eta ^\top x_{0,4}=0,\qquad L_\eta ^\top x_{j,4}= K_\eta ^\top x_{j-1,4}. \end{aligned}$$
(8.10)

Thus, by (8.8), the vectors \(x_{0,2}, \ldots , x_{n,2}\) are zero. Similarly, by (8.10), the vectors \(x_{0,4}, \ldots , x_{n,4}\) are zero. Equation (8.7) shows that \((x_{n,1}, \ldots ,x_{0,1})\) is a Jordan chain at zero for the matrix \(A_0\). Finally, (8.9) for \(j=0,\ldots , n\) gives

$$\begin{aligned} x_{j,3} \in N\big ((K_\varepsilon ^{-1}L_\varepsilon )^{j+1}\big ). \end{aligned}$$

This shows that every vector in the chain \((x_n, \ldots ,x_{0})\) is an element of the right-hand side of (8.5). Therefore, \({\mathcal {L}}_\lambda ^n(P)\) is contained in the right-hand side of (8.5).

Conversely, let \(x_n\) be an element of the right-hand side of (8.5) with \(j=n+1\). Decomposing \(x_n\) as in (8.6), we have \(x_n=(x_{n,1},\ 0,\ x_{n,3},\ 0)^\top \) with

$$\begin{aligned} x_{n,1} \in N\big (A_0^{n+1}\big ) \quad \text{ and } \quad x_{n,3} \in N\big ((K_\varepsilon ^{-1}L_\varepsilon )^{n+1}\big ). \end{aligned}$$

Therefore, for each \(i=0,\ldots ,n-1\) there exist vectors \(x_{i,1}\) and \(x_{i,3}\) which satisfy equations (8.7) and (8.9). For \(i=0,\ldots , n-1\), set

$$\begin{aligned} x_i:=(x_{i,1},\ 0,\ x_{i,3},\ 0)^\top . \end{aligned}$$

From this, it is easy to see that \((x_n, \ldots ,x_{0})\) is a Jordan chain (at \(\lambda =0\)) for the matrix pencil P. In particular, \(x_n\in {\mathcal {L}}_\lambda ^{n+1}(P)\). \(\square \)

Using the above result, we present an alternative version of Theorem 7.8 in terms of the Weyr characteristics of the Kronecker canonical form. For simplicity, we state it here only for finite eigenvalues \(\lambda \). A similar statement can be shown for \(\lambda = \infty \) applying Corollary 7.5.

Theorem 8.2

Let \(\lambda \in {\mathbb {C}}\) and \(n\in {\mathbb {N}}\cup \{0\}\). Given \(P(s)=sE-F\) in \({\mathbb {C}}^{d\times d}\), let Q be a rank-one matrix pencil as in (7.1). Assume that \({\widetilde{A}}_0\) and \(A_0\) are the matrices in Jordan canonical form appearing in the Kronecker canonical forms (8.1) of P and \(P+Q\), respectively. Denote by \(w_n(\lambda )\) and \({\widetilde{w}}_n(\lambda )\) the Weyr characteristics of \(A_0\) and \({\widetilde{A}}_0\) according to (8.3), and by \(\Delta _n\) and \({\widetilde{\Delta }}_n\) the Weyr characteristics of the column minimal indices of P and \(P+Q\) according to (8.4). Then the following statements hold:

  1. (i)

    If both pencils P and \(P+Q\) are regular, then \(\Delta _n={\widetilde{\Delta }}_n=0\) and

    $$\begin{aligned} \left| {\widetilde{w}}_{n+1}(\lambda )-w_{n+1}(\lambda )\right| \le 1. \end{aligned}$$
  2. (ii)

    If P is regular and \(P+Q\) is singular, then \(\Delta _n=0\) and

    $$\begin{aligned} -1-n \le {\widetilde{w}}_{n+1}(\lambda )-w_{n+1}(\lambda ) \le 1 + {\widetilde{\Delta }}_n. \end{aligned}$$
  3. (iii)

    If P is singular and \(P+Q\) is regular, then \({\widetilde{\Delta }}_n=0\) and

    $$\begin{aligned} -1 -\Delta _n \le {\widetilde{w}}_{n+1}(\lambda )-w_{n+1}(\lambda )\le n+1. \end{aligned}$$
  4. (iv)

    If both P and \(P+Q\) are singular, then

    $$\begin{aligned} \left| {\widetilde{w}}_{n+1}(\lambda )-w_{n+1}(\lambda ) + {\widetilde{\Delta }}_n -\Delta _n \right| \le n+1. \end{aligned}$$

Proof

Note that if S and T are invertible matrices and \((x_n,\ldots , x_0)\) is a Jordan chain of some pencil P(s), then Definition 7.1 immediately implies that \((T^{-1}x_n,\ldots , T^{-1}x_0)\) is a Jordan chain of the pencil \({\hat{P}}(s)=SP(s)T\). Hence \(\dim {\mathcal {L}_{\lambda }^n(P)}=\dim {\mathcal {L}_{\lambda }^n({\hat{P}})}\) for all \(n\in {\mathbb {N}}\cup \{0\}\). According to Lemma 7.3, if \(\lambda \in {\mathbb {C}}\) we may assume \(\lambda = 0\). As a consequence of Theorem 8.1,

$$\begin{aligned} \dim \frac{\mathcal {L}_{0}^{n+1}(P)}{\mathcal {L}_{0}^n(P)}= \dim \frac{N (A_0^{n+1})}{N(A_0^n)} +\dim \frac{N\big ((K_\varepsilon ^{-1}L_\varepsilon )^{n+1}\big )}{N\big ((K_\varepsilon ^{-1}L_\varepsilon )^{n}\big )}, \end{aligned}$$

and it is straightforward to see that

$$\begin{aligned} \dim \frac{N\big ((K_\varepsilon ^{-1}L_\varepsilon )^{n+1}\big )}{N\big ((K_\varepsilon ^{-1}L_\varepsilon )^{n}\big )} = \Delta _n. \end{aligned}$$

The same holds for the pencil \(P+Q\), and we obtain

$$\begin{aligned} \dim \frac{\mathcal {L}_{0}^{n+1}(P+Q)}{\mathcal {L}_{0}^n(P+Q)}= \dim \frac{N ({\widetilde{A}}_0^{n+1})}{N({\widetilde{A}}_0^n)} +{\widetilde{\Delta }}_n. \end{aligned}$$

Then, the result follows immediately from Theorem 7.8. \(\square \)

Finally, we compare the above result with Section 4 in [13]. A particular case of Lemma 4.2 in [13] can be restated in the following way. Given a matrix pencil P(s) in \({\mathbb {C}}^{d\times d}\) with \({\text {rank}}(P)=r\), assume that \(\lambda \) is an eigenvalue of P with partial multiplicities \(0\le m_1\le \ldots \le m_r\). Let Q(s) be a matrix pencil in \({\mathbb {C}}^{d\times d}\) with \({\text {rank}}(Q)=1\) and let m be the partial multiplicity of \(\lambda \) relative to Q (m can also be zero). If \({\text {rank}}(P+Q)=r+1\) and \(m_i<m\le m_{i+1}\) for some \(i=0,1,\ldots ,r\) (where \(m_0=-1\) and \(m_{r+1}=\infty )\), then the partial multiplicities \(0\le m'_1\le \ldots \le m'_{r+1}\) of \(\lambda \) relative to \(P+Q\) satisfy

$$\begin{aligned} m'_1=m_1,\ \ldots ,\ m'_i=m_i, \quad m'_{i+1}\ge m, \quad m'_{i+2}\ge m_{i+1},\ \ldots ,\ m'_{r+1}\ge m_r. \end{aligned}$$
(8.11)

Given matrix pencils P and Q as in Theorem 8.2, in order to satisfy the conditions of item (iii) it is necessary that \({\text {rank}}(P)=d-1\) and \({\text {rank}}(P+Q)=d\). Hence, the hypothesis \({\text {rank}}(P+Q)={\text {rank}}(P) + {\text {rank}}(Q)\) is fulfilled and Lemma 4.2 in [13] provides a better estimate. Reversing the roles of P and \(P+Q\), the same happens with item (ii).

However, if both P and \(P+Q\) are singular, the result in item (iv) holds independently of the hypothesis \({\text {rank}}(P+Q)={\text {rank}}(P) + {\text {rank}}(Q)\). Therefore, Theorem 8.2 gives new information in the case that \({\text {rank}}(P+Q)\ne {\text {rank}}(P) + {\text {rank}}(Q)\).