Improved spectral convergence rates for graph Laplacians on ε-graphs and k-NN graphs

https://doi.org/10.1016/j.acha.2022.02.004

Abstract

In this paper we improve the spectral convergence rates for graph-based approximations of weighted Laplace-Beltrami operators constructed from random data. We utilize regularity of the continuum eigenfunctions and strong pointwise consistency results to prove that spectral convergence rates are the same as the pointwise consistency rates for graph Laplacians. In particular, for an optimal choice of the graph connectivity ε, our results show that the eigenvalues and eigenvectors of the graph Laplacian converge to those of a weighted Laplace-Beltrami operator at a rate of $O(n^{-1/(m+4)})$, up to log factors, where m is the manifold dimension and n is the number of vertices in the graph. Our approach is general and allows us to analyze a large variety of graph constructions that include ε-graphs and k-NN graphs. We also present the results of numerical experiments analyzing convergence rates on the two-dimensional sphere.

Introduction

Our work is motivated by applications in machine learning, statistics, and artificial intelligence. There, the goal is to learn structure from a given data set $X = \{x_1, \ldots, x_n\}$. To do this, several authors have proposed the use of graphs to endow data sets with some geometric structure, and have utilized graph Laplacians to understand how information propagates on the graph representing the data. Graph Laplacians and their spectra form the basis of algorithms for supervised learning [1], [4], [48], [56], clustering [41], [51] and dimensionality reduction [2], [16]. The works [43], [49], [53] discuss Laplacian regularization in the context of non-parametric regression. Bayesian approaches to learning, where graph Laplacians are used to define covariance matrices for Gaussian priors, have been proposed in [5], [36], [57].

To better understand algorithms based on graph Laplacians, it has proven useful to study the large sample size asymptotics of graph Laplacians when these capture the closeness of data points in Euclidean space, as is the case in constructions such as ε-graphs or k-NN graphs. In this limit, we pass from discrete graph Laplacians to continuum Laplace-Beltrami operators, or weighted versions thereof; in particular, graph Laplacians are seen as specific discretizations of continuum operators. By analyzing the passage to the limit one effectively studies the consistency of algorithms that utilize said operators. In doing so, one gathers information about allowed choices of parameters and gains insight into the computational stability of algorithms (e.g. [24], [34], [35]). Naturally, in order for the "passage to the continuum" to imply any sort of consistency for a particular machine learning algorithm, it is important to study the convergence in an appropriate sense.

Early work on consistency of graph Laplacians focused on pointwise consistency results for ε-graphs (see, for example, [3], [30], [32], [33], [45], [50]). There, as well as hereinafter, the data is assumed to be an i.i.d. sample of size n from a ground truth measure μ supported on an m-dimensional submanifold M embedded in a high dimensional Euclidean space $\mathbb{R}^d$ (i.e., the manifold assumption [14]), and pairs of points that are within distance ε of each other are given high weights. Pointwise consistency results show that as $n \to \infty$ and the connectivity parameter $\varepsilon \to 0$ (at a slow enough rate), the graph Laplacian applied to a fixed smooth test function converges to a continuum operator, such as a weighted Laplace-Beltrami operator applied to the test function. Recent work is moving beyond pointwise consistency and studying the sequence of solutions to graph-based learning problems and their continuum limits, using tools like Γ-convergence [11], [18], [28], [47], tools from PDE theory [8], [9], [11], [19], [25], [54] including the maximum principle and viscosity solutions, and more recently random walk and martingale methods [12]. Regarding spectral convergence of graph Laplacians, the regime where $n \to \infty$ with ε held constant was studied in [52], and in [46], which analyzes connection Laplacians. Works that have studied regimes where ε is allowed to decay to zero include [27], [44], [7], and [22].

The starting point for our work is the paper [22], which used ideas from [7] in order to obtain what are, to the best of our knowledge, the state-of-the-art results on spectral convergence of ε-graph Laplacians. These results can be summarized as follows. With very high probability, the error of approximation of the eigenvalues of a continuum elliptic differential operator by the eigenvalues of their graph Laplacian counterpart scales like
$$\varepsilon + \frac{\log(n)^{p_m}}{n^{1/m}\,\varepsilon},$$
where $p_m = 1/m$ for $m \ge 3$, $p_2 = 3/4$, and ε is the length scale for the graph construction. These results suggested that the best rate of convergence is achieved when ε is chosen to scale like $\big(\log(n)^{p_m}/n^{1/m}\big)^{1/2}$, in which case the convergence rate for eigenvalues is $O(n^{-1/(2m)})$, up to log factors. For eigenvectors, the error of approximation in the $L^2$ norm was shown to scale like the square root of the convergence rate of eigenvalues, so $O(n^{-1/(4m)})$ up to log factors. In this paper, we improve in several regards the results presented in [22]. Our contributions to the analysis of spectral convergence of graph Laplacians constructed from ε-graphs are summarized as follows:

  • (1)

In the ε-graph setting, we show that the eigenvalues of the graph Laplacian converge (with rates), provided that ε scales like
$$\Big(\frac{\log(n)}{n}\Big)^{1/m} \ll \varepsilon \ll 1.$$
This result is valid for all $m \ge 1$. This improves the results in [22] by removing an additional logarithmic term. In a sense, the lower bound on the allowed values of ε for the convergence to hold is an optimal requirement, due to the connectivity threshold results for random geometric graphs [42].

  • (2)

In the ε-graph case, when ε scales like
$$C\Big(\frac{\log(n)}{n}\Big)^{\frac{1}{m+4}} \le \varepsilon \ll 1,$$
we show that the rate of convergence of eigenvalues coincides with the pointwise convergence rates of the graph Laplacian (e.g. [32]), and in particular, with high probability, scales linearly in the connectivity length-scale ε. If we choose $\varepsilon = C(\log(n)/n)^{1/(m+4)}$, then we obtain convergence rates of $O(n^{-1/(m+4)})$, up to log factors, which is sharper than the $O(n^{-1/(2m)})$ convergence rate from [22] when $m \ge 5$ (see the short computation at the end of this list).

  • (3)

We establish convergence rates for eigenfunctions under $L^2$-type distances that will be made explicit later on. In particular, in the same regime for ε given in (1.1), we establish that the rate of convergence of eigenvectors scales linearly in ε, matching the convergence rate of eigenvalues as well as the pointwise convergence rates. Thus, choosing again $\varepsilon = C(\log(n)/n)^{1/(m+4)}$, we obtain convergence rates for eigenvectors of $O(n^{-1/(m+4)})$, which is far sharper than the $O(n^{-1/(4m)})$ convergence rates from [22].
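To see where these scalings come from, here is a short back-of-the-envelope computation (our gloss, not a statement from [22] or from the results above). Balancing the two terms in the error bound of [22] recovers the old optimal scaling and rate:
$$\varepsilon \;\asymp\; \frac{\log(n)^{p_m}}{n^{1/m}\,\varepsilon} \quad\Longrightarrow\quad \varepsilon \;\asymp\; \Big(\frac{\log(n)^{p_m}}{n^{1/m}}\Big)^{1/2} \quad\Longrightarrow\quad \text{error} \;\asymp\; n^{-1/(2m)} \ \text{up to log factors,}$$
while comparing the exponents of the new and old rates explains the threshold $m \ge 5$ in (2):
$$\frac{1}{m+4} > \frac{1}{2m} \;\iff\; 2m > m+4 \;\iff\; m > 4.$$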

A second main contribution of our work is to provide spectral consistency results for graph Laplacians constructed from k-NN graphs. Our work is the first to obtain any rates of convergence in such a setting. Moreover, in proving the spectral convergence we also obtain rates for pointwise convergence which, to the best of our knowledge, are also new in the literature. We are aware of very few works that have rigorously addressed consistency for graph Laplacians associated to k-NN graphs. In [50], pointwise convergence is analyzed (without providing any rates). In [21], asymptotic spectral convergence is discussed, but no rates are provided. In [19], pointwise consistency with rates is established for the game-theoretic p-Laplacian on k-NN graphs. Work subsequent to the first version of this paper, such as [15], has considered more general normalizations for k-NN graphs in order to induce different ways in which data density affects the behavior of data analysis algorithms.

In practical applications, k-NN graphs are almost always preferred over ε-graphs, due to their far better sparsity and connectivity properties (see, e.g., [10], [19] for semi-supervised learning, and [55] for spectral clustering). Since the k-nearest neighbor relation is not symmetric, k-NN graphs are normally symmetrized in order to ensure the graph Laplacian is self-adjoint and the spectrum real-valued. On a symmetrized k-NN graph, the local neighborhood is no longer a Euclidean or geodesic ball, and is in fact not even symmetric. This raises technical difficulties in obtaining pointwise consistency results with rates, and makes the analysis far more involved than it is for ε-graphs.
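To make the symmetrization concrete, the following is a minimal sketch in Python (our illustration with unit edge weights, not the paper's construction; the k-NN Laplacian actually analyzed, see (2.9), carries a specific normalization that we omit here):

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.spatial import cKDTree

def symmetric_knn_laplacian(X, k):
    """Unnormalized Laplacian L = D - W on a symmetrized k-NN graph.

    X: (n, d) array of data points; k: number of nearest neighbors.
    The (asymmetric) k-NN relation is symmetrized with the union (OR)
    rule: i ~ j if i is among j's k nearest neighbors or vice versa.
    """
    n = X.shape[0]
    tree = cKDTree(X)
    _, idx = tree.query(X, k=k + 1)  # k+1: the nearest neighbor of x_i is x_i
    rows = np.repeat(np.arange(n), k)
    cols = idx[:, 1:].ravel()        # drop the self-neighbor column
    W = csr_matrix((np.ones(n * k), (rows, cols)), shape=(n, n))
    W = W.maximum(W.T)               # union symmetrization: W is now symmetric
    deg = np.asarray(W.sum(axis=1)).ravel()
    return diags(deg) - W            # symmetric and PSD, hence real spectrum
```

Replacing `W.maximum(W.T)` with `W.minimum(W.T)` gives the mutual (AND) k-NN graph; either symmetrization yields a self-adjoint graph Laplacian.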

Our contributions in this setting are as follows:

  • (1)

We provide spectral convergence rates for graph Laplacians when the graph is a k-NN graph, provided k scales like
$$\log(n) \ll k \ll n.$$
This result is valid for all $m \ge 1$. Moreover, we show that the rates of convergence coincide with the pointwise convergence rates (see Theorem 3.7 below) when k scales like
$$C\,\log(n)^{\frac{m}{m+4}}\, n^{\frac{4}{m+4}} \le k \ll n.$$

  • (2)

We establish convergence rates for eigenvectors under different topologies of interest that will be discussed later on. Moreover, in the regime
$$C\,\log(n)^{\frac{m}{m+4}}\, n^{\frac{4}{m+4}} \le k \ll n,$$
the convergence rate for eigenvectors coincides with the convergence rates of eigenvalues and also with the pointwise convergence rates from Theorem 3.7. (A heuristic for this scaling of k is sketched just after this list.)
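A quick heuristic for the scaling of k (our informal gloss; the rigorous statements are in Section 2): in a k-NN graph the effective connectivity length scale ε satisfies $k \approx n\,\varepsilon^m$, since a geodesic ball of radius ε contains on the order of $n\varepsilon^m$ of the n sample points. Substituting the optimal ε-graph scaling then reproduces the lower end of the regime above:
$$\varepsilon \sim \Big(\frac{\log(n)}{n}\Big)^{\frac{1}{m+4}} \quad\Longrightarrow\quad k \sim n\,\varepsilon^m \sim \log(n)^{\frac{m}{m+4}}\, n^{\frac{4}{m+4}}.$$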

It is worth mentioning that all our estimates hold with high probability for finite (although possibly large) n. These results imply a quantitative improvement to a large body of work that has built on previous spectral convergence results. For example, the works [22], [23], [26] benefit directly from our new estimates.

There are two essential steps in our analysis that allow us to improve in several regards the rates presented in [22] for the ε-graph case. In the first step, we use a simple modification of the construction of discretization and interpolation maps introduced in [7] (a construction that was later used in [22], though cast in the language of optimal transport), in order to prove spectral convergence (with rates) for a wider range of scalings of ε, valid for all dimensions $m \ge 2$. A more detailed outline of the construction of these maps, and a discussion of what needs to be adjusted from [22], is given in Section 2.4 below.

The second step in our analysis makes use of a simple argument for comparing eigenvalues of different self-adjoint operators. To illustrate the idea, let $A, B : H \to H$ be linear operators on a Hilbert space H, with A self-adjoint. Let u be an eigenfunction of A with eigenvalue $\lambda_u$, and let w be an eigenfunction of B with eigenvalue $\lambda_w$. We may assume $\|u\|_H = \|w\|_H = 1$. Since A is self-adjoint,
$$\lambda_u \langle u, w \rangle_H = \langle Au, w \rangle_H = \langle u, Aw \rangle_H = \lambda_w \langle u, w \rangle_H + \langle u, (A - B) w \rangle_H,$$
and thus
$$|\lambda_u - \lambda_w| \le \frac{\|Aw - Bw\|_H}{|\langle u, w \rangle_H|}.$$
This inequality allows us to convert pointwise estimates on $\|Aw - Bw\|_H$ into estimates on the spectrum, provided $\langle u, w \rangle_H$ is bounded away from zero. For graph Laplacians, A should, say, represent the graph Laplacian, while B represents the continuum (weighted) Laplace-Beltrami operator (or, more accurately, its restriction to the graph). The key ingredients in our proof are good pointwise estimates, which rely essentially on the regularity of the continuum eigenfunctions, and the a priori eigenfunction convergence rate from the first step of our analysis, which ensures $\langle u, w \rangle_H$ is bounded away from zero. The bottom line is that our a priori (non-optimal) spectral convergence rates can be bootstrapped to make them coincide with the pointwise consistency rates, provided we are willing to shrink the allowed asymptotic scaling for ε slightly. The consistency of eigenfunctions will be a consequence of the a priori convergence rate for eigenvalues. This is made explicit by following some of the steps in the proof of the classical Davis-Kahan theorem.
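As a quick sanity check of this inequality, here is a standalone numerical sketch (our illustration, with two random symmetric matrices standing in for the graph and continuum operators; only the self-adjointness of A is actually used in the argument):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# A and B: two nearby symmetric (hence self-adjoint) operators on R^n.
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
E = rng.standard_normal((n, n)); E = (E + E.T) / 2
B = A + 1e-3 * E

lam_A, U = np.linalg.eigh(A)    # ascending eigenvalues, unit eigenvectors
lam_B, V = np.linalg.eigh(B)

k = 0                           # compare the k-th eigenpair of each operator
u, w = U[:, k], V[:, k]
lhs = abs(lam_A[k] - lam_B[k])
rhs = np.linalg.norm(A @ w - B @ w) / abs(u @ w)
assert lhs <= rhs + 1e-12       # |lam_u - lam_w| <= ||(A-B)w|| / |<u, w>|
print(f"{lhs:.3e} <= {rhs:.3e}")
```

For a small perturbation the eigenvectors are close, so $\langle u, w \rangle$ is bounded away from zero and the bound is meaningful; as the perturbation grows, $|\langle u, w \rangle|$ can degenerate and the bound becomes vacuous, which is exactly why the a priori eigenvector convergence rate from the first step is needed.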

Regarding the results for k-NN graphs, we first notice that these types of graphs can intuitively be thought of as ε-graphs where ε is allowed to vary in space. Given the inhomogeneity of the natural length scale ε (which intuitively is influenced by the data density), the first part of our analysis must rely on the definition of new discretization and interpolation maps that are tailored to the inhomogeneous length-scale setting. After a careful analysis, we are able to provide a priori spectral convergence rates analogous to the a priori rates obtained for ε-graphs. These non-optimal rates can then be bootstrapped and improved just as in the ε-graph case, using the pointwise consistency results that we derive in Theorem 3.7. We note that the pointwise consistency results for graph Laplacians on k-NN graphs do not follow directly from viewing the graph as an ε-graph with ε varying in space. Indeed, looking forward to the proof of Theorem 3.7, the local neighborhood on a mutual (or exclusive) k-NN graph is asymptotically non-symmetric, due to non-uniformity of the data distribution, and so pointwise consistency for k-NN graph Laplacians requires a far more careful analysis than for ε-graphs, where the local neighborhoods are balls.

The rest of the paper is organized as follows. In Section 2 we give the precise set-up used throughout the paper, state our assumptions, and present our main results. Specifically, Section 2.1 contains the precise definitions of the graph constructions that we study. In Section 2.2 we state our main results regarding convergence of eigenvalues for both ε-graphs and k-NN graphs, and in Section 2.3 we present the results regarding convergence of eigenvectors. In Section 2.4 we provide an outline of our proofs. In Section 3 we present the pointwise consistency results for graph Laplacians which will be needed later on. In Section 4 we present the proofs of our main results; more specifically, in Section 4.1 we present the analysis for the ε-graph case, and in Section 4.2 for k-NN graphs. In Section 5 we discuss other modes of convergence for eigenvectors, in particular $TL^2$-convergence, which implies Wasserstein convergence of Laplacian embeddings. In Section 6 we present the results of numerical experiments on the convergence of k-NN and ε-graph Laplacian spectra to the spherical harmonics on the two-dimensional sphere, and we conclude in Section 7.

Section snippets

Set-up and main results

Let M be a compact, connected, orientable, smooth m-dimensional manifold embedded in $\mathbb{R}^d$. We give to M the Riemannian structure induced by the ambient space $\mathbb{R}^d$. With respect to the induced metric tensor, we let $\mathrm{Vol}_M$ be M's volume form, and we let μ be a probability measure supported on M with density (with respect to the volume form) $\rho : M \to (0, \infty)$, which we assume is bounded and bounded away from zero, i.e.
$$0 < \rho_{\min} \le \rho \le \rho_{\max} < \infty,$$
and is at least $C^{2,\vartheta}(M)$, which here should be interpreted as saying that ρ,

Pointwise consistency of graph Laplacians

In this section we prove pointwise consistency with a linear rate for our two constructions of the graph Laplacian. The ε-graph Laplacian is considered in Section 3.3, while the undirected k-NN Laplacian is considered in Section 3.4. For the ε-graph Laplacian, the linear rate was established earlier in [32]; we give a simpler proof for completeness. Consistency for k-NN graph Laplacians was studied in [50], but the methods used were unable to establish any convergence rates.

Before giving the

Proofs of main results

Here we prove all of our main results. The structure of the proofs is exactly the same for the ε-graph and the k-NN settings.

Convergence of eigenvectors in $TL^2$ and convergence of graph Laplacian embeddings

To establish Theorem 2.10 it will be convenient to recall the definition of the $TL^2$ space presented in [27]. Here we consider $\mathbb{R}^L$-valued functions.

We define the set
$$TL^2(M; \mathbb{R}^L) := \big\{ (\gamma, H) : \gamma \in \mathcal{P}(M), \ H \in L^2(\gamma; \mathbb{R}^L) \big\},$$
and the metric
$$d_{TL^2}\big((\gamma, H), (\tilde{\gamma}, \tilde{H})\big)^2 := \min_{\pi \in \Gamma(\gamma, \tilde{\gamma})} \int_{M \times M} d_M(x, y)^2 \, d\pi(x, y) + \int_{M \times M} |H(x) - \tilde{H}(y)|^2 \, d\pi(x, y).$$
In the above, $\mathcal{P}(M)$ denotes the space of Borel probability measures on M, and for $\gamma \in \mathcal{P}(M)$, $L^2(\gamma; \mathbb{R}^L)$ denotes the space of $\mathbb{R}^L$-valued $L^2$ functions with respect to γ. $\Gamma(\gamma, \tilde{\gamma})$ denotes the set of couplings or
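For concreteness, here is a minimal sketch (our illustration, not code from the paper) of the $TL^2$ distance between two empirical measures with the same number of points and uniform weights; in this case the minimization over couplings reduces to an optimal assignment, and we use the Euclidean distance as a stand-in for the geodesic distance $d_M$:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def tl2_empirical(X, F, Y, G):
    """TL2 distance between (mu_X, F) and (mu_Y, G).

    mu_X, mu_Y: uniform empirical measures on the rows of X and Y
    (both of shape (n, d)); F[i], G[j] in R^L are function values at
    X[i], Y[j]. With equal uniform weights the optimal coupling is a
    permutation, so the min over couplings is a linear assignment.
    """
    # Pairwise cost c_ij = |x_i - y_j|^2 + |F_i - G_j|^2.
    cost = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2) \
         + ((F[:, None, :] - G[None, :, :]) ** 2).sum(axis=2)
    rows, cols = linear_sum_assignment(cost)   # optimal permutation
    return np.sqrt(cost[rows, cols].mean())    # mean = integral against pi
```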

Numerical experiments

To test the convergence rates in our main results, we ran some numerical experiments on the two-dimensional sphere. The density is thus $\rho = 1/((m+1)\alpha_{m+1})$, where $\alpha_{m+1}$ denotes the volume of the unit ball in $\mathbb{R}^{m+1}$. The experiments used $n = 2^{12}$ up to $n = 2^{17} = 131072$ points independently and uniformly distributed on the sphere, with the errors averaged over 100 trials. For the k-NN graph Laplacian (see (2.9)), we set $k = n^{\frac{4}{m+4}}$, as required in (1.2) for our main
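The following standalone sketch (our illustration, not the paper's experiment code) shows the shape of such an experiment for an ε-graph on the sphere; to sidestep the kernel-dependent normalization constant of the graph Laplacian, it compares eigenvalue ratios, which for the Laplace-Beltrami operator on $S^2$ are determined by the spectrum $\ell(\ell+1)$ with multiplicity $2\ell+1$:

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
n, m = 2**12, 2
X = rng.standard_normal((n, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # uniform points on S^2

eps = (np.log(n) / n) ** (1 / (m + 4))          # optimal-rate scaling
tree = cKDTree(X)
pairs = tree.query_pairs(eps, output_type='ndarray')
rows = np.concatenate([pairs[:, 0], pairs[:, 1]])
cols = np.concatenate([pairs[:, 1], pairs[:, 0]])
W = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
L = diags(np.asarray(W.sum(axis=1)).ravel()) - W   # unnormalized Laplacian

# Ten smallest eigenvalues via shift-invert (L + 0.01 I is positive definite).
vals = np.sort(eigsh(L, k=10, sigma=-0.01, which='LM',
                     return_eigenvectors=False))
# Continuum spectrum on S^2 (up to one normalizing constant): 0, 2 (x3), 6 (x5), ...
print("lambda_4 / lambda_1 =", vals[4] / vals[1], " (continuum: 6/2 = 3)")
```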

Conclusion

In this paper we have obtained new results on the spectral convergence of graph Laplacian operators built from random data towards weighted Laplace-Beltrami operators on smooth compact manifolds without boundary. Our results contribute to the growing manifold learning literature in several regards. First, we improve existing spectral convergence rates for Laplacians based on ε-graphs in the regime $(\log(n)/n)^{1/(m+4)} \ll \varepsilon \ll 1$, showing that the spectral convergence rate scales like ε with very high

References (57)

  • N. García Trillos et al., A variational approach to the consistency of spectral clustering, Appl. Comput. Harmon. Anal. (2018)
  • M. Maier et al., Optimal construction of k-nearest-neighbor graphs for identifying noisy clusters, Theor. Comput. Sci. (2009)
  • A. Singer, From graph to manifold Laplacian: the convergence rate, Appl. Comput. Harmon. Anal. (2006)
  • R.K. Ando et al., Learning on graph with Laplacian regularization
  • M. Belkin et al., Laplacian eigenmaps and spectral techniques for embedding and clustering
  • M. Belkin et al., Towards a theoretical foundation for Laplacian-based manifold methods
  • M. Belkin et al., Manifold regularization: a geometric framework for learning from labeled and unlabeled examples, J. Mach. Learn. Res. (2006)
  • A. Bertozzi et al., Uncertainty quantification in graph-based classification of high dimensional data, SIAM/ASA J. Uncertain. Quantificat. (2018)
  • S. Boucheron et al., Concentration Inequalities: A Nonasymptotic Theory of Independence (2013)
  • D. Burago et al., A graph discretization of the Laplace-Beltrami operator, J. Spectr. Theory (2014)
  • J. Calder, The game theoretic p-Laplacian and semi-supervised learning with few labels, Nonlinearity (2018)
  • J. Calder, Consistency of Lipschitz learning with infinite unlabeled data and finite labeled data, SIAM J. Math. Data Sci. (2019)
  • J. Calder et al., Poisson learning: graph based semi-supervised learning at very low label rates
  • J. Calder et al., Properly-weighted graph Laplacian for semi-supervised learning, Appl. Math. Optim. (Special Issue on Optimization in Data Science) (2019)
  • J. Calder et al., Rates of convergence for Laplacian semi-supervised learning with low labelling rates
  • M. Caroccia et al., Mumford-Shah functionals on graphs and their asymptotics
  • O. Chapelle et al., Semi-Supervised Learning (2006)
  • X. Cheng et al., Convergence of Graph Laplacian with knn Self-Tuned Kernels (2020)
  • R.R. Coifman et al., Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps, Proc. Natl. Acad. Sci. (2005)
  • M.P. do Carmo, Riemannian Geometry (1992)
  • M.M. Dunlop et al., Large data and zero noise limits of graph-based semi-supervised learning algorithms, Appl. Comput. Harmon. Anal. (2019)
  • M. Flores et al., Algorithms for Lp-based semi-supervised learning on graphs
  • N. Fournier et al., On the rate of convergence in Wasserstein distance of the empirical measure, Probab. Theory Relat. Fields (2015)
  • N. García Trillos, Variational limits of k-nn graph-based functionals on data clouds, SIAM J. Math. Data Sci. (2019)
  • N. García Trillos et al., Spectral convergence of the graph Laplacian on random geometric graphs towards the Laplace Beltrami operator, Found. Comput. Math. (2019)
  • N. García Trillos, F. Hoffmann, B. Hosseini, Geometric structure of graph Laplacian embeddings, preprint, ...
  • N. García Trillos et al., On the consistency of graph-based Bayesian learning and the scalability of sampling algorithms
  • N. García Trillos et al., A maximum principle argument for the uniform convergence of graph Laplacian regressors
JC was supported by NSF-DMS grant 1713691. NGT was supported by NSF grant DMS 1912802.