Abstract
The persistence diagram is an increasingly useful tool from Topological Data Analysis, but its use alongside typical machine learning techniques requires mathematical finesse. The most success to date has come from methods that map persistence diagrams into vector spaces, in a way which maximizes the structure preserved. This process is commonly referred to as featurization. In this paper, we describe a mathematical framework for featurization called template functions, and we show that it addresses the problem of approximating continuous functions on compact subsets of the space of persistence diagrams. Specifically, we begin by characterizing relative compactness with respect to the bottleneck distance, and then provide explicit theoretical methods for constructing compact-open dense subsets of continuous functions on persistence diagrams. These dense subsets—obtained via template functions—are leveraged for supervised learning tasks with persistence diagrams. Specifically, we test the method for classification and regression algorithms on several examples including shape data and dynamical systems.
Notes
Homology is computed with coefficients in a field \({\mathbf {k}}\).
References
H. Adams, T. Emerson, M. Kirby, R. Neville, C. Peterson, P. Shipman, S. Chepushtanova, E. Hanson, F. Motta, and L. Ziegelmeier, Persistence images: A stable vector representation of persistent homology, Journal of Machine Learning Research 18 (2017), no. 8, 1–35.
A. Adcock, E. Carlsson, and G. Carlsson, The ring of algebraic functions on persistence bar codes, Homology, Homotopy and Applications 18 (2016), no. 1, 381–402.
R. Anirudh, V. Venkataraman, K. N. Ramamurthy, and P. Turaga, A Riemannian framework for statistical analysis of topological persistence diagrams, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.
R. Baire, Sur les fonctions de variables réelles, Annali di Matematica Pura ed Applicata (1898-1922) 3 (1899), no. 1, 1–123.
P. Bendich, J. S. Marron, E. Miller, A. Pieloch, and S. Skwerer, Persistent homology analysis of brain artery trees, The Annals of Applied Statistics 10 (2016), no. 1, 198–218.
G. Benettin, L. Galgani, A. Giorgilli, and J. M. Strelcyn, Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Part 2: Numerical application, Meccanica 15 (1980), no. 1, 21–30.
J. Berrut and L. N. Trefethen, Barycentric Lagrange interpolation, SIAM Review 46 (2004), no. 3, 501–517.
E. Berry, Y. C. Chen, J. Cisewski-Kehe, and B. T. Fasy, Functional summaries of persistence diagrams, Journal of Applied and Computational Topology 4 (2020), no. 2, 211–262.
A. J. Blumberg, I. Gal, M. A. Mandell, and M. Pancia, Robust statistics, hypothesis testing, and confidence intervals for persistent homology on metric measure spaces, Foundations of Computational Mathematics 14 (2014), no. 4, 745–789.
P. Bubenik, Statistical topological data analysis using persistence landscapes, Journal of Machine Learning Research 16 (2015), 77–102.
P. Bubenik and A. Elchesen, Universality of persistence diagrams and the bottleneck and Wasserstein distances, arXiv preprint arXiv:1912.02563 (2019).
P. Bubenik and T. Vergili, Topological spaces of persistence modules and their properties, Journal of Applied and Computational Topology (2018).
G. Carlsson and V. De Silva, Zigzag persistence, Foundations of computational mathematics 10 (2010), no. 4, 367–405.
G. Carlsson and S. Kalisnik Verovsek, Symmetric and r-symmetric tropical polynomials and rational functions, Journal of Pure and Applied Algebra (2016), 3610–3627.
M. Carrière and U. Bauer, On the metric distortion of embedding persistence diagrams into reproducing kernel Hilbert spaces, arXiv:1806.06924 (2018).
M. Carrière, F. Chazal, Y. Ike, T. Lacombe, M. Royer, and Y. Umeda, Perslay: A neural network layer for persistence diagrams and new graph topological signatures, Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (Palermo, Sicily, Italy) (S. Chiappa and R. Calandra, eds.), Proceedings of Machine Learning Research, vol. 108, PMLR, 2020, pp. 2786–2796.
M. Carrière, M. Cuturi, and S. Oudot, Sliced Wasserstein kernel for persistence diagrams, Proceedings of the 34th International Conference on Machine Learning (Sydney NSW Australia) (D. Precup and Y. W. Teh, eds.), Proceedings of Machine Learning Research, vol. 70, PMLR, 06–11 Aug 2017, pp. 664–673.
M. Carrière, S. Oudot, and M. Ovsjanikov, Stable topological signatures for points on 3d shapes, Computer Graphics Forum 34 (2015), no. 5, 1–12.
F. Chazal, V. de Silva, M. Glisse, and S. Oudot, The structure and stability of persistence modules, Springer International Publishing, Switzerland, 2016.
F. Chazal, B. T. Fasy, F. Lecci, A. Rinaldo, and L. Wasserman, Stochastic convergence of persistence landscapes and silhouettes, Proceedings of the Thirtieth Annual Symposium on Computational Geometry (New York, NY, USA), SOCG’14, ACM, 2014, pp. 474:474–474:483.
Y. C. Chen, D. Wang, A. Rinaldo, and L. Wasserman, Statistical analysis of persistence intensity functions, arXiv preprint arXiv:1510.02502 (2015).
I. Chevyrev, V. Nanda, and H. Oberhauser, Persistence paths and signature features in topological data analysis, IEEE transactions on pattern analysis and machine intelligence 42 (2018), no. 1, 192–202.
M. K. Chung, P. Bubenik, and P. T. Kim, Persistence diagrams of cortical surface data, Information Processing in Medical Imaging (J. L. Prince, D. L. Pham, and K. J. Myers, eds.), Lecture Notes in Computer Science, vol. 5636, Springer Berlin Heidelberg, Williamsburg, VA, USA, 2009, pp. 386–397.
D. Cohen-Steiner, H. Edelsbrunner, and J. Harer, Stability of persistence diagrams, Discrete Comput. Geom. 37 (2007), no. 1, 103–120.
J. B. Conway, A course in functional analysis, vol. 96, Springer, New York, NY, USA, 2013.
R. Corbet, U. Fugacci, M. Kerber, C. Landi, and B. Wang, A kernel for multi-parameter persistent homology, Computers & graphics: X 2 (2019), 100005.
W. Crawley-Boevey, Decomposition of pointwise finite-dimensional persistence modules, Journal of Algebra and its Applications 14 (2015), no. 05, 1550066.
B. Di Fabio and M. Ferri, Comparing persistence diagrams through complex vectors, Image Analysis and Processing — ICIAP 2015, Springer International Publishing, Berlin, Heidelberg, 2015, pp. 294–305.
P. Diaconis, S. Holmes, and M. Shahshahani, Sampling from a manifold, Advances in Modern Statistical Theory and Applications: A Festschrift in honor of Morris L. Eaton, Institute of Mathematical Statistics, 2013, pp. 102–125.
V. Divol and T. Lacombe, Understanding the topology and the geometry of the space of persistence diagrams via optimal partial transport, Journal of Applied and Computational Topology 5 (2021), no. 1, 1–53.
P. Donatini, P. Frosini, and A. Lovato, Size functions for signature recognition, Vision Geometry VII (San Diego, CA, United States) (R. A. Melter, A. Y. Wu, and L. J. Latecki, eds.), SPIE, 1998.
J. P. Eckmann and D. Ruelle, Ergodic theory of chaos and strange attractors, Rev. Mod. Phys. 57 (1985), 617–656.
B. T. Fasy, F. Lecci, A. Rinaldo, L. Wasserman, S. Balakrishnan, and A. Singh, Confidence sets for persistence diagrams, Annals of Statistics 42 (2014), no. 6, 2301–2339.
M. Ferri, P. Frosini, A. Lovato, and C. Zambelli, Point selection: A new comparison scheme for size functions (with an application to monogram recognition), Proceedings of the Third Asian Conference on Computer Vision - Volume I (Berlin, Heidelberg), ACCV ’98, Springer-Verlag, 1998, pp. 329–337.
G. A. Gottwald and I. Melbourne, A new test for chaos in deterministic systems, Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 460 (2004), no. 2042, 603–611.
G. A. Gottwald and I. Melbourne, On the validity of the 0–1 test for chaos, Nonlinearity 22 (2009), no. 6, 1367.
G. A. Gottwald and I. Melbourne, The 0-1 test for chaos: A review, Chaos Detection and Predictability (C. Skokos, G. A. Gottwald, and J. Laskar, eds.), Springer, Berlin, Germany, 2016, pp. 221–247.
M. Henon, On the numerical computation of Poincaré maps, Physica D: Nonlinear Phenomena 5 (1982), no. 2, 412 – 414.
S. Kališnik, Tropical coordinates on the space of persistence barcodes, Foundations of Computational Mathematics (2018).
G. Kusano, K. Fukumizu, and Y. Hiraoka, Kernel method for persistence diagrams via kernel embedding and weight factor, Journal of Machine Learning Research 18 (2018), no. 189, 1–41.
G. Kusano, Y. Hiraoka, and K. Fukumizu, Persistence weighted Gaussian kernel for topological data analysis, International Conference on Machine Learning, 2016, pp. 2004–2013.
G. Kusano, Y. Hiraoka, and K. Fukumizu, Persistence weighted Gaussian kernel for topological data analysis, ICML, 2016.
R. Kwitt, S. Huber, M. Niethammer, W. Lin, and U. Bauer, Statistical topological data analysis - a kernel perspective, Advances in Neural Information Processing Systems 28 (C. Cortes, N.D. Lawrence, D.D. Lee, M. Sugiyama, and R. Garnett, eds.), Curran Associates, Inc., Montreal, Quebec, Canada, 2015, pp. 3052–3060.
T. Le and M. Yamada, Persistence Fisher kernel: A Riemannian manifold kernel for persistence diagrams, 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada., 2018.
M. Lesnick, The theory of the interleaving distance on multidimensional persistence modules, Foundations of Computational Mathematics 15 (2015), no. 3, 613–650 (English).
C. Li, M. Ovsjanikov, and F. Chazal, Persistence-based structural recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1995–2002.
M. McCullough, M. Small, T. Stemler, and H. Ho-Ching Iu, Time lagged ordinal partition networks for capturing dynamics of continuous dynamical systems, Chaos: An Interdisciplinary Journal of Nonlinear Science 25 (2015), no. 5, 053101.
Y. Mileyko, S. Mukherjee, and J. Harer, Probability measures on the space of persistence diagrams, Inverse Problems 27 (2011), no. 12, 124007.
E. Munch, K. Turner, P. Bendich, S. Mukherjee, J. Mattingly, and J. Harer, Probabilistic Fréchet means for time varying persistence diagrams, Electron. J. Statist. 9 (2015), 1173–1204.
D. Pachauri, C. Hinrichs, M. K. Chung, S. C. Johnson, and V. Singh, Topology-based kernels with application to inference problems in Alzheimer’s disease, IEEE Transactions on Medical Imaging 30 (2011), no. 10, 1760–1770.
T. Padellini and P. Brutti, Persistence flamelets: Multiscale persistent homology for kernel density exploration, arXiv preprint arXiv:1709.07097 (2017).
P. Palaniyandi, On computing Poincaré map by Hénon method, Chaos, Solitons & Fractals 39 (2009), no. 4, 1877 – 1882.
D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. Bronstein, M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, and J. Ye, SHREC’14 track: Shape retrieval of non-rigid 3d human models, Proceedings of the 7th Eurographics workshop on 3D Object Retrieval, EG 3DOR’14, Eurographics Association, 2014.
L. Polanco and J. A. Perea, Adaptive template systems: Data-driven feature selection for learning with persistence diagrams, 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), IEEE, 2019, pp. 1115–1121.
J. Reininghaus, S. Huber, U. Bauer, and R. Kwitt, A stable multi-scale kernel for topological machine learning, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
D. Rouse, A. Watkins, D. Porter, J. Harer, P. Bendich, N. Strawn, E. Munch, J. DeSena, J. Clarke, J. Gilbert, P. Chin, and A. Newman, Feature-aided multiple hypothesis tracking using topological and statistical behavior classifiers, Signal Processing, Sensor/Information Fusion, and Target Recognition XXIV (Baltimore, Maryland, United States) (I. Kadar, ed.), SPIE, May 2015.
W. Rudin, Real and complex analysis, Tata McGraw-Hill Education, New York, NY, USA, 2006.
M. Sandri, Numerical calculation of Lyapunov exponents, The Mathematica Journal 6 (1986), no. 3, 78–84.
N. Singh, H. D. Couture, J. S. Marron, C. Perou, and M. Niethammer, Topological descriptors of histology images, Machine Learning in Medical Imaging: 5th International Workshop, MLMI 2014, Held in Conjunction with MICCAI 2014, Boston, MA, USA, September 14, 2014. Proceedings (Boston, MA, USA) (G. Wu, D. Zhang, and L. Zhou, eds.), Springer International Publishing, 2014, pp. 231–239.
J. Sun, M. Ovsjanikov, and L. Guibas, A concise and provably informative multi-scale signature based on heat diffusion, Proceedings of the Symposium on Geometry Processing (Aire-la-Ville, Switzerland), SGP ’09, Eurographics Association, 2009, pp. 1383–1392.
L. N. Trefethen, Approximation theory and approximation practice (applied mathematics), SIAM, Philadelphia, PA, USA, 2012.
K. Turner, Y. Mileyko, S. Mukherjee, and J. Harer, Fréchet means for distributions of persistence diagrams, Discrete & Computational Geometry 52 (2014), no. 1, 44–70 (English).
S. Tymochko, E. Munch, and F. A. Khasawneh, Adaptive partitioning for template functions on persistence diagrams, 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), IEEE, 2019, pp. 1227–1234.
A. Wagner, Nonembeddability of persistence diagrams with \(p > 2\) Wasserstein metric, arXiv preprint arXiv:1910.13935 (2019).
M. C. Yesilli, F. A. Khasawneh, and A. Otto, Topological feature vectors for chatter detection in turning processes, The International Journal of Advanced Manufacturing Technology (2022), 1–27.
M. C. Yesilli, S. Tymochko, F. A. Khasawneh, and E. Munch, Chatter diagnosis in milling using supervised learning and topological features vector, 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), IEEE, 2019, pp. 1211–1218.
Q. Zhao and Y. Wang, Learning metrics for persistence-based summaries and applications for graph classification, Proceedings of the 33rd International Conference on Neural Information Processing Systems (Red Hook, NY, USA), Curran Associates Inc., 2019, pp. 9859–9870.
X. Zhu, A. Vartanian, M. Bansal, D. Nguyen, and L. Brandl, Stochastic multiresolution persistent homology kernel., IJCAI, 2016, pp. 2449–2457.
B. Zieliński, M. Lipiński, M. Juda, M. Zeppelzauer, and P. Dłotko, Persistence bag-of-words for topological data analysis, Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI’19, AAAI Press, 2019, p. 4489–4495.
B. Zieliński, M. Lipiński, M. Juda, M. Zeppelzauer, and P. Dłotko, Persistence codebooks for topological data analysis, Artificial Intelligence Review 54 (2021), no. 3, 1969–2009.
Acknowledgements
JAP acknowledges the support of the National Science Foundation (NSF) under grants DMS-1622301, CCF-2006661, CAREER award DMS-1943758, and DARPA under grant HR0011-16-2-003. EM was supported by the NSF through grants CMMI-1800466, DMS-1800446, CCF-1907591, and CCF-2106578. FAK was supported by the NSF through grants CMMI-1759823 and DMS-1759824.
Additional information
Communicated by Shmuel Weinberger.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Implementation of the Interpolating Polynomials Algorithm
In this appendix, we give more details on the implementation of the interpolating polynomials described in Sect. 6.2. The barycentric formula for Lagrange interpolation described by [7] is given by
where \({\mathcal {A}}= \{a_i\}_{i=0}^m \subset {\mathbb {R}}\) is a finite set of distinct mesh values, and \(\{c_i \in {\mathbb {R}}\}_{i=0}^m\) is a collection of evaluation values. The function in Eq. (A1) has the property that \(f(a_i) = c_i\) for all i, and the underlying Lagrange basis polynomials \(\ell _j\) satisfy the partition of unity condition \(\sum \nolimits _{j=0}^{m}{\ell _j(x)}= 1, \,\, \forall \, x \).
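For reference, the barycentric form of the Lagrange interpolant discussed in [7] can be written in the notation above as follows. The symbols \(w_j\) (barycentric weights) and \(\ell _j\) (Lagrange basis polynomials) are labels assumed here for illustration, and this display is a sketch of the standard formula rather than a verbatim restatement of Eq. (A1).

$$\begin{aligned} f(x) = \frac{\sum \nolimits _{j=0}^{m} \dfrac{w_j}{x - a_j}\, c_j}{\sum \nolimits _{j=0}^{m} \dfrac{w_j}{x - a_j}}, \qquad w_j = \frac{1}{\prod _{k \ne j} (a_j - a_k)}. \end{aligned}$$

Equivalently, \(f(x) = \sum \nolimits _{j=0}^{m} c_j\, \ell _j(x)\) with \(\ell _j(x) = \prod _{k \ne j} \frac{x - a_k}{a_j - a_k}\). Since \(\ell _j(a_i)\) equals 1 when \(i = j\) and 0 otherwise, this gives \(f(a_i) = c_i\), and taking all \(c_j = 1\) shows that the basis polynomials sum to 1.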
Barycentric Lagrange interpolation is often used for approximating \({\mathbb {R}}\)-valued functions of one real variable, and there are efficient algorithms for obtaining the weights associated with it. However, in our formulation we need an interpolating polynomial for a function defined on \({\mathbb {R}}^2\). Therefore, we next describe how to extend the algorithm for interpolating a function on the line to interpolating a function on the plane. Note that the notation used here is independent of that used in Sect. 6.2.
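The one-dimensional building block can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names `barycentric_weights` and `interpolation_matrix` are hypothetical, and the weights are computed by the direct product formula rather than by any optimized scheme.

```python
import numpy as np

def barycentric_weights(mesh):
    """Weights w_j = 1 / prod_{k != j} (a_j - a_k) for distinct mesh values."""
    mesh = np.asarray(mesh, dtype=float)
    diffs = mesh[:, None] - mesh[None, :]   # diffs[j, k] = a_j - a_k
    np.fill_diagonal(diffs, 1.0)            # leave out the k == j factor
    return 1.0 / diffs.prod(axis=1)

def interpolation_matrix(mesh, queries):
    """Return L with L[q, j] = ell_j(queries[q]) via the barycentric formula."""
    mesh = np.asarray(mesh, dtype=float)
    queries = np.asarray(queries, dtype=float)
    w = barycentric_weights(mesh)
    diff = queries[:, None] - mesh[None, :]  # shape (N, m+1)
    exact = diff == 0.0                      # query coincides with a mesh value
    diff[exact] = 1.0                        # placeholder, avoids division by zero
    terms = w[None, :] / diff
    L = terms / terms.sum(axis=1, keepdims=True)
    # A query sitting exactly on a mesh value gets an indicator row.
    hit_rows = exact.any(axis=1)
    L[hit_rows] = exact[hit_rows].astype(float)
    return L

# Illustrative data (assumed): a cubic sampled on four mesh values.
mesh = np.array([0.0, 0.5, 1.5, 2.0])
values = mesh**3 - 2.0 * mesh
queries = np.array([0.25, 1.0, 1.5])

L = interpolation_matrix(mesh, queries)
interpolated = L @ values                    # interpolant evaluated at the queries
print(interpolated)
print(L.sum(axis=1))                         # each row sums to one
```

Because four mesh values determine a cubic exactly, `interpolated` agrees with the cubic at the query points up to rounding, and the row sums illustrate the partition of unity property.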
We assume that our planar mesh is the outer product of \(m+1\) mesh points along the birth time x-axis, and \(n+1\) points along the lifetime y-axis. We also assume that the persistence diagram has N pairs of (birth, lifetime) points.
1. Get \(\tilde{\gamma }\) and \(\phi \) which correspond to the interpolation matrices along the x-mesh and the y-mesh, respectively. These are the matrices that describe the linear transformation from the \(m+1\) mesh points of birth times (\(n+1\) mesh of lifetimes) to the corresponding interpolated values of the N query birth times (N query lifetimes) for a given diagram. This step is equivalent to separately obtaining the interpolation matrices for the birth times and the lifetimes.
2. Set \(\gamma =\tilde{\gamma }^T\).
3. (a) Replicate each column in \(\gamma \) \(n+1\) times to obtain \(\Gamma \) whose dimensions are \((m+1)\times (N\times (n+1))\).
   (b) Unravel \(\phi \) row-wise into a row vector, then replicate each row \(m+1\) times to obtain \(\Phi \) whose dimensions are \((m+1)\times (N\times (n+1))\).
4. Use element-wise multiplication to obtain \(\tilde{\Psi }=\Gamma \cdot \Phi \), where \(\cdot \) means element-wise multiplication, and \(\tilde{\Psi }\) has dimension \((m+1)\times (N\times (n+1))\).
5. (a) Split \(\tilde{\Psi }\) into N chunks of \((m+1)\times (n+1)\) matrices along the columns axis.
   (b) Concatenate the split pieces row-wise to obtain an \((N\times (m+1))\times (n+1)\) matrix \(\Psi \).
6. Reshape \(\Psi \) by concatenating each \((m+1)\times (n+1)\) piece row-wise to obtain an \(N \times ((m+1)\times (n+1))\) matrix \(\Xi \).
7. Let the 2D base mesh be given as
$$\begin{aligned} \begin{bmatrix} f_{00} &{} f_{01} &{} \ldots &{} f_{0n} \\ f_{10} &{} f_{11} &{} \ldots &{} f_{1n} \\ \vdots &{} &{} &{} \vdots \\ f_{m0} &{} f_{m1} &{} \ldots &{} f_{mn} \end{bmatrix}, \end{aligned}$$
where \(f_{ij} = f(x_i, y_j)\) and \((x_i, y_j)\) is a unique point in the 2D mesh. Define the vector \([f_{00} \, f_{01} \, \ldots \, f_{mn}]\) which is obtained by unraveling the 2D mesh row-wise.
8. We can interpolate the query points \((x_q, y_q)\) using
$$\begin{aligned} p(x_q, y_q) = \begin{bmatrix} \ell _0(x_0) \ell _0(y_0) &{} \ldots &{} \ell _m(x_0) \ell _n(y_0) \\ \ell _0(x_1) \ell _0(y_1) &{} \ldots &{} \ell _m(x_1) \ell _n(y_1) \\ \vdots &{} &{} \vdots \\ \ell _0(x_{N-1}) \ell _0(y_{N-1}) &{} \ldots &{} \ell _m(x_{N-1}) \ell _n(y_{N-1}) \end{bmatrix} \begin{bmatrix} f_{00} \\ f_{01} \\ \vdots \\ f_{mn} \end{bmatrix}. \end{aligned}$$
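The eight steps above can be sketched directly with NumPy array operations. The following is a minimal sketch under stated assumptions, not the teaspoon implementation: the helper names `lagrange_matrix` and `planar_interpolation_matrix` are hypothetical, the one-dimensional interpolation matrices are built from the product form of the Lagrange basis for simplicity, and "concatenate row-wise" is read as stacking the pieces vertically.

```python
import numpy as np

def lagrange_matrix(mesh, queries):
    """L[q, j] = ell_j(queries[q]), using the product form of the Lagrange basis."""
    mesh = np.asarray(mesh, dtype=float)
    queries = np.asarray(queries, dtype=float)
    L = np.ones((queries.size, mesh.size))
    for j in range(mesh.size):
        for k in range(mesh.size):
            if k != j:
                L[:, j] *= (queries - mesh[k]) / (mesh[j] - mesh[k])
    return L

def planar_interpolation_matrix(x_mesh, y_mesh, births, lifetimes):
    """Return Xi with Xi[q, i*(n+1) + j] = ell_i(birth_q) * ell_j(lifetime_q)."""
    gamma_tilde = lagrange_matrix(x_mesh, births)   # step 1: N x (m+1)
    phi = lagrange_matrix(y_mesh, lifetimes)        # step 1: N x (n+1)
    N, m1 = gamma_tilde.shape
    n1 = phi.shape[1]
    gamma = gamma_tilde.T                           # step 2: (m+1) x N
    Gamma = np.repeat(gamma, n1, axis=1)            # step 3(a): (m+1) x N(n+1)
    Phi = np.tile(phi.reshape(1, -1), (m1, 1))      # step 3(b): (m+1) x N(n+1)
    Psi_tilde = Gamma * Phi                         # step 4: element-wise product
    chunks = np.split(Psi_tilde, N, axis=1)         # step 5(a): N pieces, (m+1) x (n+1)
    Psi = np.concatenate(chunks, axis=0)            # step 5(b): N(m+1) x (n+1)
    return Psi.reshape(N, m1 * n1)                  # step 6: N x (m+1)(n+1)

# Illustrative data (assumed): a small mesh and four (birth, lifetime) pairs.
x_mesh = np.array([0.0, 1.0, 2.0])                  # m + 1 = 3 birth mesh values
y_mesh = np.array([0.5, 1.5])                       # n + 1 = 2 lifetime mesh values
births = np.array([0.2, 0.9, 1.7, 1.0])
lifetimes = np.array([0.6, 1.1, 0.5, 1.4])

def f(x, y):
    # Quadratic in x and linear in y, so this mesh reproduces it exactly.
    return x**2 * y + 3.0 * y - x

F = f(x_mesh[:, None], y_mesh[None, :])             # step 7: base mesh values f_ij
f_vec = F.ravel()                                   # step 7: row-wise unraveling
Xi = planar_interpolation_matrix(x_mesh, y_mesh, births, lifetimes)
p = Xi @ f_vec                                      # step 8: values at the query points
print(np.max(np.abs(p - f(births, lifetimes))))
```

The test function is chosen so that the tensor-product interpolant is exact on this mesh, which makes the printed discrepancy a direct check that the replication, splitting, and reshaping steps line up with the row-wise unraveling of the mesh values.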
The shapes of the resulting matrices are as follows. Both \(\Gamma \) and \(\Phi \) have dimension \((m+1)\times (N\times (n+1))\). We can now compute the elementwise product \(\Psi = \Gamma \cdot \Phi \), which also has dimension \((m+1)\times (N\times (n+1))\). We then need to apply the following operations: (i) reshaping \(\Psi \) to obtain the \((N\times (m+1))\times (n+1)\) matrix \({\hat{\Psi }}_1\), and (ii) unraveling \({\hat{\Psi }}_1\) into an \(N\times ((m+1)\times (n+1))\) matrix \({\hat{\Psi }}_2\).
In this study we summed the rows of \({\hat{\Psi }}_2\) after taking the absolute value of each entry. The resulting number represents the score at each base mesh point. The collection of all the scores constitutes the feature vector corresponding to the chosen base mesh and to the query points, where the latter are the persistence diagram points. If instead we want the interpolated values \(p_{\mathrm{interp}}\) at the query points, given the vector f of function values on the mesh, then we would compute \(p_{\mathrm{interp}}={\hat{\Psi }}_2\, f\).
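The scoring step can be illustrated as follows. This is a minimal sketch under stated assumptions: the names `lagrange_matrix` and `template_scores` are hypothetical, "summing the rows" is read as adding the N rows together (one score per base mesh point), and the matrix \({\hat{\Psi }}_2\) is assembled with a single `einsum` call instead of the replicate-and-reshape steps, which is assumed to produce the same entries.

```python
import numpy as np

def lagrange_matrix(mesh, queries):
    """L[q, j] = ell_j(queries[q]), using the product form of the Lagrange basis."""
    mesh = np.asarray(mesh, dtype=float)
    queries = np.asarray(queries, dtype=float)
    L = np.ones((queries.size, mesh.size))
    for j in range(mesh.size):
        for k in range(mesh.size):
            if k != j:
                L[:, j] *= (queries - mesh[k]) / (mesh[j] - mesh[k])
    return L

def template_scores(x_mesh, y_mesh, diagram):
    """One score per base mesh point: sum over diagram points of |ell_i * ell_j|."""
    diagram = np.asarray(diagram, dtype=float)       # rows are (birth, lifetime)
    Lx = lagrange_matrix(x_mesh, diagram[:, 0])      # N x (m+1)
    Ly = lagrange_matrix(y_mesh, diagram[:, 1])      # N x (n+1)
    # Psi_hat_2[q, i*(n+1) + j] = ell_i(birth_q) * ell_j(lifetime_q)
    Psi_hat_2 = np.einsum("qi,qj->qij", Lx, Ly).reshape(len(diagram), -1)
    return np.abs(Psi_hat_2).sum(axis=0)             # add the N rows together

# Illustrative data (assumed): a 3 x 2 mesh and a diagram with three points.
x_mesh = np.array([0.0, 1.0, 2.0])
y_mesh = np.array([0.5, 1.5])
diagram = np.array([[0.2, 0.6], [0.9, 1.1], [1.7, 0.5]])

features = template_scores(x_mesh, y_mesh, diagram)
print(features.shape)                                # one entry per mesh point
```

Each diagram yields a vector whose length is the number of base mesh points, independent of how many points the diagram contains, which is what allows diagrams of different sizes to be passed to standard classifiers or regressors.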
The implementation of this algorithm can be found in the teaspoon package at teaspoon.ML.feature_functions.interp_polynomial.
Appendix B: Additional Shape Data Results
This appendix gives additional results for the SHREC data set described in Sect. 8.4 using tent functions instead of interpolating polynomials. Table 4 should be compared to the results of Table 3.
Cite this article
Perea, J.A., Munch, E. & Khasawneh, F.A. Approximating Continuous Functions on Persistence Diagrams Using Template Functions. Found Comput Math 23, 1215–1272 (2023). https://doi.org/10.1007/s10208-022-09567-7
DOI: https://doi.org/10.1007/s10208-022-09567-7