Compressibility analysis of asymptotically mean stationary processes

This paper is dedicated to Jorge Eugenio Silva (1949-2021).
https://doi.org/10.1016/j.acha.2021.08.002

Abstract

This work provides new results for the analysis of random sequences in terms of p-compressibility. The results characterize the degree to which a random sequence can be approximated by its best k-sparse version under different rates of significant coefficients (compressibility analysis). In particular, the notion of strong p-characterization is introduced to denote a random sequence that has a well-defined asymptotic limit (sample-wise) of its best k-term approximation error when a fixed rate of significant coefficients is considered (fixed-rate analysis). The main theorem of this work shows that the rich family of asymptotically mean stationary (AMS) processes has a strong p-characterization. Furthermore, we present results that characterize and analyze the p-approximation error function for this family of processes. Adding ergodicity to the analysis of AMS processes, we introduce a theorem demonstrating that the approximation error function is constant and determined in closed form by the stationary mean of the process. Our results and analyses contribute to the theory and understanding of discrete-time sparse processes and, on the technical side, confirm how instrumental the point-wise ergodic theorem is in determining the compressibility of discrete-time processes even when the stationarity and ergodicity assumptions are relaxed.

Introduction

Quantifying sparsity and compressibility for random sequences has been a topic of active research, largely motivated by results on sparse signal recovery and compressed sensing (CS) [1], [2], [3], [4], [5], [6], [7]. Sparsity and compressibility can be understood, in general, as the degree to which one can represent a random sequence (perfectly and loosely, respectively) by its best k-sparse version in the non-trivial regime where k (the number of significant coefficients) is smaller than the signal or ambient dimension. Various forms of compressibility for a random sequence have been used in signal processing problems, for instance in regression [8], signal reconstruction (in the classical random Gaussian linear measurement setting used in CS) [2], [3], and inference-decision [9], [10]. Compressibility for random sequences has also been used to analyze continuous-time processes [5], [7], for instance, periodic generalized Lévy processes [11].

A discrete-time process is a well-defined infinite-dimensional random object; however, the standard approach used to measure compressibility for finite-dimensional signals (based on the rate of decay of the absolute approximation error) does not extend naturally to this infinite-dimensional setting. Addressing this issue, Amini et al. [1] and Gribonval et al. [2] proposed a relative approximation error analysis to measure compressibility, with the objective of quantifying the rate of the best k-term approximation error relative to the energy of the signal when the number of significant coefficients scales proportionally to the dimension of the signal. This approach offered a meaningful way to determine the energy (and more generally the p-norm) concentration signature of independent and identically distributed (i.i.d.) processes [1], [2]. In particular, they introduced the concept of p-compressibility to name a random sequence that has the capacity to concentrate (with very high probability) almost all of its p-relative energy in an arbitrarily small number of coordinates (relative to the ambient dimension) of the canonical or innovation domain.

Two important results were presented for i.i.d. processes. [1, Theorem 3] showed that i.i.d. processes with heavy-tailed distributions (including the generalized Pareto, Student's t, and log-logistic) are p-compressible for some p-norms. On the other hand, [1, Theorem 1] showed that i.i.d. processes with exponentially decaying tails (such as the Gaussian, Laplacian, and generalized Gaussian) are not p-compressible for any p-norm. Completing this analysis, Silva et al. [3] established a necessary and sufficient condition on the process distribution for p-compressibility (in the sense of Amini et al. [1, Def. 6]) that reduces to examining the p-moment of the 1D marginal stationary distribution of the process.
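This dichotomy is easy to observe empirically. The following sketch (our own illustration, not code from [1] or [3]) contrasts the relative 2-approximation error of i.i.d. Gaussian samples against i.i.d. Student's t samples when only the 1% largest-magnitude coefficients are kept:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100_000, 2
k = n // 100                      # keep the 1% largest-magnitude coefficients

def rel_error(x, k, p):
    """Relative best k-term p-approximation error of a sample vector."""
    m = np.sort(np.abs(x)) ** p   # ascending magnitudes, raised to the power p
    return (m[: len(x) - k].sum() / m.sum()) ** (1.0 / p)

gauss = rng.standard_normal(n)           # exponentially decaying tails
heavy = rng.standard_t(df=1.5, size=n)   # heavy tails (infinite variance)

print(rel_error(gauss, k, p))   # stays close to 1: the Gaussian is not 2-compressible
print(rel_error(heavy, k, p))   # small: the energy concentrates in few coordinates
```

For the Gaussian sample the 1% largest coefficients carry only a small fraction of the energy, so the relative error stays near 1, while for the heavy-tailed sample a handful of extreme coefficients dominate the energy.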

Importantly, the proof of the result in [3] was rooted in the almost sure convergence of two empirical distributions (random objects that are functions of the process) to their respective probabilities as the number of samples goes to infinity.1 This argument offered the context to move from the law of large numbers (used to characterize i.i.d. processes) to the point-wise ergodic theorem [12], [13]. A necessary and sufficient condition for p-compressibility was then obtained for the family of stationary and ergodic sources under the mild assumption that the process distribution projected on one coordinate, i.e., its 1D marginal distribution on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$, has a density [3, Theorem 1]. Furthermore, for non-p-compressible processes, Silva et al. [3] provided a closed-form expression for the so-called p-approximation error function, meaning that a stable asymptotic value of the relative p-approximation error is obtained when the rate of significant coefficients is given (fixed-rate analysis).

Considering that the proof of [3, Theorem 1] relies heavily on an almost sure (with probability one) convergence of empirical means to their respective expectations, relaxing some of the assumptions on the process, in particular stationarity, suggests an interesting direction for extending the analysis of p-compressibility to general discrete-time processes. In this work, we extend the compressibility analysis to a family of random sequences for which neither stationarity nor ergodicity is assumed, by examining the rich family of processes with ergodic properties and, in particular, the important family of asymptotically mean stationary (AMS) processes [12], [14]. This family has been studied in source and channel coding problems, where its ergodic properties (with respect to the family of indicator functions) have been used to extend fundamental performance limits. Our interest in AMS processes centers on the fact that the p-characterization in [3] is fundamentally rooted in a form of ergodic property over a family of indicator functions; this family of measurable functions is precisely where AMS sources have (by definition) a stable almost-sure asymptotic behavior [12].

Specifically, we apply a more refined and relevant (sample-wise) almost sure fixed-rate analysis of p-approximation errors, first considered by Gribonval et al. [2], to the analysis of a process. Through this analysis, we determine the relationship between the rate of significant coefficients and the p-approximation error of a process in two main results. Our first main result (Theorem 1) shows that this rate vs. approximation error trade-off has a well-defined expression as a function of the process distribution (in particular, the stationary mean of the process) for the complete collection of AMS and ergodic processes. This result relaxes stationarity as well as some of the regularity assumptions used in [3, Theorem 1]; consequently, it is a significant extension of that result. As a direct implication of this theorem, we extend the dichotomy of the p-compressible process presented in [3, Theorem 1] to the family of AMS ergodic processes (Theorem 1 in Section 3.3).

The second main result of this work (Theorem 3) uses the ergodic decomposition theorem (EDT) [12] to extend the strong p-characterization to the family of AMS processes, where the ergodicity and stationarity assumptions on the process have been relaxed. Remarkably, we show that this family of processes does have a stable (almost sure) asymptotic p-approximation error for any given rate of significant coefficients as the block length of the analysis tends to infinity. Interestingly, this limiting value is in general a measurable (non-constant) function of the process, which is fully determined by the so-called ergodic decomposition (ED) function that maps elements of the sample space of the process to stationary and ergodic components [12].

The rest of the paper is organized as follows. Section 2 introduces notations, preliminary results and some basic elements of the p-compressibility analysis. In particular, Section 2.1 introduces the fixed-rate almost sure approximation error analysis that is the focus of this work. Sections 3 and 4 present the two main results of this paper for AMS processes. The summary and final discussion of the results are presented in Section 5. To conclude, Section 6 provides some context for the construction of AMS processes based on the basic principle of passing an innovation process through deterministic (coding) and random (channel) processing stages. Section 7 presents a numerical strategy to estimate the p-approximation error function and some examples to illustrate the main results of this work. The proofs of the two main results (Theorem 1, Theorem 3) are presented in Sections 8 and 9, respectively, while the proofs of supporting results are relegated to the Appendices.

Section snippets

Preliminaries

For any vector $x^n=(x_1,\ldots,x_n)$ in $\mathbb{R}^n$, let $(x_{n,1},\ldots,x_{n,n})\in\mathbb{R}^n$ denote the ordered vector such that $|x_{n,1}|\geq|x_{n,2}|\geq\cdots\geq|x_{n,n}|$. For $p>0$ and $k\in\{1,\ldots,n\}$, let
$$\sigma_p(k,x^n)\equiv\left(|x_{n,k+1}|^p+\cdots+|x_{n,n}|^p\right)^{1/p}$$
be the best k-term p-approximation error of $x^n$, in the sense that if
$$\Sigma_k^n\equiv\{x^n\in\mathbb{R}^n:\sigma_p(k,x^n)=0\}$$
is the collection of k-sparse signals, then $\sigma_p(k,x^n)=\min_{\tilde{x}^n\in\Sigma_k^n}\|x^n-\tilde{x}^n\|_p$.

Amini et al. [1] and Gribonval et al. [2] proposed the following relative best k-term p-approximation error
$$\tilde{\sigma}_p(k,x^n)\equiv\frac{\sigma_p(k,x^n)}{\|x^n\|_p}\in[0,1],\quad k\in\{1,\ldots,n\},$$
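As a sanity check on the definitions above, the quantities $\sigma_p(k,x^n)$ and $\tilde{\sigma}_p(k,x^n)$ can be computed directly. The following is a minimal numpy sketch (the function names are ours, not from the paper):

```python
import numpy as np

def sigma_p(x, k, p=2):
    """Best k-term p-approximation error: p-norm of the n-k smallest-magnitude entries."""
    mags = np.sort(np.abs(x))[::-1]          # |x_{n,1}| >= ... >= |x_{n,n}|
    return np.sum(mags[k:] ** p) ** (1.0 / p)

def sigma_p_rel(x, k, p=2):
    """Relative best k-term p-approximation error, always in [0, 1]."""
    return sigma_p(x, k, p) / np.linalg.norm(x, ord=p)

x = np.array([3.0, -1.0, 0.5, 4.0])
print(sigma_p(x, 2))       # keeps 4 and 3; residual sqrt(1 + 0.25) ≈ 1.118
print(sigma_p_rel(x, 2))   # ≈ 1.118 / sqrt(26.25) ≈ 0.218
```

The minimizer interpretation is immediate: zeroing all but the k largest-magnitude coordinates is the best k-sparse approximation in any p-norm, so the residual is exactly the p-norm of the discarded coordinates.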

Strong p-characterization for AMS and ergodic processes

To present the main result of this section, we first need to introduce some notation and definitions for the statement of Theorem 1. Let $X=(X_n)_{n\geq 1}$ be an AMS process with stationary mean $\bar{\mu}=\{\bar{\mu}^n: n\geq 1\}$. If the p-moment of the 1D marginal $\bar{\mu}^1$ is well defined, i.e., $\int_{\mathbb{R}}|x|^p\,d\bar{\mu}^1(x)<\infty$, then we can introduce the following induced probability $v_p\in\mathcal{P}(\mathbb{R})$, with $v_p\ll\bar{\mu}^1$, by
$$v_p(B)\equiv\frac{\int_B |x|^p\,d\bar{\mu}^1(x)}{\int_{\mathbb{R}}|x|^p\,d\bar{\mu}^1(x)},\quad \forall B\in\mathcal{B}(\mathbb{R}).$$
In addition, let us define the following tail sets: $B_\tau\equiv(-\infty,-\tau]\cup[\tau,\infty)$ and $C_\tau\equiv(-\infty,-\tau)\cup(\tau,\infty)$, for any $\tau$
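For intuition, when $\bar{\mu}^1$ is the standard Gaussian and p = 2, the mass $v_p(B_\tau)$ is the fraction of the energy carried by the tail $|x|\geq\tau$. A Monte Carlo sketch (our own illustration, not from the paper) agrees with the analytic value:

```python
import numpy as np

# Monte Carlo check of v_p(B_tau) when mu_bar^1 is the standard Gaussian.
rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)
p, tau = 2, 1.0

num = np.sum(np.abs(x[np.abs(x) >= tau]) ** p)   # integral of |x|^p over B_tau
den = np.sum(np.abs(x) ** p)                     # p-moment of mu_bar^1
v_p_B = num / den

# Analytic value for p = 2: v_2(B_tau) = 2*(tau*phi(tau) + Q(tau)) ≈ 0.8013 at tau = 1,
# with phi the standard normal pdf and Q its tail probability.
print(v_p_B)
```

The point of the reweighting by $|x|^p$ is visible here: the tail $|x|\geq 1$ has probability about 0.32 under $\bar{\mu}^1$ but carries about 80% of the mass under $v_2$.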

Strong p-characterization for AMS processes

Relaxing the ergodicity assumption for an AMS source is the focus of this part. It is worth noting that the ergodic result in Theorem 1 will be instrumental for this analysis in view of the ergodic decomposition (ED) theorem for AMS sources, nicely presented in [12, Ths. 8.3 and 10.1] and references therein. In a nutshell, the ED theorem shows that the stationary mean of an AMS process (see Definition 9) can be decomposed as a convex combination of stationary and ergodic distributions (called the

Summary and discussion of the results

In this work, we revisit the notion of p-compressibility focusing on the study of the almost sure (with probability one) limit of the p-relative best k-term approximation error when a fixed rate of significant coefficients is considered for the analysis. We consider the study of processes with general ergodic properties, relaxing the stationarity and ergodicity assumptions considered in previous work. Interestingly, we found that the family of asymptotically mean stationary (AMS) processes has an

On the construction and processing of AMS processes

To conclude this paper, we provide some context to support the application of our results in Sections 3 and 4. We consider a general generative scenario in which a process is constructed as the output of an innovation source passing through a signal processor (or coding process) and a random corruption (or channel). This scenario permits us to observe a family of operations on a stationary and ergodic source (for example an i.i.d. source) that produces a process with a strong p
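As an illustrative instance of this generative scenario (our own toy construction, not an example taken from the paper), consider an i.i.d. Gaussian innovation passed through a deterministic periodic gain acting as a simple coding stage. The output is cyclostationary, hence non-stationary but AMS, and its fixed-rate relative approximation error is sample-wise stable across block lengths:

```python
import numpy as np

rng = np.random.default_rng(2)

def rel_err(x, rate, p=2):
    """Relative best k-term p-approximation error at a fixed rate of coefficients."""
    k = int(rate * len(x))
    m = np.sort(np.abs(x)) ** p
    return (m[: len(x) - k].sum() / m.sum()) ** (1.0 / p)

def modulated(n):
    """i.i.d. Gaussian innovation through a deterministic periodic gain (a toy 'coder').
    The output is cyclostationary: non-stationary, but AMS."""
    z = rng.standard_normal(n)
    gain = 1.0 + 0.5 * np.cos(2 * np.pi * np.arange(n) / 7)
    return gain * z

e1 = rel_err(modulated(50_000), rate=0.1)
e2 = rel_err(modulated(200_000), rate=0.1)
print(e1, e2)   # nearly equal: the sample-wise limit is stable across block lengths
```

Here the stationary mean of the output is the average of the per-phase distributions, and the empirical approximation error settles to a single constant, consistent with a strong p-characterization.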

Estimating $f_{p,\bar{\mu}}(r)$

The purpose of this last section is two-fold. First, we introduce an estimation strategy to approximate the function $f_{p,\bar{\mu}}(\cdot)$ with arbitrary precision for an AMS and ergodic process without the need to obtain its invariant distribution $\bar{\mu}$ in closed form (Theorem 1). Second, we illustrate the trend of the rate vs. p-approximation error pairs in (31) for some simple cases of $\bar{\mu}^1\in\mathcal{P}(\mathbb{R})$ (induced by Gaussian and α-stable distributions [13], [1]). For this analysis, our main focus is the family
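A simple version of such an estimation strategy (our own sketch; the paper's numerical method may differ in its details) exploits the almost sure convergence: the whole rate vs. approximation error curve can be read off a single long realization, with no access to $\bar{\mu}$ itself:

```python
import numpy as np

def f_hat(x, rates, p=2):
    """Empirical rate -> relative p-approximation error curve from one sample path."""
    n = len(x)
    mags = np.sort(np.abs(x))[::-1] ** p    # descending |x_i|^p
    head = np.cumsum(mags)                  # head[k-1] = p-energy of the k largest terms
    total = head[-1]
    errs = []
    for r in rates:
        k = int(np.floor(r * n))
        tail = total - (head[k - 1] if k > 0 else 0.0)
        errs.append((max(tail, 0.0) / total) ** (1.0 / p))
    return np.array(errs)

rng = np.random.default_rng(3)
x = rng.standard_normal(100_000)            # one long realization (i.i.d., hence AMS ergodic)
rates = np.linspace(0.01, 1.0, 20)
errs = f_hat(x, rates)
print(errs[0], errs[-1])   # the error decreases with the rate and reaches 0 at rate 1
```

Sorting once and using a cumulative sum gives the entire curve in O(n log n), which is what makes a fine grid of rates cheap to evaluate on long sample paths.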

Proof of Theorem 1

First, we introduce a number of preliminary results, definitions, and properties that will be essential for elaborating the main argument of the proof of Theorem 1.

Proof of Theorem 3

First, we introduce formally the ergodic decomposition (ED) theorem:

Theorem 5

[12, Th. 10.1] Let $X=(X_n)_{n\geq 1}$ be an AMS process characterized by $(\mathbb{R}^{\mathbb{N}},\mathcal{B}(\mathbb{R}^{\mathbb{N}}),\mu)$. Then there is a measurable space $(\Lambda,\mathcal{L})$ that parametrizes the family of stationary and ergodic distributions, i.e., $\tilde{\mathcal{P}}=\{\mu_\lambda,\lambda\in\Lambda\}$, and a measurable function $\Psi:(\mathbb{R}^{\mathbb{N}},\mathcal{B}(\mathbb{R}^{\mathbb{N}}))\to(\Lambda,\mathcal{L})$ such that:

  • i)

    $\Psi$ is invariant with respect to the shift $T$, i.e., $\Psi(x)=\Psi(T(x))$ for all $x\in\mathbb{R}^{\mathbb{N}}$.

  • ii)

    Using the stationary mean $\bar{\mu}$ of $X$ and its induced probability on $(\Lambda,\mathcal{L})$, denoted by $W_\Psi$, it

Acknowledgement

This material is based on work supported by grants from CONICYT-Chile, FONDECYT 1210315, and the Advanced Center for Electrical and Electronic Engineering, Basal Project FB0008. The author thanks the two anonymous reviewers for providing valuable comments and suggestions that helped improve the technical quality and organization of this paper. The author thanks Professor Martin Adams for providing valuable comments about the organization and presentation of this paper. The author thanks Sebastian

References (19)

  • A. Amini et al.

Compressibility of deterministic and random infinite sequences

    IEEE Trans. Signal Process.

    (2011)
  • R. Gribonval et al.

Compressible distributions for high-dimensional statistics

    IEEE Trans. Inf. Theory

    (2012)
  • J.F. Silva et al.

    On the characterization of p-compressible ergodic sequences

    IEEE Trans. Signal Process.

    (2015)
  • V. Cevher

    Learning with compressible priors

  • A. Amini et al.

Sparsity and infinite divisibility

    IEEE Trans. Inf. Theory

    (2014)
  • M. Unser et al.

A unified formulation of Gaussian versus sparse stochastic processes - Part II: Discrete-domain theory

    IEEE Trans. Inf. Theory

    (2014)
  • M. Unser et al.

    An Introduction to Sparse Stochastic Processes

    (2014)
  • R. Gribonval

    Should penalized least squares regression be interpreted as maximum a posteriori estimation?

    IEEE Trans. Signal Process.

    (2011)
  • A. Amini et al.

    Bayesian estimation for continuous-time sparse stochastic processes

    IEEE Trans. Signal Process.

    (2013)
