Sampling properties of color Independent Component Analysis

doi:10.1016/j.jmva.2020.104692

Journal of Multivariate Analysis

Volume 181, January 2021, 104692

https://doi.org/10.1016/j.jmva.2020.104692

Abstract

Independent Component Analysis (ICA) offers an effective data-driven approach for blind source extraction encountered in many signal and image processing problems. Although many ICA methods have been developed, they have received relatively little attention in the statistics literature, especially in terms of rigorous theoretical investigation for statistical inference. The current paper aims at narrowing this gap and investigates the statistical sampling properties of the colorICA (cICA) method. The cICA incorporates the correlation structure within sources through parametric time series models in the frequency domain and outperforms several existing ICA alternatives numerically. We establish the consistency and asymptotic normality of the cICA estimates, which then enables statistical inference based on the estimates. These asymptotic properties are further validated using simulation studies.

Introduction

Independent component analysis (ICA) offers an effective data-driven approach for blind source extraction encountered in many signal and image processing problems. The ICA problem can be formally described by viewing the observed signal matrix $X$ as a linear combination (mixture) of independent latent random variables (sources) $S$, so that $\underset{M \times T}{X} = \underset{M \times M}{W^{-1}} \, \underset{M \times T}{S}, \qquad (1)$ where the mixture $X$ is observed at $M$ channels for $T$ time points, with columns $X(t)$, $t \in \{0, 1, \dots, T - 1\}$; the matrix $W$ is non-random and known as the unmixing matrix; and each column $S(t)$ of the source matrix $S$ contains $M$ unknown independent sources. The objective of ICA is to estimate the unmixing matrix $W$ in order to recover the hidden sources through $S = W X$.
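The mixing model above can be made concrete with a small simulation. The following is a minimal sketch, not taken from the paper: the AR(1) sources, their coefficients, and the mixing matrix `A` (playing the role of $W^{-1}$) are all illustrative assumptions. It only shows that the sources are recovered exactly when the true unmixing matrix is known; estimating $W$ from $X$ alone is the actual ICA problem.

```python
import numpy as np

rng = np.random.default_rng(0)
M, T = 3, 500  # number of channels and time points (illustrative values)

def ar1(phi, T, rng):
    """Simulate a zero-mean AR(1) series s(t) = phi * s(t-1) + e(t)."""
    e = rng.standard_normal(T)
    s = np.empty(T)
    s[0] = e[0]
    for t in range(1, T):
        s[t] = phi * s[t - 1] + e[t]
    return s

# Rows of S are mutually independent, autocorrelated sources
# (the AR(1) coefficients are hypothetical).
S = np.vstack([ar1(phi, T, rng) for phi in (0.8, -0.5, 0.2)])

# Hypothetical non-orthogonal, invertible mixing matrix A = W^{-1}.
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.1, 0.6, 1.0]])
W = np.linalg.inv(A)   # true unmixing matrix

X = A @ S              # observed mixtures, shape (M, T)
S_rec = W @ X          # sources recovered with the true W

print(np.allclose(S_rec, S))
```

In practice $W$ is unknown, and it can be identified only up to the usual ICA ambiguities (scaling and permutation of the sources).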

There exist many ICA methods, including Infomax [5], [27], maximum likelihood estimation [12], JADE [8], fastICA [20], probabilistic ICA [1], estimating score functions [40], kernel [3], [9], smoothing splines [18], logsplines [26], and log-concave projection [38]. In addition, ICA methods based on two scatter matrices have been developed [23], [33], [35]. For a more extensive literature review, see [13] and [20].

The theoretical properties of the methods mentioned above have been studied. Examples include analyses of fourth-moment-based ICA algorithms such as fastICA and JADE [32], [34], [39], [42], and of semiparametric approaches [10], [11], [16], [22], [24], [37], [38]. These works study $\sqrt{T}$-consistency (and, in some cases, asymptotic normality) of estimators of the unmixing matrix $W$ under smoothness assumptions on the densities of the underlying sources $S$. Recently, [29] proposed an ICA method based on distance covariance and showed its consistency.

The above papers intrinsically assume that each source is independent and identically distributed. However, it may not be desirable to apply these instantaneous ICA methods in the many practical applications where each independent source is known to possess a specific correlation structure. For example, in brain imaging studies, the independent signals are usually auto-correlated. [28] proposed a color ICA (cICA) procedure that makes use of the spectral density functions of the source signals instead of their marginal density functions. The cICA models the sources using parametric time series models, such as autoregressive moving average (ARMA) linear processes, and estimates the model parameters as well as the unmixing matrix $W$ by maximizing the Whittle likelihood [43], which is obtained in the spectral domain. The cICA numerically outperformed other ICA methods for auto-correlated sources.
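To illustrate the Whittle likelihood mentioned above, here is a minimal single-source sketch, assuming an AR(1) model and the spectral density normalization $f(\omega) = \sigma^2 / (2\pi |1 - \phi e^{-i\omega}|^2)$. It is not the cICA algorithm itself, which additionally estimates $W$ across several sources. The function names, the grid search, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def periodogram(x):
    """Periodogram at Fourier frequencies w_k = 2*pi*k/T, k = 1..(T-1)//2."""
    T = len(x)
    d = np.fft.fft(x - x.mean())
    k = np.arange(1, (T - 1) // 2 + 1)
    return 2 * np.pi * k / T, np.abs(d[k]) ** 2 / (2 * np.pi * T)

def ar1_spectrum(w, phi, sigma2):
    """AR(1) spectral density: sigma2 / (2*pi*|1 - phi*exp(-iw)|^2)."""
    return sigma2 / (2 * np.pi * np.abs(1 - phi * np.exp(-1j * w)) ** 2)

def whittle_nll(phi, sigma2, w, I):
    """Negative Whittle log-likelihood: sum_k [log f(w_k) + I(w_k) / f(w_k)]."""
    f = ar1_spectrum(w, phi, sigma2)
    return np.sum(np.log(f) + I / f)

def fit_ar1_whittle(x, grid=None):
    """Grid search over phi; sigma2 is profiled out in closed form."""
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 381)
    w, I = periodogram(x)
    best = None
    for phi in grid:
        g = ar1_spectrum(w, phi, 1.0)   # spectrum shape for unit innovation variance
        sigma2 = np.mean(I / g)         # minimizer of the objective for fixed phi
        nll = whittle_nll(phi, sigma2, w, I)
        if best is None or nll < best[0]:
            best = (nll, phi, sigma2)
    return best[1], best[2]

# Simulated AR(1) data with hypothetical true values phi = 0.6, sigma2 = 1.
rng = np.random.default_rng(1)
T, phi_true = 4096, 0.6
e = rng.standard_normal(T)
x = np.empty(T)
x[0] = e[0]
for t in range(1, T):
    x[t] = phi_true * x[t - 1] + e[t]

phi_hat, sigma2_hat = fit_ar1_whittle(x)
print(phi_hat, sigma2_hat)
```

A grid search is used only to keep the sketch dependency-free; a numerical optimizer would be the more usual choice for models with several parameters.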

The purpose of this paper is to study the theoretical properties of the cICA in order to strengthen its application to statistical inference problems arising in areas such as medical brain imaging studies and signal processing. The main obstacle here is the identifiability of the mixing matrix, which has been examined in the i.i.d. [10], [11] and auto-correlated [6] settings using the equivalence class concept. In particular, a second-order blind identification (SOBI) algorithm based on the sample autocovariance matrices was introduced along with some sampling properties [6], [30]. In the context of auto-correlated sources, there are several differences between cICA and SOBI. First, cICA is formulated in the frequency domain, while SOBI is formulated in the time domain, in which the estimation of the sample autocovariance is not efficient [7]. Second, cICA estimates both the mixing matrix and the color source spectral parameters. In contrast, SOBI uses asymptotic properties of the sample autocovariance matrix to approximate the interference-to-signal ratio (ISR) [6]. Third, cICA exploits the sampling properties of the FFT, leading to Whittle-likelihood-based estimation of the latent parameters, and the associated results are exact under the Gaussian assumption [14]. This assumption is only a leading case in cICA (that is, it serves as motivation only), whereas it plays a critical role in developing the asymptotic properties of SOBI. Lastly, the Whittle likelihood approach allows one to formulate various statistical inferential procedures using standard asymptotic theory.

In this paper, we continue to exploit the ideas of cICA and extend the earlier work of [28] with several aims. The first aim is to establish the sampling properties by proving the consistency and asymptotic normality of the cICA estimates, for both the linear process parameters and the entries of the unmixing matrix $W$ . The second aim is to study the theoretical properties of cICA when combined with prewhitening (or sphering), which is routinely performed prior to ICA to reduce computational intensity. This extension over [28] enables one to deal with arbitrary non-orthogonal mixing matrices, while the earlier work only studies scenarios with orthogonal mixing matrices. Finally, we perform extensive simulation studies to provide empirical finite-sample validation for our theoretical results. The desirable numerical properties presented here further support the practical applicability of cICA.
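The prewhitening (sphering) step mentioned above can be sketched as follows. This is one common construction, using the symmetric inverse square root of the sample covariance, and is an assumption here: the text does not state which whitening transform the authors use. The function name `prewhiten`, the mixing matrix, and the simulated white sources are hypothetical.

```python
import numpy as np

def prewhiten(X):
    """Center X (M x T) and return Z = K @ Xc with identity sample covariance, plus K."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    evals, evecs = np.linalg.eigh(cov)              # cov is symmetric positive definite
    K = evecs @ np.diag(evals ** -0.5) @ evecs.T    # symmetric inverse square root of cov
    return K @ Xc, K

rng = np.random.default_rng(2)
S = rng.standard_normal((3, 2000))       # unit-variance independent sources (illustrative)
A = np.array([[1.0, 0.5, 0.2],           # hypothetical non-orthogonal mixing matrix
              [0.3, 1.0, 0.4],
              [0.1, 0.6, 1.0]])
X = A @ S

Z, K = prewhiten(X)
cov_Z = Z @ Z.T / Z.shape[1]             # identity up to floating-point error

# After whitening, the remaining mixing matrix K @ A is close to orthogonal,
# so the search for the unmixing matrix can be restricted to rotations.
R = K @ A
print(np.round(cov_Z, 6))
print(np.round(R @ R.T, 2))
```

Because the whitening matrix `K` is itself estimated from the data, its sampling variability carries over to the final estimate of $W$, which is why the theory for cICA combined with prewhitening needs separate treatment.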

The remainder of the paper is organized as follows. Section 2 provides a brief background on multivariate spectral density estimation, the Whittle likelihood approach for parameter estimation, and the cICA method. Section 3 presents the main theorems that state the $\sqrt{T}$ -consistency and asymptotic normality of the cICA estimates. Section 4 reports the simulation studies. All the proofs are provided in Section 5. Some theoretical derivations and computational details can be found in the Supplementary document.

Section snippets

Color independent component analysis

In this section, we briefly review the Whittle likelihood and the cICA method, to the extent needed for our theoretical development in Section 3.

Let $X(t)$, $t \in \{0, \dots, T - 1\}$, be an $M$-vector-valued series that is generated according to (1), where the latent source $S(t)$ is an $M$-vector-valued stationary time series with spectral density function $f$. Suppose the components $S_{1}(t), \dots, S_{M}(t)$ of the source series are mutually independent. The objective of cICA is to estimate $W$ and $f$ from the

Main results

In this section, we state our main theoretical results for the cICA with general temporally dependent sources: consistency in Theorem 1 and asymptotic normality in Theorem 2. The properties of the cICA when applied to white noise sources are provided in the Supplementary document.

In the ICA model (1), assume that $S_{j}(t)$, $j \in \{1, \dots, M\}$, are mutually independent linear processes with $E(S_{j}(t)) = 0$ and $E(S_{j}^{2}(t)) < \infty$, given by $S_{j}(t) = \sum_{k = 0}^{\infty} a_{jk}(\beta_{j}) \, \epsilon_{j}(t - k), \quad \epsilon_{j} \sim WN(0, \sigma_{j}^{2}), \quad t \in \{0, \dots, T\}.$
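For illustration only (this example is not from the original text), a causal AR(1) source is the simplest instance of such a linear process. Suppose $S_{j}(t) = \beta_{j} S_{j}(t - 1) + \epsilon_{j}(t)$ with a scalar parameter $|\beta_{j}| < 1$. Repeated substitution gives

$S_{j}(t) = \sum_{k = 0}^{\infty} \beta_{j}^{k} \, \epsilon_{j}(t - k), \quad \text{so that} \quad a_{jk}(\beta_{j}) = \beta_{j}^{k},$

and the moment conditions hold because $E(S_{j}(t)) = 0$ and $E(S_{j}^{2}(t)) = \sigma_{j}^{2} \sum_{k = 0}^{\infty} \beta_{j}^{2k} = \sigma_{j}^{2} / (1 - \beta_{j}^{2}) < \infty$. Under the common $2\pi$ normalization, which is assumed here and may differ from the paper's convention, the corresponding spectral density is $f_{j}(\omega) = \frac{\sigma_{j}^{2}}{2\pi} \left| 1 - \beta_{j} e^{-i\omega} \right|^{-2}$.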

For each $j$ , the coefficients $a_{j k}$ are

Simulation studies

In this section, we perform simulation studies to provide finite-sample empirical evidence for the theoretical properties of the cICA estimates. Both consistency and asymptotic normality, as described in Theorems 1 and 2, are examined. Besides the finite-sample validation of the theoretical properties, the current numerical studies differ from those of [28] in two aspects: (1) we employ nonorthogonal mixing matrices, while Lee et al. only considered orthogonal mixing matrices; (2)

Proofs of main results

The proofs of Theorems 1 and 2 are given in this section. The technical details needed to prove Lemmas 1 to 5 are provided in the Supplementary document.

CRediT authorship contribution statement

Seonjoo Lee: Methodology, Formal analysis, Investigation, Writing - original draft. Haipeng Shen: Methodology, Writing - review & editing. Young Truong: Conceptualization, Methodology, Supervision, Writing - review & editing.

Acknowledgments

Lee’s work is partially supported by NIH, USA grants K01AG051348 and R01AG062578-01A1. Shen’s work is partially supported by the Ministry of Science and Technology Major Project of China 2017YFC1310903, University of Hong Kong Stanley Ho Alumni Challenge Fund, HKU University Research Committee Seed Funding Award 104004215, and BRC Fund, Hong Kong, China.

References (44)

Basiri S. et al., Enhanced bootstrap method for statistical inference in the ICA model, Signal Process. (2017)

Comon P., Independent component analysis, a new concept?, Signal Process. (1994)

Ilmonen P., On asymptotical properties of the scatter matrix based estimates for complex valued independent component analysis, Statist. Probab. Lett. (2013)

Ilmonen P. et al., Characteristics of multivariate distributions and the invariant coordinate system, Statist. Probab. Lett. (2010)

Ollila E. et al., Complex-valued ICA based on a pair of generalized covariance matrices, Comput. Statist. Data Anal. (2008)

Rice J., On the estimation of the parameters of a power spectrum, J. Multivariate Anal. (1979)

Allassonniere S. et al., A stochastic algorithm for probabilistic independent component analysis, Ann. Appl. Stat. (2012)

Amari S. et al., A new learning algorithm for blind signal separation, Adv. Neural Inf. Process. Syst. (1996)

Bach F. et al., Kernel independent component analysis, J. Mach. Learn. Res. (2003)

Bell A. et al., An information-maximization approach to blind separation and blind deconvolution, Neural Comput. (1995)

Belouchrani A. et al., A blind source separation technique using second-order statistics, IEEE Trans. Signal Process. (1997)

Brockwell P.J. et al.

Cardoso J. et al., Blind beamforming for non-Gaussian signals

Chen A., Fast kernel density independent component analysis

Chen A. et al., Consistent independent component analysis and prewhitening, IEEE Trans. Signal Process. (2005)

Chen A. et al., Efficient independent component analysis, Ann. Statist. (2006)

Comon P. et al., Handbook of Blind Source Separation: Independent Component Analysis and Applications (2010)

Dzhaparidze K.

Giraitis L. et al., Whittle estimator for finite-variance non-Gaussian time series with long memory, Ann. Statist. (1999)

Hallin M. et al., R-estimation for asymmetric independent component analysis, J. Amer. Statist. Assoc. (2015)

Hannan E.J., The asymptotic theory of linear time-series models, J. Appl. Probab. (1973)

Hastie T. et al., Independent components analysis through product density estimation

