Article

Smooth kNN Local Linear Estimation of the Conditional Distribution Function

by Ibrahim M. Almanjahie 1,2,*, Zouaoui Chikr Elmezouar 1,2, Ali Laksaci 1,2 and Mustapha Rachdi 3

1 Department of Mathematics, College of Science, King Khalid University, Abha 62529, Saudi Arabia
2 Statistical Research and Studies Support Unit, King Khalid University, Abha 62529, Saudi Arabia
3 AGIM Team, Laboratoire AGEIS, EA 7407, Université Grenoble Alpes (France), UFR SHS, BP. 47, CEDEX 09, F38040 Grenoble, France
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(10), 1102; https://doi.org/10.3390/math9101102
Submission received: 1 April 2021 / Revised: 9 May 2021 / Accepted: 10 May 2021 / Published: 13 May 2021
(This article belongs to the Section Probability and Statistics)

Abstract: Previous works were dedicated to the functional k-Nearest Neighbors (kNN) and local linear method (LLM) estimations of the regression operator. In this paper, a sequence $(X_i, Y_i)_{i=1,\ldots,n}$ of mixing functional observations is considered. We treat the local linear estimation of the cumulative distribution function of $Y_i$ given the functional input variable $X_i$. Precisely, we combine the kNN method with the local linear algorithm to construct a new, fast, and efficient estimator of the conditional distribution function. The main purpose of this paper is to prove the strong convergence of the constructed estimator under mixing conditions. An application to functional time series prediction is used to compare our proposed estimator with existing competitive estimators and to show its efficiency and superiority.

1. Introduction

In the last decade, the local linearity method (LLM) of estimation has become an increasingly popular approach in nonparametric functional data modeling (NPFDM). The motivation for this topic is the superiority of LLM-estimation over the classical kernel weighting method (CKM); in particular, the CKM has a large bias compared to the LLM-estimation (see Fan and Gijbels [1] for the uni-dimensional framework, and Baìllo and Grané [2] for the NPFDM setup). Baìllo and Grané [2] used the LLM-algorithm to estimate the Hilbertian conditional expectation operator. A generalized LLM-estimation of this nonparametric operator was obtained by Barrientos et al. [3], who treated the case of a Banach-valued explanatory variable. Berlinet et al. [4] built an alternative LLM-estimator of the functional conditional expectation by inverting the local variance-covariance matrix of the functional variable. The asymptotic distribution of the LLM-estimator proposed by Barrientos et al. [3] was obtained by Zhou and Lin [5].
Furthermore, the LLM-estimation of the conditional cumulative distribution function (CCDF) was investigated by Laksaci et al. [6], who established the almost complete consistency rate of an LLM-estimator of the CCDF-model when the observations have a spatio-functional structure. All these previous studies utilized the kernel local linearity method; this paper, however, focuses on CCDF-estimation with a new weighting approach obtained by combining the local linear fitting with the kNN method.

1.1. Related Works

Recall that estimation by the kNN method has several advantages over the Nadaraya-Watson algorithm. It is an attractive estimation procedure which is better adapted to the functional nature of the underlying data (see Burba et al. [7] for more motivation for this approach). Notice also that the kNN method, in nonparametric functional statistics, has been studied by many researchers (see, for instance, Laloë [8] and Kudraszow and Vieu [9] for previous works, and Kara et al. [10] for uniform consistency in the number of neighbors). On the other hand, kNN estimation under the local linear approach was recently developed by Chikr-Elmezouar et al. [11]. They constructed an estimator of the conditional density by combining the ideas of the local linear method with the kNN weighting techniques, and they proved the almost complete consistency of the constructed estimator when the observations are independent and identically distributed. In the same independent and identically distributed functional setting, Bachir et al. [12] studied the estimation of the M-regression function. Their estimator was obtained by applying the kNN approach to the Nadaraya-Watson method, and they established the convergence rate of the uniform consistency in the number of neighbors of the constructed estimator. As an alternative model to robust regression, Laksaci et al. [13] treated the kNN estimation of quantile regression and stated the properties of the built estimator under an independence structure. We refer to Rachdi et al. [14] for functional regression when the response variable is observed with data missing at random. More recently, Almanjahie et al. [15] studied the computational aspects of the kNN estimation of some nonparametric functional models, including the conditional density, the regression operator, and the conditional cumulative distribution function. They examined the feasibility of some selection algorithms for choosing the best bandwidth parameter in nonparametric functional data analysis.

1.2. Contribution

While the previous works were dedicated to the functional kNN-estimation of the regression operator using the CKM-method, we consider, in this contribution, the kNN estimation problem of the CCDF using LLM-smoothing. Precisely, we benefit from the attractive features of both the kNN weighting and the LLM-fitting by combining the two algorithms to provide a fast and efficient estimator of the CCDF. On the one hand, it is well known that the main reason behind the implementation of the kNN method is its ability to select an attractive smoothing parameter. Specifically, the kNN method permits the selection of a bandwidth parameter adapted to the local structure of the data. Moreover, this estimator can be updated for any new observations. Such a consideration is essential in functional statistics, where the asymptotic properties depend strongly on the behavior of the local structure. For the latter reason, the kNN method is better than the classical kernel method. However, the difficulty in kNN smoothing is the fact that the bandwidth parameter is a random variable, unlike in the kernel method, where the smoothing parameter is a deterministic scalar. So, the study of the asymptotic properties of this estimator is complicated, and it requires some additional tools and techniques.
On the other hand, it is well documented that the local linear approach is an alternative to the usual Nadaraya-Watson technique. As discussed in the first paragraph, estimation by the local linear method improves the asymptotic properties of the kernel estimator by reducing the bias term. Thus, with this combined approach, we construct an estimator of the CCDF and state its consistency by establishing the almost complete convergence rate. The second novelty of this contribution is establishing the asymptotic results of the estimator when the observations are correlated, as mixing functional time series.

1.3. Organization

This paper is structured as follows. Our methodology, describing the kNN-LLM estimator as well as the functional time series framework, is presented in Section 2. The main asymptotic results, with their proofs, are presented in Section 3. Section 4 is devoted to comments revealing the merits of the proposed approach. The performance of the constructed estimator in temperature prediction, compared to existing estimators on real data, is examined in Section 5. Our conclusion is presented in Section 6.

2. Methodology

2.1. CCDF-Model and Its kNN-LLM Estimator

Consider $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ a stationary sequence of copies of a random vector $(X, Y)$ valued in $\mathcal{F} \times \mathbb{R}$, where $\mathcal{F}$ is a separable metric space equipped with a metric $d$. Let $N_x$ be a neighborhood of a fixed curve $x \in \mathcal{F}$, for which we suppose that the conditional cumulative distribution function (CCDF) $F(\cdot \mid x)$ has a continuous conditional density $f(\cdot \mid x)$. Usually, the LLM-estimator of the CCDF is built by treating the function $F(\cdot \mid x)$ as a conditional expectation, i.e.,
$$\mathbb{E}\big[ H\big( \ell^{-1}(y - Y_i) \big) \,\big|\, X_i = x \big] \longrightarrow F(y \mid x) \quad \text{as } \ell \to 0,$$
where $H$ is a cumulative distribution function and $\ell = \ell_n$ is a positive real sequence. In fact, this idea was first proposed by Fan and Gijbels [1] in the nonfunctional setup. In our functional context, we consider an alternative estimator to that proposed by Almanjahie et al. [16]. It is obtained by approximating $F(y \mid x)$ locally in $N_x$ by
$$\forall x_0 \in N_x, \quad F(y \mid x_0) = a_y^x + b_y^x\, d(x_0, x) + o\big( d(x_0, x) \big). \tag{1}$$
So, the kNN-LLM estimator is constructed by estimating the operators $a_y^x$ and $b_y^x$ of formula (1) via
$$\min_{(a, b) \in \mathbb{R}^2} \sum_{i=1}^{n} \Big( H\big( \ell^{-1}(y - Y_i) \big) - a - b\, d(X_i, x) \Big)^2\, \mathrm{Ker}\big( \mathbb{H}_k^{-1}\, d(x, X_i) \big),$$
where $\mathrm{Ker}(\cdot)$ is a kernel function, $\ell = \min\big\{ \ell \in \mathbb{R}_+ \text{ such that } \sum_{i=1}^{n} \mathbb{1}_{(y - \ell,\, y + \ell)}(Y_i) = l \big\}$, and $\mathbb{H}_k = \min\big\{ h \in \mathbb{R}_+ \text{ such that } \sum_{i=1}^{n} \mathbb{1}_{B(x, h)}(X_i) = k \big\}$. Then, we prove later that the smooth kNN-LLM estimator of the CCDF $F(y \mid x)$ can be written explicitly as
$$\widetilde{F}(y \mid x) = \frac{\sum_{i, j = 1}^{n} \beta_{ij}\, H\big( \ell^{-1}(y - Y_i) \big)}{\sum_{i, j = 1}^{n} \beta_{ij}},$$
where
$$\beta_{ij} = d(X_i, x)\, \big( d(X_i, x) - d(X_j, x) \big)\, \mathrm{Ker}\big( \mathbb{H}_k^{-1}\, d(x, X_i) \big)\, \mathrm{Ker}\big( \mathbb{H}_k^{-1}\, d(x, X_j) \big).$$
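To fix ideas, the following minimal Python sketch (our own illustration, not the authors' code) computes $\widetilde{F}(y \mid x)$ with the weights $\beta_{ij}$ above. It assumes curves discretized on a common grid, uses the L2 metric for d, a quadratic kernel for Ker, and the integrated quadratic kernel as the smooth CDF H; all function names are hypothetical.

```python
import numpy as np

def quadratic_kernel(u):
    """Quadratic kernel Ker supported on [0, 1]."""
    return np.where((u >= 0) & (u <= 1), 1.0 - u**2, 0.0)

def smooth_cdf(t):
    """A smooth CDF H: the integrated quadratic kernel on [-1, 1]."""
    t = np.clip(t, -1.0, 1.0)
    return 0.5 + 0.75 * (t - t**3 / 3.0)

def knn_llm_cdf(x0, y0, X, Y, k, l):
    """Smooth kNN-LLM estimate of F(y0 | x0).

    X : (n, p) array of curves discretized on a common grid,
    Y : (n,) responses, k : neighbours in the covariate space,
    l : neighbours in the response direction.
    """
    d = np.linalg.norm(X - x0, axis=1)        # d(X_i, x0): L2 metric on the grid
    Hk = np.sort(d)[k - 1]                    # kNN bandwidth H_k (assumed > 0)
    ell = np.sort(np.abs(Y - y0))[l - 1]      # kNN bandwidth in the y-direction
    Ki = quadratic_kernel(d / Hk)             # Ker(H_k^{-1} d(x0, X_i))
    Hi = smooth_cdf((y0 - Y) / ell)           # H(l^{-1}(y0 - Y_i))
    # beta_ij = d_i (d_i - d_j) Ker_i Ker_j, vectorized over the double sum
    beta = d[:, None] * (d[:, None] - d[None, :]) * Ki[:, None] * Ki[None, :]
    return np.sum(beta * Hi[:, None]) / np.sum(beta)

# Toy usage on synthetic data: 200 curves observed on a 12-point grid.
rng = np.random.default_rng(0)
X = np.cumsum(rng.normal(size=(200, 12)), axis=1)
Y = X[:, -1] + rng.normal(scale=0.5, size=200)
print(knn_llm_cdf(X[0], np.median(Y), X, Y, k=25, l=25))
```

Note that, through the factor $d(X_i, x)$, a curve coinciding with x receives zero weight in the i-index, and the denominator $\sum_{i,j} \beta_{ij} = (\sum_i d_i^2 K_i)(\sum_j K_j) - (\sum_i d_i K_i)^2$ is nonnegative by the Cauchy-Schwarz inequality.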

2.2. Functional Time Series Framework

We study the asymptotic behavior of the kNN-LLM estimator of the CCDF $F(\cdot \mid x)$ when the data are observed as a functional time series (FTS). It should be noted that functional time series analysis has been widely developed within functional statistics; see Ferraty and Vieu [17] for an in-depth discussion. In this paper, we carry out our functional time series framework by assuming that the sequence $(Z_i = (X_i, Y_i))_i$ is algebraically $\alpha$-mixing, with mixing coefficients $\alpha(n) \to 0$ such that
$$\exists\, a > 2 \ \text{such that} \ \sum_{n} n^{a}\, \alpha(n) < \infty. \tag{2}$$
As with all asymptotic results in nonparametric functional statistics, we need to control the local concentration of both the marginal and joint distributions of the functional observations. Indeed, for the marginal distribution, we assume that
$$\phi_x(h) := \mathbb{P}\big( X \in B(x, h) \big) > 0 \quad \text{for any } h > 0, \tag{3}$$
such that
$$\exists\, 0 < c < 1 < c' < \infty \ \text{such that} \ \lim_{r \to 0} \frac{\phi_x(r c)}{\phi_x(r)} < 1 < \lim_{r \to 0} \frac{\phi_x(r c')}{\phi_x(r)}. \tag{4}$$
For the joint distribution, we assume that
$$\Psi_x(r) := \sup_{i \ne j}\, \mathbb{P}\big( (X_i, X_j) \in B(x, r) \times B(x, r) \big) > 0 \quad \text{for any } r > 0, \tag{5}$$
where $B(x, h) := \{ z \in \mathcal{F} \text{ such that } d(z, x) < h \}$ refers to the ball with center $x$ and radius $h$.
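As a concrete illustration of these quantities (ours, not from the paper, under the same L2-metric assumption as the sketch in Section 2.1), the small-ball probability $\phi_x(h)$ can be approximated empirically by the fraction of observed curves falling in $B(x, h)$, and the kNN bandwidth $\mathbb{H}_k$ is then simply the empirical inverse of $\phi_x$ at level $k/n$:

```python
import numpy as np

def empirical_small_ball(x0, X, h):
    """Empirical estimate of phi_x(h) = P(X in B(x0, h)) for discretized
    curves X of shape (n, p), under the L2 metric on the grid."""
    d = np.linalg.norm(X - x0, axis=1)
    return float(np.mean(d < h))

def knn_bandwidth(x0, X, k):
    """kNN bandwidth: distance from x0 to its k-th nearest curve, so that
    empirical_small_ball(x0, X, knn_bandwidth(x0, X, k)) is about k/n."""
    d = np.linalg.norm(X - x0, axis=1)
    return float(np.sort(d)[k - 1])
```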
The challenging aim of this paper is to establish the convergence rate of the kNN-LLM estimator without an independence assumption. Of course, this general setting requires different tools from those used in the independent situation. In the rest of this section, we give the necessary background to handle this situation.
Lemma 1
(see Ferraty and Vieu [17]). Let $(Z_i)_{i \in \mathbb{N}}$ be an algebraically $\alpha$-mixing process which is identically distributed.
1. If there exist $p > 2$ and $M > 0$ such that, for all $t > M$, $\mathbb{P}(|Z_1| > t) \le t^{-p}$, then, for all $r \ge 1$ and $\epsilon > 0$, we have
$$\mathbb{P}\left( \left| \sum_{i=1}^{n} Z_i \right| > \epsilon \right) \le C \left( \left( 1 + \frac{\epsilon^2}{r\, s_n^2} \right)^{-r/2} + n\, r^{-1} \left( \frac{r}{\epsilon} \right)^{(a+1)p/(a+p)} \right).$$
2. If there exists $M < \infty$ such that $|Z_1| \le M$, then, for all $r \ge 1$ and $\epsilon > 0$, there exists $C < \infty$ such that
$$\mathbb{P}\left( \left| \sum_{i=1}^{n} Z_i \right| > \epsilon \right) \le C \left( \left( 1 + \frac{\epsilon^2}{r\, s_n^2} \right)^{-r/2} + n\, r^{-1} \left( \frac{r}{\epsilon} \right)^{a+1} \right),$$
where $s_n^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} |\mathrm{cov}(Z_i, Z_j)|$.

3. Results: The Asymptotic Properties of the kNN-LLM Estimator of CCDF

Now, we prove the almost complete (a.co.) convergence of the estimator $\widetilde{F}(y \mid x)$ toward $F(y \mid x)$. Firstly, let us point out that condition (4) ensures the existence of $(\alpha, \beta) \in (0, 1)^2$ satisfying
$$\phi_x^{-1}\!\left( \frac{k}{\alpha n} \right) \le C\, \phi_x^{-1}\!\left( \frac{\beta k}{n} \right).$$
So, in the remainder of this paper, we put $\mathbb{H}_{k_r} = \phi_x^{-1}(k / (\alpha n))$, $\mathbb{H}_{k_l} = \phi_x^{-1}(\beta k / n)$, $\ell_r = l / (\alpha n)$, and $\ell_l = \alpha l / n$. Next, to establish the convergence rate of our estimator, we set the following conditions.
The kernel Ker is a differentiable function supported on $[0, 1]$; its first derivative $\mathrm{Ker}'$ exists and there exist constants $C$ and $C'$ such that
$$-\infty < C < \mathrm{Ker}'(t) < C' < 0 \quad \text{for } 0 \le t \le 1, \tag{6}$$
and, for $h = \mathbb{H}_{k_l}$ or $\mathbb{H}_{k_r}$,
$$\exists\, n_0, \ \exists\, C > 0 \ \text{such that} \ \forall n > n_0, \quad \frac{1}{\phi_x(h)} \int_{-1}^{1} \phi_x(z h)\, \frac{\partial}{\partial z} \big( z^2\, \mathrm{Ker}(z) \big)\, dz > C > 0.$$
The conditional distribution function satisfies, for all $(x_1, x_2) \in N_x \times N_x$ and $(y_1, y_2) \in \mathbb{R}^2$,
$$\big| F(y_1 \mid x_1) - F(y_2 \mid x_2) \big| \le C \big( d(x_1, x_2)^{b_1} + |y_1 - y_2|^{b_2} \big), \quad b_1 > 0, \ b_2 > 0. \tag{7}$$
The sequence
$$\tau_n = \max_{\substack{j = 0, 1; \\ h = \mathbb{H}_{k_l} \text{ or } \mathbb{H}_{k_r}}} \mathrm{Var}\left( \frac{1}{n\, \phi_x(h)} \sum_{i=1}^{n} d^{\,j}(x, X_i)\, \mathrm{Ker}\big( h^{-1} d(x, X_i) \big) \right)$$
satisfies
$$\sum_{n} n^{-a}\, \tau_n^{-(a+1)/2}\, (\log n)^{(a-3)/2} < \infty. \tag{8}$$
Remark 1.
Conditions (6)–(8) are quite weak in time series analysis. Indeed, we point out that most of these conditions are similar to those used in previous works on functional time series. On the other hand, conditions similar to (6) and (8) can also be found in Ferraty and Vieu [17]. A deeper discussion of the generality of our framework is given in the next section.
Theorem 1.
Using the approximation condition in (1) and under conditions (6)–(8), if the function H is increasing, its first derivative H′ exists, and
$$\int |t|^2\, H'(t)\, dt < \infty,$$
then
$$\big| \widetilde{F}(y \mid x) - F(y \mid x) \big| = O\left( \left( \frac{l}{n} \right)^{b_2} \right) + O\left( \phi_x^{-1}\!\left( \frac{k}{n} \right)^{b_1} \right) + O_{a.co.}\left( \sqrt{\tau_n \log n} \right).$$
Proof of Theorem 1. 
The details of the proof are given in the Supplementary File; it is based on the following lemmas. □
Lemma 2.
Under the same conditions as Theorem 1, we have
$$\big| S_{j,n} - \mathbb{E}\, S_{j,n} \big| = O_{a.co.}\left( \sqrt{\tau_n \log n} \right),$$
where
$$S_{j,n} = \frac{1}{n\, h^{j}\, \phi_x(h)} \sum_{i=1}^{n} d^{\,j}(x, X_i)\, \mathrm{Ker}\big( h^{-1} d(x, X_i) \big); \quad h = \mathbb{H}_{k_l} \text{ or } \mathbb{H}_{k_r}, \ \text{and} \ j = 1, 2.$$
Lemma 3.
Under the same conditions as Theorem 1, we have
$$\big| e_{j,n} - \mathbb{E}\, e_{j,n} \big| = O_{a.co.}\left( \sqrt{\tau_n \log n} \right),$$
where
$$e_{j,n} = \frac{1}{n\, h^{j}\, \phi_x(h)} \sum_{i=1}^{n} d^{\,j}(x, X_i)\, \mathrm{Ker}\big( h^{-1} d(x, X_i) \big)\, H\big( \ell^{-1}(y - Y_i) \big); \quad \ell = \ell_l \text{ or } \ell_r, \ \text{and} \ j = 0, 1.$$
Lemma 4
(see Laksaci et al. [6]). Under the conditions (1), (3), and (7), we have
$$\mathbb{E}\, e_{j,n} = O\left( \left( \frac{l}{n} \right)^{b_2} \right) + O\left( \phi_x^{-1}\!\left( \frac{k}{n} \right)^{b_1} \right).$$
In the following, we give the proofs of the above intermediate results. When no ambiguity is possible, $C$ and $C'$ denote strictly positive generic constants, and we set $K_i(h, x) = \mathrm{Ker}\big( h^{-1} d(X_i, x) \big)$ and $H_i(\ell, y) = H\big( \ell^{-1}(y - Y_i) \big)$.
Proof of Lemma 2. 
The proof is based on the Fuk-Nagaev exponential inequality (see Lemma 1), applied to
$$Z_{ij} = \frac{1}{n\, h^{j}\, \phi_x(h)} \Big( d^{\,j}(x, X_i)\, \mathrm{Ker}\big( h^{-1} d(x, X_i) \big) - \mathbb{E}\big[ d^{\,j}(x, X_i)\, \mathrm{Ker}\big( h^{-1} d(x, X_i) \big) \big] \Big).$$
Since
$$S_{j,n} - \mathbb{E}\, S_{j,n} = \sum_{i=1}^{n} Z_{ij},$$
it follows that, for all $\varepsilon, r > 0$,
$$\mathbb{P}\left( \big| S_{j,n} - \mathbb{E}\, S_{j,n} \big| > \varepsilon \sqrt{\tau_n \log n} \right) \le C \big( A_1(x) + A_2(x) \big),$$
where
$$A_1(x) = \left( 1 + \frac{\varepsilon^2\, \tau_n \log n}{r\, s_n^2} \right)^{-r/2}, \quad s_n^2 = \mathrm{Var}\big( S_{j,n} \big), \quad \text{and} \quad A_2(x) = n\, r^{-1} \left( \frac{r}{\varepsilon \sqrt{\tau_n \log n}} \right)^{a+1}.$$
Set $r = C (\log n)^2$ to conclude that
$$A_2(x) \le C\, n^{-a}\, \tau_n^{-(a+1)/2}\, (\log n)^{(a-3)/2},$$
and use (9) to obtain
$$A_1(x) \le C \left( 1 + \frac{\varepsilon^2 \log n}{(\log n)^2} \right)^{-(\log n)^2 / 2} \le C'\, e^{-(\varepsilon^2 \log n)/2}.$$
For a suitable choice of $\varepsilon$ and by (8), we get
$$S_{j,n} - \mathbb{E}\, S_{j,n} = O_{a.co.}\left( \sqrt{\tau_n \log n} \right). \qquad \square$$
Proof of Lemma 3. 
Once again, as in the proof of Lemma 2, we apply the Fuk-Nagaev exponential inequality, this time to the random variables
$$T_{ij} = \frac{1}{n\, h^{j}\, \phi_x(h)} \Big( d^{\,j}(x, X_i)\, \mathrm{Ker}\big( h^{-1} d(x, X_i) \big)\, H\big( \ell^{-1}(y - Y_i) \big) - \mathbb{E}\big[ d^{\,j}(x, X_i)\, \mathrm{Ker}\big( h^{-1} d(x, X_i) \big)\, H\big( \ell^{-1}(y - Y_i) \big) \big] \Big).$$
Since
$$e_{j,n} - \mathbb{E}\, e_{j,n} = \sum_{i=1}^{n} T_{ij},$$
we get
$$\sum_{n} \mathbb{P}\left( \big| e_{j,n} - \mathbb{E}\, e_{j,n} \big| > \varepsilon \sqrt{\tau_n \log n} \right) < \infty,$$
which allows us to write
$$e_{j,n} - \mathbb{E}\, e_{j,n} = O_{a.co.}\left( \sqrt{\tau_n \log n} \right).$$
This completes the proof of the lemma. □

4. Discussions and Comments

4.1. The kNN Method in Functional Statistics

Motivated by its flexibility in practice, the kNN-method is becoming a popular tool in nonparametric data analysis. It was introduced in functional statistics by Burba et al. [7]. The implementation of this approach in functional data analysis is promising. Indeed, as with all nonparametric smoothing approaches, the kNN method has some drawbacks in multivariate analysis, such as its high sensitivity to the feature vector, slow execution time when the data volume is large, and excessive use of memory. It is well known that all these drawbacks are due to the curse of dimensionality in vectorial statistics. This problem is handled here by using the small-ball probability function $\phi_x$ to evaluate the asymptotic properties of the estimator. Indeed, as discussed in Ferraty and Vieu [17], the small-ball probability function $\phi_x(h) = \mathbb{P}(X \in B(x, h))$ quantifies the concentration of the probability measure of the functional variable, and it plays an inherent role in functional data analysis, in the sense that less concentration of the probability measure of the functional variable implies a slower convergence rate for the estimator. Then, the best way to resolve the above drawbacks is to increase the concentration of the functional variable in a neighborhood of the location point x. To do this, we use the fact that the small-ball probability function depends crucially on the metric d(·,·). Hence, from a statistical point of view, we can increase the concentration property by choosing the best metric d. Thus, we can say that the implementation of the kNN method in functional data analysis is very beneficial in practice, and its multivariate drawbacks can be overcome by using an appropriate topological structure.
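To make the last point concrete, a popular choice is a projection-based (PCA) semi-metric in the spirit of Ferraty and Vieu [17], which compares curves through their first q empirical principal component scores. The sketch below is our own illustration (with a hypothetical class name), not code from the paper:

```python
import numpy as np

class PCASemiMetric:
    """Semi-metric d_q(u, v) = ||P_q(u - v)||, where P_q projects discretized
    curves onto the first q eigenvectors of their empirical covariance."""

    def __init__(self, X, q=3):
        Xc = X - X.mean(axis=0)              # center the sample of curves
        cov = Xc.T @ Xc / len(X)             # empirical covariance matrix (p, p)
        _, eigvec = np.linalg.eigh(cov)      # eigenvectors in ascending order
        self.basis = eigvec[:, -q:]          # keep the top-q directions

    def __call__(self, u, v):
        proj = (u - v) @ self.basis          # scores of u - v on the basis
        return float(np.sqrt(np.sum(proj**2)))
```

Increasing q makes $d_q$ closer to the full L2 metric, while a small q concentrates the sample in smaller balls and typically increases $\phi_x(h)$.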

4.2. On the Impact of This Contribution

It is well known that the conditional distribution function plays a pivotal role in nonparametric statistical modeling. Indeed, the nonparametric estimation of this model is an imperative step for various nonparametric models, such as the conditional density, the conditional hazard, and the conditional quantile functions. In the prediction setting, the conditional cumulative distribution function allows one to construct many predictive intervals or, more generally, predictive regions. We mention, for instance, the conditional percentile interval, the shortest conditional modal interval (SCMI), and the maximum conditional density region (MCDR) (see Yao and Tong [18] for their definitions). Of course, the diversity of the applications of the conditional distribution function highlights the importance of this conditional model, which has the advantage of completely characterizing the conditional law of the considered random variables. As mentioned in the bibliographical discussion of the Introduction, this model has been widely studied in nonparametric functional statistics. However, the novelty of the present contribution is mainly the estimation of this model by combining two important approaches: the local linear method and the k-Nearest Neighbors procedure. This combination allows the construction of a new, attractive estimator that inherits the advantages of both methods. Indeed, it is well known that the local linear method improves the bias property of the kernel method, while the weighting by the kNN algorithm offers a sophisticated procedure for smoothing parameter selection: the bandwidth is selected locally, with respect to the vicinity of the conditioning point, which permits the construction of an estimator adapted to the local structure of the data. Such a consideration is very important in nonparametric functional data analysis, where the performance of estimators is closely linked to the local structure of the data through the concentration properties of the probability measure (see Ferraty and Vieu [17]). Nevertheless, establishing the asymptotic properties of this estimator is more difficult than in the classical case studied by Laksaci et al. [6] because, here, the bandwidth parameter is a random variable, unlike in the standard case where the bandwidth parameter is a scalar. In the dependent case, which is a more general and more realistic situation, this difficulty is compounded. We can say that the principal axes of this contribution are (1) the conditional distribution function as a pivotal model for various nonparametric conditional models, (2) the estimation method as a new procedure even in the nonfunctional case (as far as we know, there is no work on CCDF estimation combining the LLM with the kNN approach), and (3) the functional time series case as a generalization of the independent case. To emphasize the usefulness of the present contribution for the prediction issue, we discuss in the following section how we can predict future real characteristics of a continuous-time process given its past.

4.3. Some Particular Cases

One of the main features of the present work is the treatment of the kNN-local linear estimation in the dependent case, which regroups several usual situations. To highlight the importance and the generality of the present contribution, we now particularize our study to these usual situations: the independent case, the strong local dependency case, and the local constant method. Let us note that, for the sake of brevity, the detailed proofs of the corollaries are given in the Supplementary File.
  • The independent case: The independent case was widely studied in the past for some related models. However, it can be treated as a particular case of this contribution; it corresponds to the case $\alpha(n) = 0$. In this situation, condition (2) is automatically satisfied, and Theorem 1 leads straightforwardly to the following corollary.
    Corollary 1.
    Under conditions (6) and (7), if the function H is increasing, its first derivative H′ exists, and
    $$\int |t|^2\, H'(t)\, dt < \infty,$$
    then
    $$\big| \widetilde{F}(y \mid x) - F(y \mid x) \big| = O\left( \left( \frac{l}{n} \right)^{b_2} \right) + O\left( \phi_x^{-1}\!\left( \frac{k}{n} \right)^{b_1} \right) + O_{a.co.}\left( \sqrt{\frac{\log n}{k}} \right).$$
    We point out that this result is also new for the kNN-LLM estimator of the CCDF in the i.i.d. case.
  • The strong local dependency case: The second particular case is when the local dependency, measured by $\Psi_x(r)$, is of order
    $$\Psi_x(r) \le C\, \phi_x(r)^{(a+1)/a}.$$
    Then, in this situation, Theorem 1 is reformulated as follows.
    Corollary 2.
    Under conditions (6) and (7), if the function H is increasing, its first derivative H′ exists, and
    $$\int |t|^2\, H'(t)\, dt < \infty,$$
    then
    $$\big| \widetilde{F}(y \mid x) - F(y \mid x) \big| = O\left( \left( \frac{l}{n} \right)^{b_2} \right) + O\left( \phi_x^{-1}\!\left( \frac{k}{n} \right)^{b_1} \right) + O_{a.co.}\left( \sqrt{\frac{\log n}{k}} \right).$$
    Obviously, the convergence rate in this particular case is faster than the general rate given in Theorem 1.
  • The local constant method: It is well known that the Nadaraya-Watson estimator can be viewed as a particular case of the local linear approach; it is obtained by taking b = 0. This case is the so-called local constant approach, and its kNN estimator is defined by
    $$F_n(y \mid x) = \frac{\sum_{i=1}^{n} \mathrm{Ker}\big( \mathbb{H}_k^{-1} d(x, X_i) \big)\, H\big( \ell^{-1}(y - Y_i) \big)}{\sum_{i=1}^{n} \mathrm{Ker}\big( \mathbb{H}_k^{-1} d(x, X_i) \big)}.$$
    This estimator was studied by Kara et al. [10], who established its asymptotic properties when the observations are independent and identically distributed, whereas here we develop the dependent case. Once again, the consistency of the kNN-LCM estimator is new in this context of nonparametric functional data analysis. It is given in the following corollary (a minimal implementation sketch is given after the corollary).
    Corollary 3.
    Using the approximation condition in (1) and under conditions (6)–(8), if the function H is increasing, its first derivative H′ exists, and
    $$\int |t|^2\, H'(t)\, dt < \infty,$$
    then
    $$\big| F_n(y \mid x) - F(y \mid x) \big| = O\left( \left( \frac{l}{n} \right)^{b_2} \right) + O\left( \phi_x^{-1}\!\left( \frac{k}{n} \right)^{b_1} \right) + O_{a.co.}\left( \sqrt{\tau_n \log n} \right).$$
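As announced above, here is a minimal sketch of the kNN-LCM estimator $F_n(y \mid x)$ (our own illustration, reusing the hypothetical quadratic_kernel and smooth_cdf helpers from the sketch in Section 2.1):

```python
import numpy as np

def knn_lcm_cdf(x0, y0, X, Y, k, l):
    """kNN local constant (Nadaraya-Watson) estimate of F(y0 | x0),
    i.e., the local linear fit with b forced to 0."""
    d = np.linalg.norm(X - x0, axis=1)
    Hk = np.sort(d)[k - 1]                   # kNN bandwidth H_k
    ell = np.sort(np.abs(Y - y0))[l - 1]     # kNN bandwidth in the y-direction
    Ki = quadratic_kernel(d / Hk)            # Ker(H_k^{-1} d(x0, X_i))
    Hi = smooth_cdf((y0 - Y) / ell)          # H(l^{-1}(y0 - Y_i))
    return np.sum(Ki * Hi) / np.sum(Ki)
```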

5. Real Data Applications

5.1. Application to Functional Time Series Prediction

One of the main features of the CCDF is the possibility of constructing several predictive regions $I_\zeta$. Of course, the efficiency of each prediction interval is assessed by means of the length of the set $I_\zeta$ and the presence of the true value in $I_\zeta$. It is well documented that the width of the SCMI is the smallest among all predictive regions with the same coverage probability. It was introduced by Yao and Tong [18] and is obtained as
$$[A_{1-\zeta},\, B_{1-\zeta}] = \arg\min_{(c, d)} \Big\{ \mathrm{Leb}\big( [c, d] \big) \ \text{such that} \ F(d \mid x) - F(c \mid x) \ge 1 - \zeta \Big\},$$
where $\mathrm{Leb}(\cdot)$ refers to the Lebesgue measure. Using the CCDF estimator, we approximate the SCMI by
$$[A_{1-\zeta}(X_n),\, B_{1-\zeta}(X_n)] = \arg\min_{(c, d)} \Big\{ \mathrm{Leb}\big( [c, d] \big) \ \text{such that} \ \widetilde{F}(d \mid X_n) - \widetilde{F}(c \mid X_n) \ge 1 - \zeta \Big\}.$$
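In practice, the SCMI can be approximated by a grid search: evaluate the estimated conditional CDF on a fine grid of candidate endpoints and keep the shortest interval [c, d] whose estimated coverage reaches 1 − ζ. A minimal sketch (ours; it assumes the estimated CDF is nondecreasing along the grid, which can be enforced with np.maximum.accumulate):

```python
import numpy as np

def scmi(cdf, grid, zeta=0.1):
    """Shortest conditional modal interval at level 1 - zeta.

    cdf  : callable y -> estimated F(y | x), nondecreasing in y,
    grid : increasing 1-D array of candidate interval endpoints.
    """
    F = np.array([cdf(y) for y in grid])
    best, best_len = (grid[0], grid[-1]), grid[-1] - grid[0]
    for i in range(len(grid)):
        # first endpoint j with F[j] - F[i] >= 1 - zeta, if any
        j = np.searchsorted(F, F[i] + 1.0 - zeta, side="left")
        if j < len(grid) and grid[j] - grid[i] < best_len:
            best, best_len = (grid[i], grid[j]), grid[j] - grid[i]
    return best
```

For the predictions below, cdf would be $y \mapsto \widetilde{F}(y \mid X_n)$ for the fixed conditioning curve $X_n$.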

5.2. Example 1: Application to Climatological Time Series Data

In this first example, we show the applicability of the proposed estimator to climatological data. Indeed, we predict the monthly average temperature one year ahead at the Debrecen station in Hungary. The link to the data is provided in the "Data Availability Statement" section. Let us note that the studied data were recently collected and contain only a few missing values, which were replaced by the average of the neighboring values before the analysis. Now, from these observed data, we construct $n + 1 = 100$ curves $(X_i(t))$, $i = 1, \ldots, n + 1$, where $X_i$ denotes the average temperature curve observed during the 12 months of the i-th year. The observed data are plotted in Figure 1, representing the values of the monthly average temperature.
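For reproducibility, cutting the monthly series into yearly curves, together with the one-month-ahead responses $Y_i^j = X_{i+1}(j)$ used in the prediction step below, can be sketched as follows (our own helper; temps stands for the raw monthly temperature series):

```python
import numpy as np

def build_functional_sample(temps, months_per_curve=12):
    """Cut a univariate monthly series into yearly curves X_i and
    responses Y_i^j = X_{i+1}(j) for each fixed month j."""
    n_years = len(temps) // months_per_curve
    X = np.asarray(temps[:n_years * months_per_curve]).reshape(n_years, -1)
    return X[:-1], X[1:]    # pairs (X_i, Y_i^.): Y_i^j is month j of year i + 1
```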
In Figure 2, we plot the curves $(X_i)_i$, which represent the yearly temperature curves.
The efficiency of the SCMI predictive interval is linked to the choice of the parameters in the estimator $\widetilde{F}$. For this computational purpose, we compute $\widetilde{F}$ with the quadratic kernel Ker, and the metric d is determined according to the PCA-algorithm. In this application, we compare the predictive interval (SCMI with $\zeta = 0.1$) using the kNN-LLM estimator against the CKM-estimator studied by Laksaci et al. [6]. In both estimators, we choose k and h by the same cross-validation method as De Gooijer and Gannoun [19], which is based on the following criterion:
$$CV = \frac{1}{n} \sum_{j=1}^{n} \sum_{i=1}^{n} \Big( \mathbb{1}_{\{Y_i \le Y_j\}} - \widehat{F}^{-j}(Y_i \mid X_j) \Big)^2,$$
where $\widehat{F}^{-j}$ denotes the estimator computed without the j-th observation. This criterion is optimized over the same subsets of k and h proposed by Rachdi and Vieu [17].
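A direct leave-one-out implementation of this criterion could read as follows (our own sketch; estimate_cdf stands for any estimator with the signature of the hypothetical knn_llm_cdf above, and the O(n²) double loop is kept for readability):

```python
import numpy as np

def cv_score(estimate_cdf, X, Y, k, l):
    """Cross-validation criterion: compares the CDF estimate, fitted
    without observation j, with the indicator 1{Y_i <= Y_j}."""
    n = len(Y)
    total = 0.0
    for j in range(n):
        mask = np.arange(n) != j                     # leave the j-th pair out
        for i in range(n):
            F_hat = estimate_cdf(X[j], Y[i], X[mask], Y[mask], k, l)
            total += (float(Y[i] <= Y[j]) - F_hat) ** 2
    return total / n

# k (and l) are then chosen by minimizing cv_score over a grid of candidates.
```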
Now, we compute the SCMI predictive interval for the whole curve of the last year (i = 100) of this sample, given the functional covariate $X_{99}$, by computing $\widetilde{F}(\cdot \mid X_{99}^{*})$, where $X_{99}^{*}$ is the nearest curve to $X_{99}$ in the learning sample $(Y_i^j, X_i)_{i=1}^{99}$, with $Y_i^j = X_{i+1}(j)$ for each fixed month $j = 1, \ldots, 12$. Figure 3 displays the results: the observed data are plotted as the dashed curve, and the solid curves represent the estimated extremities of the SCMI predictive interval. It is clear that the kNN-LLM estimator is significantly better than the CKM-based one. This gain is confirmed by the average SCMI length (see Table 1).

5.3. Example 2: Application to Air Quality Time Series Data

In addition to the first example, which highlights the importance of the kNN approach over the classical kernel method, we emphasize in this second example the superiority of the local linear approach over the local constant approach. For this purpose, we consider a time series from air quality data. The importance of this kind of data is motivated by the fact that air quality has a potential environmental impact on the quality of human life and the health of animals. In particular, it is well documented that exposure to ground-level ozone for periods of 1–8 h reduces various pulmonary functions and affects the tissues of the respiratory tract. Thus, the approximation of excessive levels of ozone concentration is a crucial subject in environmental sciences. In this example, we focus on the air quality in the city of Westminster in London. The time-series data were collected at the Marylebone Road site by real-time monitoring. Let us point out that our study can be used to model the distribution of ozone given the daily curves of the other polluting gases (carbon monoxide, carbon dioxide, sulphur dioxide, nitrogen dioxide, nitric oxide, etc.). However, for the sake of brevity, we focus on two important indices of air quality: sulphur dioxide (SO2) and ozone concentration (O3). It is well known that SO2 increases stratospheric ozone concentrations when it reacts with ultraviolet rays. The data of this example are more complicated than those of the first example because the time series is observed on a finer grid. The link to these data is also provided in the "Data Availability Statement" section. Precisely, unlike the yearly curves of the first example, here the ozone concentration is observed hourly, and the sulphur dioxide is observed on a 15-min time grid. Thus, this kind of time series data is well adapted to our functional approach. It is worth noting that the use of functional statistical models in this type of environmental study has been considered by many authors in FDA (see, for example, Quintela-del-Río and Francisco-Fernández (2011)). In this example, we aim to analyze the relationship between SO2 and O3 in Westminster using the SCMI predictive region. Specifically, we wish to predict the total ozone one day ahead using the daily curve of SO2. Formally, we observe 364 days of air quality data $(X_i, Y_i)$ at the Marylebone Road station, where $X_i(\cdot)$ is the daily curve of SO2 on day i and $Y_i$ is the total ozone of day i + 1. A sample of the functional regressors is shown in Figure 4. It concerns the daily curves of sulphur dioxide observed on a 15-min time grid; the SO2 values on the vertical axis are measured in micrograms per cubic meter (μg/m3).
We highlight the importance of the local linear approach over the classical local constant one for this real data set. In particular, the local linear approach reduces the bias term of the local constant one, and we quantify this gain in practice. To do this, we compare the SCMI of both estimators: the kNN-LLM and the kNN-LCM (local constant). In addition, we keep the same strategies as in the first example to select the parameters involved in the estimators. More precisely, we use the quadratic kernel on (0, 1), the PCA metric, and the CV criterion to choose the number of neighbors k.
For our comparison, we put $\zeta = 0.05$ and split the data sample randomly into two parts: a learning sample (260 observations) and a test sample (104 observations). Finally, we examine the efficiency of both estimators by the Probability Coverage (PC), which is the main criterion for evaluating predictive regions. In particular, we draw in Figure 5 the PC over the testing sample obtained by the two estimation methods.
We see that the local linear estimation has better performance than that based on the local constant method. Of course, this conclusion is not surprising since it reflects the superiority of the local linear approach in the bias term. Undoubtedly, we can say that the kNN-LLM keeps its advantages over the local constant method in the functional time series case.
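For completeness, the PC criterion used in this comparison is simply the proportion of test responses falling inside their predictive interval; a one-function sketch (ours) follows.

```python
import numpy as np

def probability_coverage(intervals, y_test):
    """Fraction of test responses inside their predictive interval.

    intervals : (m, 2) array of [lower, upper] SCMI bounds per test point,
    y_test    : (m,) observed responses.
    """
    lo, hi = intervals[:, 0], intervals[:, 1]
    return float(np.mean((y_test >= lo) & (y_test <= hi)))
```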

6. Conclusions and Perspectives

The present contribution investigates the problem of the local linear estimation of the distribution function of a real random variable conditional on a functional covariate. The main novelty of this paper is the construction of an estimator using a double-kernel kNN estimation procedure. The main feature of the built estimator is its smoothness, which improves the estimator's flexibility and broadens the scope of application of conditional distribution estimation. From a theoretical point of view, the estimator's asymptotic properties are established in a more general setting, namely the functional time series case. Specifically, the dependence is modeled through the strong mixing condition. It is well known that this kind of dependency covers an extensive class of usual processes, including AR, ARMA, Gaussian, Markov, linear, and m-dependent processes, among others. From a practical point of view, we have illustrated the feasibility of the constructed estimator using real data. The computational study shows that the proposed estimator behaves well as a prediction model: it improves on the prediction by the classical kernel method, both for single predictions and for predictive regions. This statement confirms the superiority of the kNN local linear estimation over the standard kernel one. Moreover, in addition to this considerable development of nonparametric functional conditional models, the present contribution opens numerous future research tracks in nonparametric functional data analysis. For instance, it would be very interesting to establish the asymptotic normality of the proposed estimator, or to consider the weakly dependent functional time series case or the incomplete functional time series data case. On the other hand, the robustness of predictors is a crucial issue in functional data analysis. At this stage, studying the consistency of the kNN-LLM estimator of the robust regression for functional time series is an important prospect of the present contribution, as it would reduce the sensitivity of the kNN approach to noisy data, missing values, and the presence of outliers.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/math9101102/s1.

Author Contributions

Writing—review & editing, I.M.A., Z.C.E., A.L. and M.R. The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Deanship of Scientific Research at King Khalid University through the Research Groups Program under grant number R.G.P. 2/82/42.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in the first example are available at https://www.met.hu/en/eghajlat/magyarorszag_eghajlata/eghajlati_adatsorok/Debrecen/adatok/napi_adatok/index.php (accessed on 30 March 2021). The data used in the second example are available at https://www.airqualityengland.co.uk/site/data?site_id=MY1 (accessed on 30 March 2021).

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions which improved the quality of this article substantially. They thank and extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Chapman & Hall: London, UK, 1996.
  2. Baìllo, A.; Grané, A. Local linear regression for functional predictor and scalar response. J. Multivar. Anal. 2009, 100, 102–111.
  3. Barrientos-Marin, J.; Ferraty, F.; Vieu, P. Locally Modelled Regression and Functional Data. J. Nonparametr. Stat. 2010, 22, 617–632.
  4. Berlinet, A.; Elamine, A.; Mas, A. Local linear regression for functional data. Ann. Inst. Stat. Math. 2011, 63, 1047–1075.
  5. Zhou, Z.; Lin, Z. Asymptotic normality of locally modelled regression estimator for functional data. J. Nonparametr. Stat. 2016, 28, 116–131.
  6. Laksaci, A.; Rachdi, M.; Rahmani, S. Spatial modelization: Local linear estimation of the conditional distribution for functional data. Spatial Stat. 2013, 6, 1–23.
  7. Burba, F.; Ferraty, F.; Vieu, P. k-nearest neighbor method in functional non-parametric regression. J. Nonparametr. Stat. 2009, 21, 453–469.
  8. Laloë, T. A k-nearest neighbor approach for functional regression. Stat. Probab. Lett. 2008, 78, 1189–1193.
  9. Kudraszow, N.; Vieu, P. Uniform consistency of kNN regressors for functional variables. Stat. Probab. Lett. 2013, 83, 1863–1870.
  10. Kara, L.Z.; Laksaci, A.; Rachdi, M.; Vieu, P. Data-driven kNN estimation in nonparametric functional data analysis. J. Multivar. Anal. 2017, 153, 176–188.
  11. Chikr-Elmezouar, Z.; Almanjahie, I.M.; Laksaci, A.; Rachdi, M. FDA: Strong consistency of the kNN local linear estimation of the functional conditional density and mode. J. Nonparametr. Stat. 2019, 31, 175–195.
  12. Bachir, A.; Almanjahie, I.M.; Attouch, M.K. The k Nearest Neighbors Estimator of the M-Regression in Functional Statistics. CMC-Comput. Mater. Continua 2020, 65, 2049–2064.
  13. Laksaci, A.; Ould-Said, E.; Rachdi, M. Uniform consistency in number of neighbors of the kNN estimator of the conditional quantile model. Metrika 2021, 1–17.
  14. Rachdi, M.; Laksaci, A.; Kaid, Z.; Benchiha, A.; Al-Awadhi, F.A. k-nearest neighbors local linear regression for functional and missing data at random. Stat. Neerl. 2021, 75, 42–65.
  15. Almanjahie, I.M.; Alahmari, W.M.; Laksaci, A.; Rachdi, M. Computational aspects of the kNN local linear smoothing for some conditional models in high dimensional statistics. Commun. Stat. Simul. Comput. 2021.
  16. Almanjahie, I.M.; Elmezouar, Z.C.; Laksaci, A.; Rachdi, M. kNN local linear estimation of the conditional cumulative distribution function: Dependent functional data case. Comptes Rendus Math. 2018, 356, 1036–1039.
  17. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer Series in Statistics; Springer: New York, NY, USA, 2006.
  18. Yao, Q.; Tong, H. On initial-condition sensitivity and prediction in nonlinear stochastic systems. Bull. Int. Stat. Inst. 1995, 50, 395–412.
  19. De Gooijer, J.G.; Gannoun, A. Nonparametric conditional predictive regions for time series. Comput. Stat. Data Anal. 2000, 33, 259–275.
Figure 1. Monthly mean temperature.
Figure 2. Mean temperature by year.
Figure 3. Prediction and confidence bands by SCMI-rule.
Figure 4. The daily curves of the SO2.
Figure 5. The PC of LLM (on the left) and the PC of LCM (on the right).
Table 1. The average of the length of the SCMI for the kNN-LLM and CKM estimators.

                                      kNN-LLM Estimator   CKM Estimator
Average of the length of the SCMI    1.23                2.07
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
