Regularized estimation for highly multivariate log Gaussian Cox processes

Abstract

Statistical inference for highly multivariate point pattern data is challenging due to complex models with large numbers of parameters. In this paper, we develop numerically stable and efficient parameter estimation and model selection algorithms for a class of multivariate log Gaussian Cox processes. The methodology is applied to a highly multivariate point pattern data set from tropical rain forest ecology.

References

  • Baddeley, A., Jammalamadaka, A., Nair, G.: Multitype point process analysis of spines on the dendrite network of a neuron. J. R. Stat. Soc. Ser. C (Appl. Stat.) 63(5), 673–694 (2014)

  • Chilès, J.-P., Delfiner, P.: Geostatistics: Modeling Spatial Uncertainty. Probability and Statistics. Wiley, New York (1999)

  • Choi, J., Oehlert, G., Zou, H.: A penalized maximum likelihood approach to sparse factor analysis. Stat. Interface 3(4), 429–436 (2010)

  • Choiruddin, A., Coeurjolly, J.-F., Letué, F.: Convex and non-convex regularization methods for spatial point processes intensity estimation. Electron. J. Stat. 12(1), 1210–1255 (2018)

  • Coeurjolly, J.-F., Møller, J., Waagepetersen, R.: A tutorial on Palm distributions for spatial point processes. Int. Stat. Rev. 85(3), 404–420 (2017)

  • Condit, R.: Tropical Forest Census Plots. Springer-Verlag and R. G. Landes Company, Berlin, Germany and Georgetown, Texas (1998)

  • Condit, R., Hubbell, S.P., Foster, R.B.: Changes in tree species abundance in a neotropical forest: impact of climate change. J. Trop. Ecol. 12(2), 231–256 (1996)

  • Diggle, P., Zheng, P., Durr, P.: Nonparametric estimation of spatial segregation in a multivariate point process: bovine tuberculosis in Cornwall, UK. J. R. Stat. Soc. Ser. C (Appl. Stat.) 54(3), 645–658 (2005)

  • Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)

  • Guan, Y.: A least-squares cross-validation bandwidth selection approach in pair correlation function estimations. Stat. Probab. Lett. 77(18), 1722–1729 (2007)

  • Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer Series in Statistics. Springer New York Inc., New York (2013)

  • Hastie, T., Tibshirani, R., Wainwright, M.: Statistical Learning with Sparsity: The Lasso and Generalizations. Monographs on Statistics and Applied Probability. Chapman & Hall/CRC Press, Boca Raton (2015)

  • Hoerl, A.E., Kennard, R.W.: Ridge regression. Encyclopedia Stat. Sci. 8 (1988)

  • Hubbell, S.P., Foster, R.B.: Diversity of canopy trees in a neotropical forest and implications for conservation. In: Sutton, S.L., Whitmore, T.C., Chadwick, A.C. (eds.) Tropical Rain Forest: Ecology and Management, pp. 25–41. Blackwell Scientific Publications, Oxford (1983)

  • Jalilian, A., Waagepetersen, R.: Fast bandwidth selection for estimation of the pair correlation function. J. Stat. Comput. Simul. 88(10), 2001–2011 (2018)

  • Jalilian, A., Guan, Y., Mateu, J., Waagepetersen, R.: Multivariate product-shot-noise Cox models. Biometrics 71(4), 1022–1033 (2015)

  • Lan, G., Getzin, S., Wiegand, T., Hu, Y., Xie, G., Zhu, H., Cao, M.: Spatial distribution and interspecific associations of tree species in a tropical seasonal rain forest of China. PLoS ONE 7(9), e46074 (2012)

  • Lee, J.D., Sun, Y., Saunders, M.A.: Proximal Newton-type methods for minimizing composite functions. SIAM J. Optim. 24(3), 1420–1443 (2014)

  • Møller, J., Waagepetersen, R.: Statistical Inference and Simulation for Spatial Point Processes. Chapman and Hall/CRC, Boca Raton (2003)

  • Møller, J., Waagepetersen, R.: Modern statistics for spatial point processes. Scand. J. Stat. 34(4), 643–684 (2007)

  • Møller, J., Syversveen, A.R., Waagepetersen, R.: Log Gaussian Cox processes. Scand. J. Stat. 25(3), 451–482 (1998)

  • Rajala, T., Murrell, D.J., Olhede, S.C.: Detecting multivariate interactions in spatial point patterns with Gibbs models and variable selection. J. R. Stat. Soc. Ser. C (Appl. Stat.) 67(5), 1237–1273 (2018)

  • Simon, N., Friedman, J., Hastie, T., Tibshirani, R.: A sparse-group lasso. J. Comput. Graph. Stat. 22(2), 231–245 (2013)

  • Thurman, A.L., Fu, R., Guan, Y., Zhu, J.: Regularized estimating equations for model selection of clustered spatial point processes. Statistica Sinica 25(1), 173–188 (2015)

  • Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 58(1), 267–288 (1996)

  • Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused lasso. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 67(1), 91–108 (2005)

  • Waagepetersen, R.: An estimating function approach to inference for inhomogeneous Neyman–Scott processes. Biometrics 63(1), 252–258 (2007)

  • Waagepetersen, R., Guan, Y., Jalilian, A., Mateu, J.: Analysis of multi-species point patterns using multivariate log Gaussian Cox processes. J. R. Stat. Soc. Ser. C (Appl. Stat.) 65(1), 77–96 (2016)

  • Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 67(2), 301–320 (2005)

Acknowledgements

We thank the editor and the two reviewers for their constructive and helpful comments. The research by A. Choiruddin, F. Cuevas-Pacheco, and R. Waagepetersen is supported by The Danish Council for Independent Research | Natural Sciences, Grant DFF 7014-00074 “Statistics for point processes in space and beyond,” and by the “Centre for Stochastic Geometry and Advanced Bioimaging,” funded by Grant 8721 from the Villum Foundation. The BCI forest dynamics research project was made possible by National Science Foundation Grants to Stephen P. Hubbell: DEB-0640386, DEB-0425651, DEB-0346488, DEB-0129874, DEB-00753102, DEB-9909347, DEB-9615226, DEB-9405933, DEB-9221033, DEB-9100058, DEB-8906869, DEB-8605042, DEB-8206992, DEB-7922197, support from the Center for Tropical Forest Science, the Smithsonian Tropical Research Institute, the John D. and Catherine T. MacArthur Foundation, the Mellon Foundation, the Celera Foundation, and numerous private individuals, and through the hard work of over 100 people from 10 countries over the past two decades. The plot project is part of the Center for Tropical Forest Science, a global network of large-scale demographic tree plots. The BCI soils data set was collected and analyzed by J. Dalling, R. John, K. Harms, R. Stallard and J. Yavitt with support from NSF DEB021104, 021115, 0212284, 0212818 and OISE 0314581, STRI and CTFS. Paolo Segre and Juan Di Trani provided assistance in the field. The covariates dem, grad, mrvbf, solar and twi were computed in SAGA GIS by Tomislav Hengl (http://spatial-analyst.net/).

We thank Dr. Joseph Wright for sharing data on dispersal modes and life forms for the BCI tree species.

Author information

Corresponding author

Correspondence to Achmad Choiruddin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 3106 KB)

Appendices

Appendix A: Proximal Newton method

Suppose we want to find the solution of

$$\begin{aligned} \min _{\varvec{\theta } \in \mathbb {R}^n} f(\varvec{\theta }):= a(\varvec{\theta }) + c(\varvec{\theta }), \end{aligned}$$
(14)

where the function \(f(\cdot )\) can be separated into two parts: \(a(\cdot )\), a convex and twice continuously differentiable loss function, and \(c(\cdot )\), a convex but not necessarily differentiable penalty function. The proximal Newton method is an iterative optimization algorithm that uses a quadratic approximation of the differentiable part \(a(\cdot )\):

$$\begin{aligned} f(\varvec{\theta }) \approx&\; {{\hat{f}}}(\varvec{\theta }) \nonumber \\ =&\; {{\hat{a}}}(\varvec{\theta }) + c(\varvec{\theta }) \nonumber \\ =&\; a(\varvec{\theta }^{(k)}) + \nabla a(\varvec{\theta }^{(k)})^{\text{ T }}(\varvec{\theta } - \varvec{\theta }^{(k)}) \nonumber \\&+ \frac{1}{2}(\varvec{\theta } - \varvec{\theta }^{(k)})^{\text{ T }}H(\varvec{\theta }^{(k)}) (\varvec{\theta } - \varvec{\theta }^{(k)}) + c(\varvec{\theta }) , \end{aligned}$$
(15)

where \(\varvec{\theta }^{(k)}\) is the current value of \(\varvec{\theta }\), \(\nabla a(\cdot )\) is the first derivative of \(a(\cdot )\) and \(H(\cdot )\) is an approximation to the Hessian matrix \(\nabla ^2 a(\cdot )\). Letting \({\tilde{\varvec{\theta }}} = \mathop {\hbox {arg min}}\limits _{\varvec{\theta }} {{\hat{f}}}(\varvec{\theta })\), the next value of \(\varvec{\theta }\) is obtained as

$$\begin{aligned} \varvec{\theta }^{(k+1)}=\varvec{\theta }^{(k)}+ t ({\tilde{\varvec{\theta }}} - \varvec{\theta }^{(k)}) \end{aligned}$$

for some \(t>0\). That is, \({\tilde{\varvec{\theta }}}\) is used to construct a search direction for the \((k+1)\)th value of \(\varvec{\theta }\). Theoretical results in Lee et al. (2014) show that t can be chosen so that \(f(\varvec{\theta }^{(k+1)})<f(\varvec{\theta }^{(k)})\). The matrix \(H(\cdot )\) can be chosen in various ways; see Lee et al. (2014) and Hastie et al. (2015) for details.

In the following sections, we adapt the proximal Newton method to minimization of our objective function.
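
To fix ideas, the generic iteration can be written in a few lines of code. The following is a minimal NumPy sketch, not the authors' implementation: the callables f, grad_a, hess_a, and local_min (a solver for the penalized quadratic model (15), e.g. the coordinate descent of Appendix B.2) are placeholders to be supplied for the problem at hand.

```python
import numpy as np

def proximal_newton(f, grad_a, hess_a, local_min, theta0, max_iter=100, tol=1e-8):
    """Proximal Newton iteration for min f(theta) = a(theta) + c(theta).

    f         : full objective, used only in the line search.
    grad_a    : gradient of the smooth part a.
    hess_a    : (approximate) Hessian H of a.
    local_min : returns the minimizer of the quadratic model (15) built
                from the gradient g and Hessian approximation H at theta.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g, H = grad_a(theta), hess_a(theta)
        delta = local_min(theta, g, H) - theta       # search direction
        t, f0 = 1.0, f(theta)
        # backtracking: shrink t until the objective decreases (Lee et al. 2014)
        while f(theta + t * delta) >= f0 and t > 1e-10:
            t *= 0.5
        theta = theta + t * delta
        if np.linalg.norm(t * delta) < tol * (1.0 + np.linalg.norm(theta)):
            break
    return theta
```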

A.1 Quadratic approximation for updating \(\varvec{\alpha }_{i\cdot }\)

Let us first regard (8) as a function of \(\varvec{\alpha }_{i \cdot }\),

$$\begin{aligned} Q_{\lambda ,i}(\varvec{\alpha }_{i \cdot },\sigma ^2_i) =&\; 2 \sum _{\begin{array}{c} j=1\\ j \ne i \end{array}}^p \Vert Y_{ij}- {\tilde{X}}_{ij}\varvec{\alpha }_{i \cdot } \Vert ^2 \nonumber \\&+ \Vert Y_{ii}- X_{ii} \varvec{\beta }_{ii}(\varvec{\alpha },\varvec{\sigma }^2) \Vert ^2 + \lambda \sum _{l=1}^q p(\alpha _{il}) \nonumber \\ =&\; a(\varvec{\alpha }_{i \cdot }) + b(\varvec{\alpha }_{i \cdot }) + c(\varvec{\alpha }_{i \cdot }). \end{aligned}$$
(16)

To minimize (8), we consider the proximal Newton method stated in (15). In particular, we approximate \(b(\varvec{\alpha }_{i \cdot })\) by a quadratic approximation around the current value \(\varvec{\alpha }_{i \cdot }^{(k)}\):

$$\begin{aligned} b(\varvec{\alpha }_{i \cdot }) \approx&\; {{\hat{b}}}(\varvec{\alpha }_{i \cdot }) \nonumber \\ =&\; b(\varvec{\alpha }_{i \cdot }^{(k)})+ \nabla b(\varvec{\alpha }_{i \cdot }^{(k)})^{\text{ T }}(\varvec{\alpha }_{i\cdot } -\varvec{\alpha }_{i \cdot }^{(k)}) \nonumber \\&+ \frac{1}{2} (\varvec{\alpha }_{i\cdot }-\varvec{\alpha }_{i \cdot }^{(k)})^{\text{ T }}H(\varvec{\alpha }_{i \cdot }^{(k)}) (\varvec{\alpha }_{i\cdot } -\varvec{\alpha }_{i \cdot }^{(k)}) . \end{aligned}$$
(17)

Here, the first derivative is

$$\begin{aligned} \nabla b(\varvec{\alpha }_{i \cdot }^{(k)})= -4 D(\varvec{\alpha }_{i \cdot }^{(k)}) X_{ii,\cdot (1:q)}^{\text{ T }}\Big (Y_{ii}- X_{ii} \varvec{\beta }_{ii}(\varvec{\alpha }^{(k)},\varvec{\sigma }^2)\Big ) \end{aligned}$$

while \(H(\varvec{\alpha }_{i \cdot }^{(k)})\) is an approximation of the second derivative,

$$\begin{aligned} \nabla ^2 b(\varvec{\alpha }_{i \cdot }^{(k)})= 8 D(\varvec{\alpha }_{i \cdot }^{(k)}) X_{ii,\cdot (1:q)}^{\text{ T }}X_{ii,\cdot (1:q)} D(\varvec{\alpha }^{(k)}_{i \cdot }) - C(\varvec{\alpha }^{(k)}_{i \cdot }), \end{aligned}$$

where \(D(\varvec{\alpha }_{i \cdot }^{(k)})=\text {Diag}(\alpha _{i1}^{(k)},\ldots ,\alpha _{iq}^{(k)})\), \(X_{ii,\cdot (1:q)}\) denotes the first q columns in \(X_{ii}\), and \(C(\varvec{\alpha }^{(k)}_{i \cdot })=4 \text {Diag}(X_{ii,\cdot (1:q)}^{\text{ T }}(Y_{ii}- X_{ii} \varvec{\beta }_{ii}(\varvec{\alpha }^{(k)},\varvec{\sigma }^2)))\). Specifically,

$$\begin{aligned} H(\varvec{\alpha }_{i \cdot }^{(k)})&= 8 D(\varvec{\alpha }^{(k)}_{i \cdot }) X_{ii,\cdot (1:q)}^{\text{ T }}X_{ii,\cdot (1:q)} D(\varvec{\alpha }^{(k)}_{i \cdot }) \\&\approx \nabla ^2 b(\varvec{\alpha }_{i \cdot }^{(k)}). \end{aligned}$$

To ease the presentation and computation, we write \({{\hat{b}}}(\varvec{\alpha }_{i \cdot })\) from (17) in the form of a least squares problem

$$\begin{aligned} {{\hat{b}}}(\varvec{\alpha }_{i \cdot }) =&\; \Vert Y_{ii}- X_{ii} \varvec{\beta }_{ii}(\varvec{\alpha }^{(k)},\varvec{\sigma }^2) \Vert ^2 \\&- 2 \Big (Y_{ii}- X_{ii} \varvec{\beta }_{ii}(\varvec{\alpha }^{(k)},\varvec{\sigma }^2)\Big )^{\text{ T }}\\&\times [2X_{ii,\cdot (1:q)} D(\varvec{\alpha }_{i \cdot }^{(k)})] (\varvec{\alpha }_{i\cdot } -\varvec{\alpha }_{i \cdot }^{(k)}) \\&+ \frac{1}{2} (2) (\varvec{\alpha }_{i\cdot } -\varvec{\alpha }_{i \cdot }^{(k)})^{\text{ T }}[2 D(\varvec{\alpha }_{i \cdot }^{(k)}) X_{ii,\cdot (1:q)}^{\text{ T }}] \\&\times [2X_{ii,\cdot (1:q)} D(\varvec{\alpha }_{i \cdot }^{(k)})] (\varvec{\alpha }_{i\cdot } -\varvec{\alpha }_{i \cdot }^{(k)}) \\ =&\; \mathbf {v}^{\text{ T }}\mathbf {v} - 2\mathbf {v}^{\text{ T }}X^*_{ii}\varvec{\gamma } + \varvec{\gamma }^{\text{ T }}(X^*_{ii})^{\text{ T }}X^*_{ii}\varvec{\gamma }\\ =&\; \Vert \mathbf {v} - X^*_{ii}\varvec{\gamma }\Vert ^2 \\ =&\; \Vert Y^*_{ii} - X^*_{ii} \varvec{\alpha }_{i \cdot }\Vert ^2 \end{aligned}$$

where

$$\begin{aligned} \mathbf {v}&= Y_{ii}- X_{ii} \varvec{\beta }_{ii}(\varvec{\alpha }^{(k)},\varvec{\sigma }^2) ,\\ X^*_{ii}&= 2X_{ii,\cdot (1:q)} D(\varvec{\alpha }_{i \cdot }^{(k)}), \\ \varvec{\gamma }&= \varvec{\alpha }_{i\cdot } -\varvec{\alpha }_{i \cdot }^{(k)}, \\ Y^*_{ii}&= Y_{ii}+ X_{ii,\cdot (1:q)}\varvec{\alpha }_{i \cdot }^{2,(k)} - X_{ii,\cdot (q+1)}\sigma ^{2}_i. \end{aligned}$$

Replacing b in (16) with \({{\hat{b}}}\), we obtain the approximate objective function \({{\hat{Q}}}_{\lambda ,i}(\varvec{\alpha }_{i \cdot }|\varvec{\alpha }_{i \cdot }^{(k)})\) given in (9). Since (9) is a standard regularized least squares problem, we minimize (9) using a coordinate descent algorithm to obtain \({{\hat{\varvec{\alpha }}}}_{i \cdot }\) as detailed in Sect. B.2.
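
For concreteness, here is a minimal NumPy sketch of the transformed pair \((Y^*_{ii}, X^*_{ii})\) defined above. It assumes \(Y_{ii}\) and \(X_{ii}\) are arrays with the last column of \(X_{ii}\) multiplying \(\sigma ^2_i\), and it is an illustration rather than the authors' code.

```python
import numpy as np

def transform_ii(Y_ii, X_ii, alpha_i, sigma2_i):
    """Working response and design for the j = i term of (9).

    X*_ii = 2 X_{ii,(1:q)} D(alpha^(k)); Y*_ii follows the display above.
    """
    Xq = X_ii[:, :-1]                       # first q columns of X_ii
    Xstar = 2.0 * Xq * alpha_i              # scale column l by alpha_il
    Ystar = Y_ii + Xq @ (alpha_i ** 2) - X_ii[:, -1] * sigma2_i
    return Ystar, Xstar
```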

A.2 Theoretical result regarding the proximal Newton update

Let \(\varDelta (\varvec{\alpha }_{i \cdot }^{(k)})={{\hat{\varvec{\alpha }}}}_{i \cdot }-\varvec{\alpha }_{i \cdot }^{(k)}\), where \({{\hat{\varvec{\alpha }}}}_{i \cdot }\) is the minimizer of (9), and, following a line search strategy, let

$$\begin{aligned} \varvec{\alpha }_{i \cdot }^{(k+1)}= \varvec{\alpha }_{i \cdot }^{(k)}+ t\varDelta (\varvec{\alpha }_{i \cdot }^{(k)}) \end{aligned}$$

for some \(t>0\). Following the proof of Proposition 2.3 in Lee et al. (2014), we can verify the following theorem.

Theorem 1

Let \( H(\varvec{\alpha }_{i \cdot }^{(k)})=8 D(\varvec{\alpha }_{i \cdot }^{(k)}) X_{ii,\cdot (1:q)}^{\text{ T }}X_{ii,\cdot (1:q)} D(\varvec{\alpha }_{i \cdot }^{(k)})\) as in Appendix A.1. Then

$$\begin{aligned} Q_{\lambda ,i}(\varvec{\alpha }_{i \cdot }^{(k+1)},\sigma ^2_i) \le&\; Q_{\lambda ,i} (\varvec{\alpha }_{i \cdot }^{(k)},\sigma ^2_i) \\&- t \varDelta (\varvec{\alpha }_{i \cdot }^{(k)})^{\text{ T }}H(\varvec{\alpha }_{i \cdot }^{(k)}) \varDelta (\varvec{\alpha }_{i \cdot }^{(k)}) + O(t^2). \end{aligned}$$

Thus, by Theorem 1, if \(H(\varvec{\alpha }_{i \cdot }^{(k)})\) is positive definite, we can choose \(t>0\) so that \(Q_{\lambda ,i}(\varvec{\alpha }_{i \cdot }^{(k+1)},\sigma ^2_i)< Q_{\lambda ,i}(\varvec{\alpha }_{i \cdot }^{(k)},\sigma ^2_i)\). That is, the update of \(\varvec{\alpha }_{i \cdot }\) results in a decrease in the objective function (8).
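
In practice, such a step size can be found by simple backtracking. A sketch, with a hypothetical callable Q evaluating \(Q_{\lambda ,i}(\cdot ,\sigma ^2_i)\), is:

```python
def backtrack(Q, alpha_k, delta, t0=1.0, shrink=0.5, max_halvings=30):
    """Find t > 0 with Q(alpha_k + t*delta) < Q(alpha_k), which Theorem 1
    guarantees to exist when H(alpha_k) is positive definite."""
    t, Q0 = t0, Q(alpha_k)
    for _ in range(max_halvings):
        if Q(alpha_k + t * delta) < Q0:
            return t
        t *= shrink
    return 0.0  # no decrease found; keep the current iterate
```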

Appendix B: Algorithm

In our block descent algorithm, we minimize (7) with respect to \( \varvec{\sigma }^2, \varvec{\alpha }, \varvec{\phi }\), and \(\varvec{\psi }\) in turn. For \(i=1,\ldots ,p\), we first update \(\sigma ^2_i\) by minimizing (8) using least squares estimation, followed by an update of \(\varvec{\alpha }_{i\cdot }\) obtained by minimizing (9) with a coordinate descent method. We denote by \(X_{ij,\cdot k}\) the kth column of \(X_{ij}\) for \(k=1,\ldots ,q\) (\(i \ne j\)) or \(k=1,\ldots ,q+1\) (\(i=j\)). We detail, in Appendices B.1 and B.2 respectively, the update of \(\sigma ^2_i\) and the coordinate descent updates of \(\alpha _{il}\) for \(l=1,\ldots ,q\). A summary of the final algorithm is given in Appendix B.3.

B.1 Update of \(\sigma _i^2\)

The estimate \({{\hat{\sigma }}}_i^2\) is updated by least squares. More precisely, the gradient of (8) with respect to \(\sigma _i^2\) is

$$\begin{aligned} \frac{\partial Q_{\lambda ,i}(\varvec{\alpha }_{i \cdot },\sigma _i^2)}{\partial \sigma _i^2} =&\; -2 X_{ii,\cdot (q+1)}^{\text{ T }}(Y_{ii}- X_{ii} \varvec{\beta }_{ii}(\varvec{\alpha },\varvec{\sigma }^2) ). \end{aligned}$$

By solving \(\frac{\partial Q_{\lambda ,i}(\varvec{\alpha }_{i \cdot },\sigma _i^2)}{\partial \sigma _i^2} = 0\), we obtain the update

$$\begin{aligned} \sigma _i^2 \leftarrow \max \left\{ \frac{ X_{ii,\cdot (q+1)}^{\text{ T }}\left( Y_{ii}- \sum _{l=1}^q X_{ii,\cdot l} \alpha ^2_{i l} \right) }{X_{ii,\cdot (q+1)}^{{\text{ T }}} X_{ii,\cdot (q+1)}} ,0 \right\} \end{aligned}$$
(18)

where \(\max \{a,0\}\) is used to prevent a negative update.
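
As an illustration (not the authors' code), update (18) is one line of linear algebra. The sketch below assumes \(Y_{ii}\) and \(X_{ii}\) are NumPy arrays arranged as above, with the last column of \(X_{ii}\) multiplying \(\sigma ^2_i\).

```python
import numpy as np

def update_sigma2(Y_ii, X_ii, alpha_i):
    """Closed-form update (18) for sigma_i^2, truncated at zero."""
    x_last = X_ii[:, -1]                           # column q+1 of X_ii
    resid = Y_ii - X_ii[:, :-1] @ (alpha_i ** 2)   # Y_ii - sum_l X_{ii,l} alpha_il^2
    return max(float(x_last @ resid) / float(x_last @ x_last), 0.0)
```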

B.2 Update of \(\alpha _{il}\)

Let \( r_{ij}= Y^*_{ij} - \sum _{\begin{array}{c} k=1\\ k \ne l \end{array}}^q X^*_{ij,\cdot k} \alpha _{ik}\), where \(Y^*_{ij}\) and \(X^*_{ij}\) are specified in (10). Then we rewrite (9) as

$$\begin{aligned} {{\hat{Q}}}_{\lambda ,i}(\varvec{\alpha }_{i\cdot }) =&\; \sum _{\begin{array}{c} j=1 \end{array}}^p \Vert r_{ij}- X^*_{ij,\cdot l} \alpha _{il}\Vert ^2 \\&+ \lambda \sum _{\begin{array}{c} k=1\\ k \ne l \end{array}}^q \Big ( (1-\xi )\frac{1}{2}\alpha _{ik}^2 + \xi | \alpha _{ik}|\Big ) \\&+ \lambda \Big ((1-\xi )\frac{1}{2}\alpha _{il}^2 + \xi | \alpha _{il}| \Big ). \end{aligned}$$

The gradient with respect to \(\alpha _{il}\) is

$$\begin{aligned} \frac{\partial {{\hat{Q}}}_{\lambda ,i}(\alpha _{il})}{\partial \alpha _{il}}&= \; -2 \sum _{\begin{array}{c} j=1 \end{array}}^p (X^*_{ij,\cdot l})^{\text{ T }}(r_{ij}- X^*_{ij,\cdot l} \alpha _{il}) \\&+ \lambda \Big ((1-\xi )\alpha _{il} + \xi {{\,\mathrm{sign}\,}}(\alpha _{il}) \Big ). \end{aligned}$$

Following the main argument by Friedman et al. (2010), the coordinate-wise update for \(\alpha _{il}\) is of the form

$$\begin{aligned} \alpha _{il} \leftarrow \frac{S \left( 2 \sum _{\begin{array}{c} j=1 \end{array}}^p (X^*_{ij,\cdot l})^{\text{ T }}r_{ij},\lambda \xi \right) }{2 \sum _{\begin{array}{c} j=1 \end{array}}^p (X^*_{ij,\cdot l})^{\text{ T }}X^*_{ij,\cdot l} + \lambda (1-\xi )}, \end{aligned}$$
(19)

where \( S(A,\lambda \xi ) =\text {sign}(A)(|A|-\lambda \xi )_+\).
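
A hedged NumPy sketch of the soft-thresholding operator and the coordinate-wise update (19) follows; the lists Ystar and Xstar are assumed to hold the transformed quantities \(Y^*_{ij}\) and \(X^*_{ij}\) from (10) for \(j=1,\ldots ,p\).

```python
import numpy as np

def soft_threshold(a, kappa):
    """S(a, kappa) = sign(a) (|a| - kappa)_+."""
    return np.sign(a) * max(abs(a) - kappa, 0.0)

def update_alpha_il(Ystar, Xstar, alpha_i, l, lam, xi):
    """Coordinate-wise update (19) of alpha_il under the elastic-net penalty."""
    num, den = 0.0, 0.0
    for Y_j, X_j in zip(Ystar, Xstar):
        x_l = X_j[:, l]
        r_j = Y_j - X_j @ alpha_i + x_l * alpha_i[l]   # partial residual r_ij
        num += 2.0 * float(x_l @ r_j)
        den += 2.0 * float(x_l @ x_l)
    alpha_i[l] = soft_threshold(num, lam * xi) / (den + lam * (1.0 - xi))
    return alpha_i
```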

B.3 Algorithm to update \(\varvec{\alpha }, \varvec{\sigma }^2, \varvec{\phi }, \varvec{\psi }\)

For a given q and a sequence of \(\lambda \) values \(0 \le \lambda _1,\ldots ,\lambda _M\), the overall procedure to estimate the parameters \(\varvec{\alpha }, \varvec{\sigma }^2, \varvec{\phi }, \varvec{\psi }\) is described by Algorithm 1 below. Note that estimates obtained with \(\lambda _{s-1}\) are used as initial values for the estimation with \(\lambda _{s}\), \(s=2,\ldots ,M\).

[Algorithm 1: block descent updates of \(\varvec{\alpha }, \varvec{\sigma }^2, \varvec{\phi }, \varvec{\psi }\) over the \(\lambda \) sequence]
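
In outline, and reusing the component updates sketched above, Algorithm 1 might look as follows. The data container, the helper transform (building the quantities in (10) for pattern i), and update_phi_psi (the unpenalized updates of \(\varvec{\phi }\) and \(\varvec{\psi }\)) are hypothetical stand-ins passed as arguments, not the authors' implementation.

```python
import numpy as np

def block_descent_path(data, q, lambdas, xi, init, transform, update_phi_psi,
                       max_sweeps=50, tol=1e-6):
    """Block descent over sigma^2, alpha, phi, psi along a lambda path.

    Estimates for lambda_{s-1} are reused as warm starts for lambda_s.
    `data`, `transform`, and `update_phi_psi` are hypothetical helpers.
    """
    alpha, sigma2, phi, psi = init       # alpha is p x q, sigma2 has length p
    path = []
    for lam in lambdas:
        for _ in range(max_sweeps):
            alpha_old = alpha.copy()
            for i in range(alpha.shape[0]):                 # the p patterns
                sigma2[i] = update_sigma2(data.Y[i][i], data.X[i][i], alpha[i])
                Ystar, Xstar = transform(data, alpha, sigma2, i)  # builds (10)
                for l in range(q):                          # coordinate sweep
                    update_alpha_il(Ystar, Xstar, alpha[i], l, lam, xi)
            phi, psi = update_phi_psi(data, alpha, sigma2, phi, psi)
            if np.max(np.abs(alpha - alpha_old)) < tol:     # inner convergence
                break
        path.append((lam, alpha.copy(), sigma2.copy(), phi, psi))
    return path
```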

Cite this article

Choiruddin, A., Cuevas-Pacheco, F., Coeurjolly, J.-F. et al. Regularized estimation for highly multivariate log Gaussian Cox processes. Stat. Comput. 30, 649–662 (2020). https://doi.org/10.1007/s11222-019-09911-y
