An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices

Li, Ting; Tang, Jiayi; Wan, Zhong

doi:10.1007/s10618-021-00773-5

An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices

Published: 28 June 2021

Volume 35, pages 1972–2008, (2021)
Cite this article

Data Mining and Knowledge Discovery Aims and scope Submit manuscript

390 Accesses
5 Citations
2 Altmetric
Explore all metrics

Abstract

In this paper, a new alternating nonmonotone projected Barzilai–Borwein (BB) algorithm is developed for solving large scale problems of nonnegative matrix factorization. Unlike the existing algorithms available in the literature, a nonmonotone line search strategy is proposed to find suitable step lengths, and an adaptive BB spectral parameter is employed to generate search directions such that the constructed subproblems are efficiently solved. Apart from establishment of global convergence for this algorithm, numerical tests on three synthetic datasets, four public face image datasets and a real-world transcriptomic dataset are conducted to show advantages of the developed algorithm in this paper. It is concluded that in terms of numerical efficiency, noise robustness and quality of matrix factorization, our algorithm is promising and applicable to face image reconstruction, and deep mining of transcriptomic profiles of the sub-genomes in hybrid fish lineage, compared with the state-of-the-art algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Matrix-wise $$\ell _0$$ -constrained sparse nonnegative least squares

Article 15 November 2022

Incremental Nonnegative Matrix Factorization with Sparseness Constraint for Image Representation

Accelerated anti-lopsided algorithm for nonnegative least squares

Article 01 December 2016

Availability of data and material

The data used to support the findings of this study are available from the corresponding author upon request.

Notes

References

Barzilai J, Borwein JM (1988) Two-point step length gradient methods. IMA J Numer Anal 8:141–148. https://doi.org/10.1093/imanum/8.1.141
Article MathSciNet MATH Google Scholar
Berry MW, Browne M, Langville AN, Pauca VP, Plemmons RJ (2007) Algorithms and applications for approximate nonnegative matrix factorization. Comput Stat Data Anal 52:155–173. https://doi.org/10.1016/j.csda.2006.11.006
Article MathSciNet MATH Google Scholar
Bertsekas DP (1999) Nonlinear programming, 2nd edn. Athena Scientific, Belmont
MATH Google Scholar
Birgin EG, Martínez JM, Raydan M (2000) Nonmonotone spectral projected gradient methods on convex sets. SIAM J Optim 10:1196–1211. https://doi.org/10.1137/S1052623497330963
Article MathSciNet MATH Google Scholar
Cai D, He XF, Han JW (2005) Document clustering using locality preserving indexing. IEEE Trans Knowl Data Eng 17:1624–1637. https://doi.org/10.1109/TKDE.2005.198
Article Google Scholar
Dai YH, Fletcher R (2005) Projected Barzilai–Borwein methods for large-scale box-constrained quadratic programming. Numer Math 100:21–47. https://doi.org/10.1007/s00211-004-0569-y
Article MathSciNet MATH Google Scholar
Dolan ED, Moré JJ (2002) Benchmarking optimization software with performance profiles. Math Program 91:201–213. https://doi.org/10.1007/s101070100263
Article MathSciNet MATH Google Scholar
Fu X, Huang K, Sidiropoulos ND, Ma WK (2019) Nonnegative matrix factorization for signal and data analytics: identifiability, algorithms, and applications. IEEE Signal Process Mag 36:59–80. https://doi.org/10.1109/MSP.2018.2877582
Article Google Scholar
Gillis N, Glineur F (2008) Nonnegative factorization and the maximum edge biclique problem. ArXiv e-prints arXiv:0810.4225
Gong PH, Zhang CS (2012) Efficient nonnegative matrix factorization via projected Newton method. Pattern Recognit 45:3557–3565. https://doi.org/10.1016/j.patcog.2012.02.037
Article MATH Google Scholar
Grippo L, Sciandrone M (2000) On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper Res Lett 26:127–136. https://doi.org/10.1016/S0167-6377(99)00074-7
Article MathSciNet MATH Google Scholar
Guan NY, Tao DC, Luo ZG, Yuan B (2012) NeNMF: an optimal gradient method for nonnegative matrix factorization. IEEE Trans Signal Process 60:2882–2898. https://doi.org/10.1109/TSP.2012.2190406
Article MathSciNet MATH Google Scholar
Hager WW, Zhang HC (2006) A new active set algorithm for box constrained optimization. SIAM J Optim 17:526–557. https://doi.org/10.1137/050635225
Article MathSciNet MATH Google Scholar
Han J, Han LX, Neumann M, Prasad U (2009a) On the rate of convergence of the image space reconstruction algorithm. Oper Matrices 3:41–58. https://doi.org/10.7153/oam-03-02
Article MathSciNet MATH Google Scholar
Han LX, Neumann M, Prasad AU (2009b) Alternating projected Barzilai–Borwein methods for nonnegative matrix factorization. Electron Trans Numer Anal 36:54–82. https://doi.org/10.1007/978-0-8176-4751-3_16
Article MathSciNet MATH Google Scholar
Hoyer PO (2004) Nonnegative matrix factorization with sparseness constraints. J Mach Learn Res 5:1457–1469. https://doi.org/10.1016/j.neucom.2011.09.024
Article MATH Google Scholar
Huang S, Wan Z (2017) A new nonmonotone spectral residual method for nonsmooth nonlinear equations. J Comput Appl Math 313:82–101. https://doi.org/10.1016/j.cam.2016.09.014
Article MathSciNet MATH Google Scholar
Huang S, Wan Z, Zhang J (2018) An extended nonmonotone line search technique for large-scale unconstrained optimization. J Comput Appl Math 330:586–604. https://doi.org/10.1016/j.cam.2017.09.026
Article MathSciNet MATH Google Scholar
Huang YK, Liu HW, Zhou SS (2015a) Quadratic regularization projected Barzilai–Borwein method for nonnegative matrix factorization. Data Min Knowl Discov 29:1665–1684. https://doi.org/10.1007/s10618-014-0390-x
Article MathSciNet MATH Google Scholar
Huang YK, Liu HW, Zhou SS (2015b) An efficient monotone projected Barzilai–Borwein method for nonnegative matrix factorization. Appl Math Lett 45:12–17. https://doi.org/10.1016/j.aml.2015.01.003
Article MathSciNet MATH Google Scholar
Kim DM, Sra S, Dhillon IS (2007) Fast Newton-type methods for the least squares nonnegative matrix approximation problem
Kim H, Park H (2008a) Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM J Matrix Anal Appl 30:713–730. https://doi.org/10.1137/07069239X
Article MathSciNet MATH Google Scholar
Kim J, Park H (2008b) Toward faster nonnegative matrix factorization: a new algorithm and comparisons. Proc Eighth IEEE Int Conf Data Min. https://doi.org/10.1109/ICDM.2008.149
Article Google Scholar
Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788–791. https://doi.org/10.1038/44565
Article MATH Google Scholar
Li JC, Li WB, Liu XN (2020) An adaptive nonmonotone projected Barzilai–Borwein gradient method with active set prediction for nonnegative matrix factorization. Numer Math Theor Meth Appl 13:516–538. https://doi.org/10.4208/nmtma.OA-2019-0028
Article MathSciNet MATH Google Scholar
Li T, Wan Z (2019) New adaptive Barzilai–Borwein step size and its application in solving large scale optimization problems. ANZIAM J 61:76–98. https://doi.org/10.1017/S1446181118000263
Article MathSciNet MATH Google Scholar
Li XL, Zhang W, Dong XL (2017) A class of modified FR conjugate gradient method and applications to non-negative matrix factorization. Comput Math Appl 73:270–276. https://doi.org/10.1016/j.camwa.2016.11.017
Article MathSciNet MATH Google Scholar
Li Z, Tang J, He X (2018) Robust structured nonnegative matrix factorization for image representation. IEEE Trans Neural Netw Learn Syst 29:1947–1960. https://doi.org/10.1109/TNNLS.2017.2691725
Article MathSciNet Google Scholar
Lin CJ (2007a) On the convergence of multiplicative update algorithms for nonnegative matrix factorization. IEEE Trans Neural Netw 18:1589–1596. https://doi.org/10.1109/TNN.2007.895831
Article Google Scholar
Lin CJ (2007b) Projected gradient methods for nonnegative matrix factorization. Neural Comput 19:2756–2779. https://doi.org/10.1162/neco.2007.19.10.2756
Article MathSciNet MATH Google Scholar
Paarero P, Tapper U (1994) Positive matrix factorization: a nonnegative factor model with optimal utilization of error estimates of data values. Environmetrics 5:111–126. https://doi.org/10.1002/env.3170050203
Article Google Scholar
Parikh N, Boyd S (2014) Proximal algorithms. Found Trends Optim 1:127–239. https://doi.org/10.1561/2400000003
Article Google Scholar
Pauca VP, Piper J, Plemmons RJ (2006) Nonnegative matrix factorization for spectral data analysis. Linear Algebra Appl 416:29–47. https://doi.org/10.1016/j.laa.2005.06.025
Article MathSciNet MATH Google Scholar
Pauca VP, Shahnaz F, Berry MW, Plemmons RJ (2004) Text mining using non-negative matrix factorization
Tang JY, Wan Z (2021) Orthogonal dual graph-regularized nonnegative matrix factorization for co-clustering. J Sci Comput 87:66. https://doi.org/10.1007/s10915-021-01489-w
Article MathSciNet MATH Google Scholar
Vavasis SA (2009) On the complexity of nonnegative matrix factorization. SIAM J Optim 20:1364–1377. https://doi.org/10.1137/070709967
Article MathSciNet MATH Google Scholar
Wan Z, Guo J, Liu JJ, Liu WY (2018) A modified spectral conjugate gradient projection method for signal recovery. Signal Image Video Process 12:1455–1462. https://doi.org/10.1007/s11760-018-1300-2
Article Google Scholar
Wan Z, Tang JY, Ren L, Xiao YM, Liu SJ (2019) Optimization techniques to deeply mine the transciptomic profile of the sub-genomes in hybrid fish lineage. Front Genet 10:911. https://doi.org/10.3389/fgene.2019.00911
Article Google Scholar
Xiao J, Hu FZ, Luo KK, Li WH, Liu SJ (2016) Unique nucleolar dominance patterns in distant hybrid lineage derived from Megalobrama Amblycephala $\times $ Culter Alburnus. BMC Genet 17:150. https://doi.org/10.1186/s12863-016-0457-3
Article Google Scholar
Xu YY, Yin WT (2013) A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM J Imaging Sci 6:1758–1789. https://doi.org/10.1137/120887795
Article MathSciNet MATH Google Scholar
Zdunek R, Cichocki A (2006) Non-negative matrix factorization with quasi-Newton optimization. In: Proceedings of the eighth international conference on artificial intelligence and soft computing (ICAISC 2006)
Zhang L, Liu ZH, Pu JX, Song B (2020) Adaptive graph regularized nonnegative matrix factorization for data representation. Appl Intell 50:438–447. https://doi.org/10.1007/s10489-019-01539-9
Article Google Scholar

Download references

Funding

This research is supported by the National Natural Science Foundation of China (Grant No. 71671190), the Fundamental Research Funds for the Central Universities of Central South University (Grant No. 206021706) and the Hunan Provincial Innovation Foundation For Postgraduate (Grant No. 150110022).

Author information

Authors and Affiliations

School of Mathematics and Statistics, Central South University, Hunan, Changsha, China
Ting Li & Jiayi Tang
Institute of Engineering Modelling and Scientific Computing, Central South University, Hunan, Changsha, China
Zhong Wan

Authors

Ting Li
View author publications
You can also search for this author in PubMed Google Scholar
Jiayi Tang
View author publications
You can also search for this author in PubMed Google Scholar
Zhong Wan
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Z.W. conceived and designed the research plan and wrote the paper. T.L. and J.T. performed the mathematical modelling, development of the algorithm, experiments and wrote the paper.

Corresponding author

Correspondence to Zhong Wan.

Ethics declarations

Conflict of interest

We declare that all the authors have no any conflict of interest about submission and publication of this paper.

Code availability

All the computer codes used in this study are available from the corresponding author upon request.

Additional information

Responsible editor: Evangelos Papalexakis.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Proof of Remark 4

Proof

From (Lin 2007b) and (Gillis and Glineur 2008), the KKT conditions of Problem (1.1) are

$$\begin{aligned} \begin{array}{lll} W_{ia}\ge 0, &{}H_{bj}\ge 0, &{}\\ \nabla _W F(W, H)_{ia}\ge 0, &{}\nabla _H F(W, H)_{bj}\ge 0,&{} \\ W_{ia}\cdot \nabla _W F(W, H)_{ia}=0, &{}H_{bj} \cdot \nabla _H F(W, H)_{bj}=0, &{}\forall ~i, a, b, j. \end{array} \end{aligned}$$

(A.1)

Equivalently, (A.1) can be rewritten as

$$\begin{aligned} \begin{array}{lll} \min \{W_{ia}, \nabla _W F(W, H)_{ia}\}=0,&\min \{H_{bj}, \nabla _H F(W, H)_{bj}=0\},&\forall ~i, a, b, j. \end{array} \end{aligned}$$

(A.2)

By the following definitions,

$$\begin{aligned} \nabla ^{P}_{W}F(W, H)_{ia}= & {} \left\{ \begin{array}{ll} \nabla _W F(W,H)_{ia}, &{} W_{ia}>0, \\ \text {min}\{0, \nabla _W F(W,H)_{ia}\},&{} W_{ia}=0, \end{array} \right. \end{aligned}$$

(A.3)

$$\begin{aligned} \nabla ^{P}_{H}F(W, H)_{bj}= & {} \left\{ \begin{array}{ll} \nabla _H F(W,H)_{bj}, &{} H_{bj}>0, \\ \text {min}\{0, \nabla _H F(W,H_{bj}\},&{} H_{bj}=0, \end{array} \right. \end{aligned}$$

(A.4)

we can prove that the KKT conditions (A.1) and equations (2.20) are equal. This completes the proof of Remark 4. $\square $

Appendix B: Proof of Theorem 1

Proof

We prove Theorem 1 by mathematical induction.

For all $t\ge 0$, from the first result in Lemma 3, it follows that $D_t$ is a sufficiently descent direction at $Z_t$. Then, similar to Lemma 1 in (Huang et al. 2018), there is a step length $\lambda _t \in \{1, \rho , \rho ^2, \ldots , \}$ such that the following inequality holds:

$$\begin{aligned} f(Z_t+\lambda _t D_t) \le f(Z_t) + \delta _t {\bar{Q}}_{t+1} \lambda _t \langle \nabla f(Z_t), D_t\rangle . \end{aligned}$$

(B.1)

It is easy to see that for $t=0$, (3.1) and (3.2) hold.

Suppose that (3.1) and (3.2) also hold for $t\in \{1, 2, \ldots , r-1\}$. Then, from Algorithm 1, it follows that $f(Z_{r}) \le C_{\ell (r)}$ if ${\bar{f}}_1 \le {\bar{C}}_1$. Otherwise, $f(Z_{r}) \le f(Z_{r-1}+{\hat{\lambda }}_2 D_{r-1}) \le C_{\ell (r-1)}$ and $\eta _{r-1}=0$, which implies $f(Z_{r}) \le \max \{C_{\ell (r-1)}, f(Z_r)\}= C_{\ell (r)}$. Consequently, $f(Z_{t})\le C_{\ell {(t)}}$ holds for all $t \in \{1, 2, \ldots , r\}$.

For $t=r$, take ${\hat{\lambda }}_1= \lambda _t$ in (B.1). Then,

$$\begin{aligned} f(Z_t+{\hat{\lambda }}_1 D_t)\le & {} f(Z_t) + \delta _t {\bar{Q}}_{t+1}{\hat{\lambda }}_1 \langle \nabla f(Z_t), D_t\rangle \nonumber \\\le & {} C_{\ell (t)}+\delta _t {\bar{Q}}_{t+1} {\hat{\lambda }}_1 \langle \nabla f(Z_t), D_t\rangle \nonumber \\\le & {} {\bar{\eta }}_t Q_t (C_{\ell (t)}-C_t)+C_{\ell (t)}+\delta _t {\bar{Q}}_{t+1} {\hat{\lambda }}_1 \langle \nabla f(Z_t), D_t\rangle . \end{aligned}$$

(B.2)

The last inequality in (B.2) is equivalent to (3.1).

Take ${\hat{\lambda }}_2=\lambda _t$. Then, it follows from (B.1) and ${\bar{Q}}_{t+1}\ge 1$ that

$$\begin{aligned} f(Z_t+{\hat{\lambda }}_2 D_t)\le & {} f(Z_t) + \delta _t {\bar{Q}}_{t+1}{\hat{\lambda }}_2 \langle \nabla f(Z_t), D_t\rangle \nonumber \\\le & {} C_{\ell (t)}+\delta _t {\bar{Q}}_{t+1} {\hat{\lambda }}_2 \langle \nabla f(Z_t), D_t\rangle \nonumber \\\le & {} C_{\ell (t)}+\delta _t {\hat{\lambda }}_2 \langle \nabla f(Z_t), D_t\rangle , \end{aligned}$$

(B.3)

where the last inequality in (B.2) indicates that (3.2) holds. Therefore, Algorithm 1 is well defined. $\square $

Appendix C: Proof of Lemma 1

Proof

From the definition of $C_{\ell (t)}$ and the inequality (3.3), it follows that

$$\begin{aligned} C_{\ell (t+1)}= & {} \displaystyle {\max _{\max \{0, t-M+2\}\le j\le t+1} \{C_j\}}\\\le & {} \max \{C_{\ell (t)}, C_{t+1}\}\\= & {} C_{\ell (t)}. \end{aligned}$$

Thus, the sequence$\{C_{\ell (t)}\}$ is nonincreasing. It directly follows that

$$\begin{aligned} f(Z_t)\le C_{\ell (t)} \le C_{\ell (0)}= f(Z_0)\le f(H_0). \end{aligned}$$

Consequently, we have $\{Z_t\} \subset L(H_0)=\{H\ge 0: f(H)\le f(H_0)\}$. The first result has been proved.

Since f is bounded below, from Theorem 2 in (Huang et al. 2018), it directly follows that the inequalities (3.4) and (3.5) hold. This ends the proof. $\square $

Appendix D: Proof of Lemma 2

Proof

Since f is continuously differentiable and $\nabla f$ is Lipschitz continuous with the Lipschitz constant L, it holds that for any $\lambda >0$,

$$\begin{aligned} f(Z_t + \lambda D_t )-f(Z_t)= & {} \int _0^{\lambda } \langle \nabla f(Z_t +t D_t), D_t\rangle dt \\= & {} \lambda \langle \nabla f(Z_t), D_t\rangle +\int _0^{\lambda } \langle \nabla f(Z_t +t D_t)-\nabla f(Z_t), D_t\rangle dt \\\le & {} \lambda \langle \nabla f(Z_t), D_t\rangle + \int _0^{\lambda } Lt\Vert D_t\Vert _F^2 dt\\= & {} \lambda \langle \nabla f(Z_t), D_t\rangle +\dfrac{L}{2}\lambda ^2 \Vert D_t\Vert _F^2. \end{aligned}$$

Consequently, by taking $\lambda =\dfrac{\lambda _t}{\rho }$, we have

$$\begin{aligned} f\left( Z_t + \dfrac{\lambda _t}{\rho } D_t \right) -f(Z_t)\le \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle +\dfrac{L}{2}\left( \dfrac{\lambda _t}{\rho }\right) ^2 \Vert D_t\Vert _F^2. \end{aligned}$$

(D.1)

On the other hand, for the obtained step length $\lambda _t$, either $\lambda _t=1$, or (3.1), or (3.2) fails at least once. That is to say,

$$\begin{aligned} f\left( Z_t + \dfrac{\lambda _t}{\rho } D_t\right)> & {} {{\bar{\eta }}}_t Q_t (C_{\ell (t)}-C_t) +C_{\ell (t)}+\delta _t {\bar{Q}}_{t+1} \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle \\\ge & {} C_{\ell (t)}+\delta _t {\bar{Q}}_{t+1} \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle \\\ge & {} f(Z_t) +\delta _t {\bar{Q}}_{t+1} \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle , \end{aligned}$$

or

$$\begin{aligned} f\left( Z_t + \dfrac{\lambda _t}{\rho } D_t\right)> & {} C_{\ell (t)}+\delta _t \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle \\\ge & {} f(Z_t)+\delta _t \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle \\\ge & {} f(Z_t) +\delta _t {\bar{Q}}_{t+1} \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle . \end{aligned}$$

Therefore,

$$\begin{aligned} f\left( Z_t + \dfrac{\lambda _t}{\rho } D_t\right) -f\left( Z_t\right) \ge \delta _t {\bar{Q}}_{t+1} \dfrac{\lambda _t}{\rho } \langle \nabla f\left( Z_t\right) , D_t\rangle . \end{aligned}$$

(D.2)

Directly from (D.1) and (D.2), it follows that

$$\begin{aligned} \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle +\dfrac{L}{2}\left( \dfrac{\lambda _t}{\rho }\right) ^2 \Vert D_t\Vert _F^2 \ge \delta _t {\bar{Q}}_{t+1} \dfrac{\lambda _t}{\rho } \langle \nabla f(Z_t), D_t\rangle . \end{aligned}$$

By nonnegativity of $\lambda _t$ and $\rho $, we know

$$\begin{aligned} \lambda _t \ge \dfrac{2(\delta _t {\bar{Q}}_{t+1}-1)\rho \langle \nabla f(Z_t), D_t\rangle }{L\Vert D_t\Vert _F^2}. \end{aligned}$$

Combining the conditions $\delta _t {\bar{Q}}_{t+1} \le \delta _{\max }$, $0< \delta _{\max }<1$, and the first result in Lemma 3, we have proved the inequality (3.6). $\square $

Appendix E: Proof of Theorem 2

Proof

By the definition of $D^{\alpha }(H)$, we have

$$\begin{aligned} D^1 (Z_t)=P(Z_t-\nabla f(Z_t))-Z_t. \end{aligned}$$

Then, $\Vert D^1 (Z_t)\Vert _F=0$ is equivalent to the termination condition of Algorithm 2.

From the second result in Lemma 3, Algorithm 2 will terminate in a finite iterations with $\epsilon =0$ if $Z_t$ is a stationary point of (2.3).

In the case that $\{Z_t\}$ is an infinite sequence, we can prove that there exists an infinite subsequence $l_1 \le l_2 \le \dots $ such that $\Vert D^1 (Z_{l_t})\Vert _F$ approaches zero as t tends to $\infty $. Take $l_t=\ell (tM)-1$, from (3.5) in Lemma 1, it follows that

$$\begin{aligned} \sum _{t=2}^{\infty } \lambda _{l_t} \left| \langle \nabla f(Z_{l_t}), D_{l_t}\rangle \right| <\infty . \end{aligned}$$

From Lemma 2, for all t, there exists a positive constant ${{\hat{\lambda }}}$ such that ${{\hat{\lambda }}} \le \lambda _{l_t} \le 1$. Consequently,

$$\begin{aligned} \lim _{t \rightarrow \infty } \left| \langle \nabla f(Z_{l_t}), D_{l_t}\rangle \right| =0. \end{aligned}$$

From the first result in Lemma 3 and $0<\alpha _{\min }\le \alpha _{l_t}\le \alpha _{\max }$, we have

$$\begin{aligned} \lim _{t \rightarrow \infty } \left\| D_{l_t}\right\| _F \le \lim _{t \rightarrow \infty } \alpha _{l_t} \left| \langle \nabla f(Z_{l_t}), D_{l_t}\rangle \right| =0, \end{aligned}$$

which implies

$$\begin{aligned} \lim _{t \rightarrow \infty } \left\| D_{l_t}\right\| _F =0. \end{aligned}$$

From P4 and P5 of Proposition 2.1 in (Hager and Zhang 2006), it follows that

$$\begin{aligned} \Vert D_{l_t}\Vert _F\ge \min \left\{ \alpha _{\min }, 1\right\} \Vert D^1 (Z_{l_t})\Vert _F. \end{aligned}$$

Then, $\Vert D_{l_t}\Vert _F=0$ implies $\Vert D^1 (Z_{l_t})\Vert _F=0$. Therefore, (3.8) holds. The proof has been completed. $\square $

Appendix F: Proof of Theorem 3

Proof

Clearly, as a modification of the block coordinate descent method with two blocks for solving nonlinear optimization problems, the search direction and the step length computed by Algorithm 3 have the same nice properties as those by the algorithms developed in (Bertsekas 1999; Grippo and Sciandrone 2000). Therefore, the convergence result of Algorithm 3 is obtained directly from Corollary 2 in (Grippo and Sciandrone 2000). $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Li, T., Tang, J. & Wan, Z. An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices. Data Min Knowl Disc 35, 1972–2008 (2021). https://doi.org/10.1007/s10618-021-00773-5

Download citation

Received: 27 March 2020
Accepted: 14 June 2021
Published: 28 June 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s10618-021-00773-5

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices

Abstract

Access this article

Similar content being viewed by others

Matrix-wise $$\ell _0$$ -constrained sparse nonnegative least squares

Incremental Nonnegative Matrix Factorization with Sparseness Constraint for Image Representation

Accelerated anti-lopsided algorithm for nonnegative least squares

Availability of data and material

Notes

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflict of interest

Code availability

Additional information

Publisher's Note

Appendices

Appendix A: Proof of Remark 4

Proof

Appendix B: Proof of Theorem 1

Proof

Appendix C: Proof of Lemma 1

Proof

Appendix D: Proof of Lemma 2

Proof

Appendix E: Proof of Theorem 2

Proof

Appendix F: Proof of Theorem 3

Proof

Rights and permissions

About this article

Cite this article

Keywords

Navigation

An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices

Abstract

Access this article

Similar content being viewed by others

Matrix-wise $$\ell _0$$ -constrained sparse nonnegative least squares

Incremental Nonnegative Matrix Factorization with Sparseness Constraint for Image Representation

Accelerated anti-lopsided algorithm for nonnegative least squares

Availability of data and material

Notes

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflict of interest

Code availability

Additional information

Publisher's Note

Appendices

Appendix A: Proof of Remark 4

Proof

Appendix B: Proof of Theorem 1

Proof

Appendix C: Proof of Lemma 1

Proof

Appendix D: Proof of Lemma 2

Proof

Appendix E: Proof of Theorem 2

Proof

Appendix F: Proof of Theorem 3

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation