

Ernest K. Ryu, Wotao Yin

  1. Department of Mathematics, University of California, Los Angeles, CA 90095, USA
  • Received: 2018-12-18  Online: 2019-12-15  Published: 2019-12-15

Ernest K. Ryu, Wotao Yin. PROXIMAL-PROXIMAL-GRADIENT METHOD[J]. Journal of Computational Mathematics, 2019, 37(6): 778-812.

In this paper, we present the proximal-proximal-gradient method (PPG), a novel optimization method that is simple to implement and simple to parallelize. PPG generalizes the proximal-gradient method and ADMM and applies to minimization problems written as a sum of many differentiable and many non-differentiable convex functions; the non-differentiable functions may be coupled. We furthermore present a related stochastic variant, which we call stochastic PPG (S-PPG). S-PPG can be interpreted as a generalization of Finito and MISO to problems involving a sum of many coupled non-differentiable convex functions.
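To make the prox-prox-gradient pattern concrete, the sketch below applies a PPG-style iteration to a toy problem. Everything here is an illustrative assumption rather than the paper's pseudocode or experiments: we take the objective (1/n) Σᵢ (fᵢ(x) + gᵢ(x)) with no extra regularizer, fᵢ(x) = ½(x − aᵢ)² (smooth) and gᵢ(x) = λ|x| (non-differentiable), so the minimizer is the soft-thresholded mean of the aᵢ; the stepsize and all names are ours.

```python
def soft(v, t):
    """Proximal operator of t*|.| (soft-thresholding)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ppg(a, lam, alpha=0.5, iters=100):
    """PPG-style iteration for min_x (1/n) * sum_i (0.5*(x-a_i)^2 + lam*|x|)."""
    n = len(a)
    z = [0.0] * n                      # one auxiliary variable per summand
    for _ in range(iters):
        x_half = sum(z) / n            # first prox step (no regularizer here, so just averaging)
        for i in range(n):             # these per-term updates are independent -> parallelizable
            grad = x_half - a[i]       # gradient of f_i at x_half
            x_i = soft(2 * x_half - z[i] - alpha * grad, alpha * lam)
            z[i] += x_i - x_half       # running correction toward consensus
    return sum(z) / n

x = ppg([1.0, 3.0], lam=0.5)
print(x)  # approaches soft(mean(a), lam) = 1.5
```

Each iteration costs one proximal evaluation per non-differentiable term and one gradient per differentiable term, and the inner loop over terms has no cross-iteration dependence, which is what makes the method simple to parallelize.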
We present many applications that can benefit from PPG and S-PPG and prove convergence for both methods. We demonstrate the empirical effectiveness of both methods through experiments on a CUDA GPU. A key strength of PPG and S-PPG, compared to existing methods, is their ability to directly handle a large sum of non-differentiable, non-separable functions with a constant stepsize independent of the number of functions. Such non-diminishing stepsizes allow both methods to be fast.
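The stochastic variant can be sketched on the same toy problem (fᵢ(x) = ½(x − aᵢ)², gᵢ(x) = λ|x|). This is our reading of the Finito/MISO connection, not the paper's pseudocode: we assume one index is sampled uniformly per iteration and only that term's auxiliary variable is updated, with the average of the z variables maintained in O(1); stepsize conditions and the precise iteration are in the paper.

```python
import random

def soft(v, t):
    """Proximal operator of t*|.| (soft-thresholding)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def s_ppg(a, lam, alpha=0.5, iters=4000, seed=0):
    """Stochastic PPG-style sketch: one randomly sampled term per iteration."""
    rng = random.Random(seed)
    n = len(a)
    z = [0.0] * n
    zbar = 0.0                         # running average of z, updated in O(1)
    for _ in range(iters):
        i = rng.randrange(n)           # sample one summand uniformly at random
        x_half = zbar                  # no extra regularizer, so the first prox is the identity
        grad = x_half - a[i]           # gradient of the sampled f_i only
        x_i = soft(2 * x_half - z[i] - alpha * grad, alpha * lam)
        zbar += (x_i - x_half) / n     # keep the average consistent with the update below
        z[i] += x_i - x_half
    return zbar

print(s_ppg([1.0, 3.0], lam=0.5))  # approaches the same minimizer, 1.5
```

Note that the constant stepsize `alpha` is used throughout; nothing is decayed over iterations, mirroring the abstract's point that the stepsize does not depend on the number of functions.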

