
STOCHASTIC TRUST-REGION METHODS WITH TRUST-REGION RADIUS DEPENDING ON PROBABILISTIC MODELS

Xiaoyu Wang1,2, Ya-xiang Yuan3   

  1. Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China;
    2. University of Chinese Academy of Sciences, Beijing 100049, China;
    3. State Key Laboratory of Scientific/Engineering Computing, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2020-06-01 Revised: 2020-09-16 Online: 2022-03-15 Published: 2022-03-29
  • Contact: Xiaoyu Wang, Email: wxy@lsec.cc.ac.cn
  • Supported by:
    This research is partially supported by the National Natural Science Foundation of China (grants 11331012 and 11688101).

Xiaoyu Wang, Ya-xiang Yuan. STOCHASTIC TRUST-REGION METHODS WITH TRUST-REGION RADIUS DEPENDING ON PROBABILISTIC MODELS[J]. Journal of Computational Mathematics, 2022, 40(2): 294-334.

We present a stochastic model-based trust-region framework in which the trust-region radius is tied to the probabilistic models. In particular, we propose a specific algorithm, termed STRME, in which the trust-region radius depends linearly on the norm of the gradient used to define the latest model. We present complexity results for the STRME method in the nonconvex, convex and strongly convex settings, which match those of existing algorithms based on probabilistic properties. In addition, several numerical experiments are carried out to demonstrate the benefits of the proposed method over existing stochastic trust-region methods and other relevant stochastic gradient methods.
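The core idea described above can be illustrated with a minimal sketch: at each iteration a stochastic model gradient is computed, the trust-region radius is set proportionally to its norm, a step is taken within that radius, and the proportionality parameter is adjusted based on a ratio test. All function names, parameter values, and the Cauchy-step subproblem solver below are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def strme_sketch(f_batch, grad_batch, x0, n_data, mu0=1.0, eta=0.1,
                 gamma=2.0, batch=32, iters=100, seed=None):
    """Sketch of a stochastic trust-region iteration in the spirit of STRME:
    the radius is tied linearly to the latest model gradient norm,
    delta_k = mu_k * ||g_k||. Parameter names/values are illustrative."""
    rng = np.random.default_rng(seed)
    x, mu = np.asarray(x0, dtype=float), mu0
    for _ in range(iters):
        idx = rng.integers(0, n_data, size=batch)      # mini-batch sample set
        g = grad_batch(x, idx)                         # stochastic model gradient
        gn = np.linalg.norm(g)
        if gn == 0.0:
            break
        delta = mu * gn                                # radius from the current model
        s = -(delta / gn) * g                          # Cauchy step of the linear model
        pred = delta * gn                              # predicted decrease
        ared = f_batch(x, idx) - f_batch(x + s, idx)   # estimated actual decrease
        if ared >= eta * pred:                         # successful step: accept, enlarge mu
            x, mu = x + s, gamma * mu
        else:                                          # unsuccessful step: shrink mu
            mu = mu / gamma
    return x
```

On a simple quadratic with exact gradients this reduces to a step-size-adaptive gradient method, since the step length `mu * ||g||` shrinks automatically as the gradient vanishes; the probabilistic-model analysis in the paper handles the case where `g` is only accurate with some probability.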

