Many problems in multi-agent systems, due to communication restrictions, need to be solved in a decentralized manner. There is no data fusion center, so we must rely on shortdistance communication between adjacent nodes to achieve the goal of the whole network. Compared with traditional (centralized) computing, decentralized computing is more suitable for distributed data, less subject to communication and computing bottlenecks, and easier to realize in some applications.
This article overviews the formulations and methods of decentralized consensus optimization. The objective of consensus optimization is that all the variables of the nodes converge to the same vector that minimizes the sum of their objective functions. This problem is solved by calculations at each node and data exchanges between adjacent nodes. Naive decentralized algorithms are much slower than their centralized counterparts. In order to make up for this gap, we review some recent methods through a unified framework of operator splitting.
F. S. Cattivelli, C. G. Lopes, and A. H. Sayed. A diffusion RLS scheme for distributed estimation over adaptive networks. In 2007 IEEE 8th Workshop on Signal Processing Advances in Wireless Communications. 2007, 1-5.
F. S. Cattivelli and A. H. Sayed. Diffusion LMS strategies for distributed estimation. IEEE Transactions on Signal Processing[J]. 2010, 58(3):1035-1048.
Kun Yuan, Qing Ling, and Wotao Yin. On the convergence of decentralized gradient descent. SIAM Journal on Optimization[J]. 2016, 26(3):1835-1854.
Annie I.-An Chen. Fast Distributed First-Order Methods. Thesis, Massachusetts Institute of Technology, 2012.
Dusan Jakovetic, Joao Xavier, and Jose M. F. Moura. Fast distributed gradient methods. IEEE Transactions on Automatic Control[J]. 2014, 59(5):1131-1146.
Jinshan Zeng and Wotao Yin. On nonconvex decentralized gradient descent. IEEE Transactions on Signal Processing[J]. 2018, 66(11):2834-2848.
Ernest K. Ryu and Stephen Boyd. Primer on monotone operator methods. Technical report, Stanford University, 2015.
Ernest Ryu, Robert Hannah, and Wotao Yin. Scaled relative graph:Nonexpansive operators via 2D Euclidean geometry. arXiv:1902.09788, 2019.
Heinz H. Bauschke and Patrick L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer New York, New York, NY, 2011.
David Kempe, Alin Dobra, and Johannes Gehrke. Gossip-based computation of aggregate information. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, FOCS'03, pages 482-, Washington, DC, USA, 2003. IEEE Computer Society.
Angelia Nedic and Alex Olshevsky. Distributed optimization over time-varying directed graphs. IEEE Transactions on Automatic Control[J]. 2015, 60(3):601-615.
R. Glowinski and A. Marroco. Sur l'approximation, paréléments finis d'ordre un, et la résolution, par pénalisation-dualité d'une classe de problèmes de Dirichlet non linéaires. ESAIM:Mathematical Modelling and Numerical Analysis[J]. 1975, 9(R2):41-76.
P. L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis[J]. 1979, 16(6):964-979.
Neal Parikh and Stephen Boyd. Proximal algorithms. Foundations and Trends in Optimization[J]. 2014, 1(3):127-239.
Hao-Jun Michael Shi, Shengyinying Tu, Yangyang Xu, and Wotao Yin. A primer on coordinate descent algorithms. UCLA CAM Report 16-67, 2016.
Giovanni Chierchia, Emilie Chouzenoux, Patrick L. Combettes, and Jean-Christophe Pesquet. The proximity operator repository (user's guide). Technical Report version 0.1, proximity-operator.net, 2017.
Ronald E. Bruck Jr. An iterative solution of a variational inequality for certain monotone operators in Hilbert space. Bulletin of the American Mathematical Society[J]. 1975, 81(5):890-893.
Gregory B. Passty. Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. Journal of Mathematical Analysis and Applications[J]. 1979, 72(2):383-390.
Damek Davis and Wotao Yin. Convergence rate analysis of several splitting schemes. In Roland Glowinski, Stanley Osher, and Wotao Yin, editors, Splitting Methods in Communication, Imaging, Science and Engineering, Chapter 4, pages 115-163. Springer, 2016.
Damek Davis and Wotao Yin. Faster convergence rates of relaxed Peaceman-Rachford and ADMM under regularity assumptions. Mathematics of Operations Research[J]. 2017, 42(3):783-805.
Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision[J]. 2011, 40(1):120-145.
Laurent Condat. A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. Journal of Optimization Theory and Applications[J]. 2013, 158(2):460-479.
B. C. Vǔ. A splitting algorithm for dual monotone inclusions involving cocoercive operators. Advances in Computational Mathematics[J]. 2011, 38(3):667-681.
Angelia Nedic and Asuman Ozdaglar. Distributed subgradient methods for multi-agent optimization. IEEE Transactions on Automatic Control[J]. 2009, 54(1):48-61.
Damek Davis and Wotao Yin. A three-operator splitting scheme and its optimization applications. Set-Valued and Variational Analysis[J]. 2017, 25(4):829-858.
Dimitri P. Bertsekas and John N. Tsitsiklis. Parallel and Distributed Computation:Numerical Methods. Prentice Hall, Englewood Cliffs, N.J, 1989.
Gonzalo Mateos, Juan Andrés Bazerque, and Georgios B. Giannakis. Distributed sparse linear regression. IEEE Transactions on Signal Processing[J]. 2010, 58(10):5262-5276.
Ioannis D. Schizas, Alejandro Ribeiro, and Georgios B. Giannakis. Consensus in ad hoc WSNs with noisy links-part I:Distributed estimation of deterministic signals. IEEE Transactions on Signal Processing[J]. 2008, 56(1):350-364.
Qing Ling and Zhi Tian. Decentralized sparse signal recovery for compressive sleeping wireless sensor networks. IEEE Transactions on Signal Processing[J]. 2010, 58(7):3816-3827.
Wei Shi, Qing Ling, Kun Yuan, Gang Wu, and Wotao Yin. On the linear convergence of the ADMM in decentralized consensus optimization. IEEE Transactions on Signal Processing[J]. 2014. 62(7):1750-1761.
Tsung-Hui Chang, Mingyi Hong, and Xiangfeng Wang. Multi-agent distributed optimization via inexact consensus ADMM. IEEE Transactions on Signal Processing[J]. 2015, 63(2):482-497.
E. Wei and A. Ozdaglar. On the o(1/k) convergence of asynchronous distributed alternating direction method of multipliers. In 2013 IEEE Global Conference on Signal and Information Processing, 2013, 551-554.
Wei Shi, Qing Ling, Gang Wu, and Wotao Yin. EXTRA:An exact first-order algorithm for decentralized consensus optimization. SIAM Journal on Optimization[J]. 2015, 25(2):944-966.
Wei Shi, Qing Ling, Gang Wu, and Wotao Yin. A proximal gradient algorithm for decentralized composite optimization. IEEE Transactions on Signal Processing[J]. 2015, 63(22):6013-6023.
Håkan Terelius, Ufuk Topcu, and Richard M. Murray. Decentralized multi-agent optimization via dual decomposition. IFAC Proceedings Volumes[J]. 2011, 44(1):11245-11251.
Minghui Zhu and Sonia Martínez. Discrete-time dynamic average consensus. Automatica[J]. 2010, 46(2):322-329.
A. Nedi?, A. Olshevsky, W. Shi, and C. A. Uribe. Geometrically convergent distributed optimization with uncoordinated step-sizes. In 2017 American Control Conference (ACC), 2017, 3950-3955.
Jinming Xu, Shanying Zhu, Yeng Chai Soh, and Lihua Xie. Augmented distributed gradient methods for multi-agent optimization under uncoordinated constant stepsizes. In 201554th IEEE Conference on Decision and Control (CDC), 2055-2060, Osaka, 2015. IEEE.
Zhi Li, Wei Shi, and Ming Yan. A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates. IEEE Transactions on Signal Processing, early access, 2019.
G. Qu and N. Li. Harnessing smoothness to accelerate distributed optimization. IEEE Transactions on Control of Network Systems[J]. 2018, 5(3):1245-1260.
Angelia Nedi?, Alex Olshevsky, and Wei Shi. Achieving geometric convergence for distributed optimization over time-varying graphs. SIAM Journal on Optimization[J] 2017, 27(4):2597-2633.
Kun Yuan, Bicheng Ying, Xiaochuan Zhao, and Ali H. Sayed. Exact diffusion for distributed optimization and learning-Part I:Algorithm development. IEEE Transactions on Signal Processing[J]. 2019, 67(3):708-723.
Kun Yuan, Bicheng Ying, Xiaochuan Zhao, and Ali H. Sayed. Exact diffusion for distributed optimization and learning-Part Ⅱ:Convergence analysis. IEEE Transactions on Signal Processing[J]. 2019, 67(3):724-739.
Kevin Scaman, Francis Bach, Sebastien Bubeck, Laurent Massoulié, and Yin Tat Lee. Optimal algorithms for non-smooth distributed optimization in networks. In Advances in Neural Information Processing Systems 31, pages 2740-2749. Curran Associates, Inc., 2018.
Guannan Qu and Na Li. Accelerated distributed Nesterov gradient descent. arXiv:1705.07176, 2017.
César A. Uribe, Soomin Lee, Alexander Gasnikov, and Angelia Nedi?. A dual approach for optimal algorithms in distributed optimization over networks. arXiv:1809.00710, 2018.
Huan Li, Cong Fang, Wotao Yin, and Zhouchen Lin. A sharp convergence rate analysis for distributed accelerated gradient methods. arXiv:1810.01053, 2018.
Kevin Seaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, and Laurent Massoulié. Optimal algorithms for smooth and strongly convex distributed optimization in networks. In Proceedings of the 34th International Conference on Machine Learning 70, pages 3027-3036, 2017.
Tianyu Wu, Kun Yuan, Qing Ling, Wotao Yin, and Ali H. Sayed. Decentralized consensus optimization with asynchrony and delays. IEEE Transactions on Signal and Information Processing over Networks[J]. 2018, 4(2):293-307.
Xiangru Lian, Ce Zhang, Huan Zhang, Cho-Jui Hsieh, Wei Zhang, and Ji Liu. Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 5330-5340. Curran Associates, Inc., 2017.
Xiangru Lian, Wei Zhang, Ce Zhang, and Ji Liu. Asynchronous decentralized parallel stochastic gradient descent. In International Conference on Machine Learning, pages 3043-3052, 2018.