NUMERICAL SIMULATION OF PLANETARY FLUID DYNAMICS ON CPU-MIC HETEROGENEOUS MANY-CORE SYSTEMS
Wu Changmao1, Yang Chao1,2, Yin Liang3, Liu Fangfang1, Sun Qiao1, Li Ligang4
1. Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China;
2. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China;
3. School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China;
4. Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai 200030, China
Massively parallel computing is becoming a primary tool for the numerical simulation of planetary fluid dynamics. This paper studies the numerical simulation of planetary fluid dynamics on distributed-memory systems accelerated by Xeon Phi coprocessors. First, we start from a legacy parallel code[1-3] built on the PETSc software package, which uses a pure MPI approach and to date lacks support for multi-threaded parallelism on many-core accelerated systems, and we extend this code to multi-threaded parallelism on Xeon Phi-accelerated systems. On top of PETSc, we also present and optimize a sparse linear solver for Xeon Phi-accelerated clusters based on the restarted generalized minimal residual method (GMRES(m)). Second, we propose a novel sparse matrix-vector multiplication (SpMV) algorithm for Xeon Phi-accelerated clusters that makes aggressive use of asynchrony across offload, computation, and communication in order to overlap computation with communication. Building on this SpMV algorithm, we give a polynomial preconditioner that consists mainly of SpMV operations and therefore hides and reduces communication, whether to local memory, across the network, or over PCIe. Finally, several further optimizations are applied to the extended code. Experiments on the Tianhe-2 supercomputer show that, compared with the original code, our Xeon Phi-accelerated design delivers speedups of 6.93x on a single MIC device and 6.00x on 64 MIC devices.
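To illustrate why a polynomial preconditioner reduces to repeated SpMV operations, the following is a minimal serial sketch in Python. It is not the paper's implementation: the abstract does not say which polynomial is used, so a truncated Neumann series is assumed here, and the names `spmv`, `neumann_precondition`, `degree`, and `omega` are hypothetical. Offload, MPI communication, and threading are omitted entirely.

```python
from typing import List, Tuple

# CSR storage: (nonzero values, column index of each value, row pointers).
CSR = Tuple[List[float], List[int], List[int]]


def spmv(a: CSR, x: List[float]) -> List[float]:
    """Compute y = A x for a sparse matrix A stored in CSR format."""
    values, col_idx, row_ptr = a
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def neumann_precondition(a: CSR, r: List[float], degree: int,
                         omega: float) -> List[float]:
    """Apply z = omega * sum_{k=0..degree} (I - omega*A)^k r.

    Assumed polynomial (truncated Neumann series), evaluated with
    Horner's rule so that each step costs one SpMV plus vector updates.
    It approximates A^{-1} r when the spectral radius of (I - omega*A)
    is below 1.
    """
    z = [omega * ri for ri in r]
    for _ in range(degree):
        az = spmv(a, z)
        # z <- omega*r + (I - omega*A) z
        z = [omega * ri + zi - omega * azi
             for ri, zi, azi in zip(r, z, az)]
    return z


# Small example: the 3x3 tridiagonal matrix with 4 on the diagonal
# and -1 on the off-diagonals.
A_EXAMPLE: CSR = (
    [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0],
    [0, 1, 0, 1, 2, 1, 2],
    [0, 2, 5, 7],
)
```

Since the only matrix operation in the loop is `spmv`, any scheme that overlaps communication with computation inside SpMV would carry over to a preconditioner of this form, which is the design point the abstract makes.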
NUMERICAL SIMULATION OF PLANETARY FLUID DYNAMICS ON CPU-MIC HETEROGENEOUS MANY-CORE SYSTEMS[J]. Journal of Numerical Methods and Computer Applicat, 2017, 38(3): 197-214.
Chan K H, Li Ligang and Liao Xinhao. Modelling the core convection using finite element and finite difference methods[J]. Physics of the Earth and Planetary Interiors, 2006, 157(1-2):124-138.
Yang Chao, Zhang Yunquan and Li Ligang. Numerical Simulation of the Thermal Convection in the Earth's Outer Core[C]. 12th IEEE International Conference on High Performance Computing and Communications, HPCC. 2010, 552-555.
Yang Chao, Li Ligang and Zhang Yunquan. Development of a Scalable Solver for the Earth's Outer Core[C]. High Performance Computing and Applications, Second International Conference, HPCA 2009, Shanghai, China, August 10-12, 2009, Revised Selected Papers.
Satish Balay and Shrirang Abhyankar and Mark F. Adams and Jed Brown and Peter Brune and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William D. Gropp and Dinesh Kaushik and Matthew G. Knepley and Lois Curfman McInnes and Karl Rupp and Barry F. Smith and Stefano Zampini and Hong Zhang and Hong Zhang. PETSc Web page[OL]. http://www.mcs.anl.gov/petsc, 2016.
Zhang, Keke and Schubert, Gerald. Magnetohydrodynamics in Rapidly Rotating Spherical Systems[J]. Annual Review of Fluid Mechanics. 2000, 32(1):409-443.
Gilman P A. Dynamically consistent nonlinear dynamos driven by convection in a rotating spherical shell[J]. Astrophysical Journal Supplement. 1983, 53(53):243-268.
Zhang K K and Busse F H. Convection driven magnetohydrodynamic dynamos in rotating spherical shells[J]. Geophysical and Astrophysical Fluid Dynamics. 1989, 49(1):97-116.
Glatzmaier G A, Roberts P H. A three-dimensional convective dynamo solution with rotating and finitely conducting inner core and mantle[J]. Physics of the Earth and Planetary Interiors, 1995, 91(1-3):63-75.
Kageyama A, Sato T. Generation mechanism of a dipole field by a magnetohydrodynamic dynamo[J]. Physical Review E Statistical Physics Plasmas Fluids and Related Interdisciplinary Topics, 1997, 55(4):4617-4626.
Moritz Heimpel and Jonathan Aurnou. Turbulent convection in rapidly rotating spherical shells:A model for equatorial and high latitude jets on Jupiter and Saturn[J]. 2007, 187(2):540-557.
Liao X, Feng T, Zhang K. On the Saturation and Temporal Variation of Mean Zonal Flows:An Implication for Equatorial Jets on Giant Planets[J]. Astrophysical Journal, 2008, 666(1).
Heimpel M, Aurnou J, Wicht J. Simulation of equatorial and high-latitude jets on Jupiter in a deep convection model[J]. Nature, 2005, 438(7065):193-196.
Showman A P, Kaspi Y, Flierl G R. Scaling laws for convection and jet speeds in the giant planets[J]. Icarus, 2010, 211(2):1258-1273.
Erich Strohmaier and Horst Simon and Hans Meuer and Jack Dongarra. TOP500 Web page[OL]. http://www.top500.org/, 2016.
Chandrasekhar S. Hydrodynamic and Hydromagnetic Stability[M]. Dover edition. New York:Dover Publications, Inc., 1981.
Christensen U R, Aubert J, Cardin P, Dormy E, Gibbons S, Glatzmaier G A, Grote E, Honkura Y, Jones C, Kono M, Matsushima M, Sakuraba A, Takahashi F, Tilgner A, Wicht J, Zhang K. A numerical dynamo benchmark[J]. Physics of the Earth and Planetary Interiors. 2001, 128(1-4):25-34.
Dukowicz J K, Dvinsky A S. Approximate factorisation as a high order splitting for the incompressible flow equations[J]. Journal of Computational Physics, 1992, 102(2):336-347.
Saad Y, Schultz M H. GMRES:a generalized minimal residual algorithm for solving nonsymmetric linear systems[J]. SIAM Journal on Scientific and Statistical Computing, 1986, 7(3):856-869.
Yousef Saad. Iterative Methods for Sparse Linear Systems[M]. Society for Industrial and Applied Mathematics, 2003.
Philippe Tillet and Karl Rupp and Siegfried Selberherr and Chin-Teng Lin. Towards Performance-Portable, Scalable, and Convenient Linear Algebra[C]. Presented as part of the 5th USENIX Workshop on Hot Topics in Parallelism, 2013, Berkeley, CA.
PARALUTION-the library for iterative sparse methods on CPU and GPU[OL]. http://www.paralution.com/, 2016.
Minden, Victor and Smith, Barry and Knepley, Matthew G. Preliminary Implementation of PETSc Using GPUs[M]. Springer Berlin Heidelberg, 2013.
Baker C G and Heroux, M A. Tpetra, and the Use of Generic Programming in Scientific Computing[J]. Sci. Program, 2012, 20(2):115-128.
Yamazaki I, Anzt H, Tomov S, et al. Improving the Performance of CA-GMRES on Multicores with Multiple GPUs[C]. IEEE International Parallel and Distributed Processing Symposium, IPDPS. 2014, 382-391.
Cramer T, Schmidl D, Klemm M, et al. OpenMP Programming on Intel Xeon Phi Coprocessors:An Early Performance Comparison[J]. 2012.
Liu X, Smelyanskiy M, Chow E, et al. Efficient sparse matrix-vector multiplication on x86-based many-core processors[C]. International ACM Conference on International Conference on Supercomputing. 2013:273-282.
Liu W, Vinter B. CSR5:An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication[C]. ACM on International Conference on Supercomputing. ACM, 2015, 339-350.
Yan S, Li C, Zhang Y, et al. yaSpMV:Yet Another SpMV Framework on GPUs[C]. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP, 2014, 107-118.