Programming CUDA and OpenCL: A Case Study Using Modern C++ Libraries

2013, SIAM Journal on Scientific Computing

https://doi.org/10.1137/120903683

Abstract

We present a comparison of several modern C++ libraries providing high-level interfaces for programming multi- and many-core architectures on top of CUDA or OpenCL. The comparison focuses on the solution of ordinary differential equations and is based on odeint, a framework for solving systems of ordinary differential equations. Odeint is designed in a very flexible way and may be easily adapted for effective use of libraries such as MTL4, VexCL, or ViennaCL, using CUDA or OpenCL technologies. We found that CUDA and OpenCL work equally well for large problems, while OpenCL has higher overhead for smaller problems. Furthermore, we show that modern high-level libraries allow the computational resources of many-core GPUs or multi-core CPUs to be used effectively without much knowledge of the underlying technologies.
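The kind of flexibility described in the abstract can be illustrated with a minimal sketch in standard C++. It is not odeint's actual API: the names `host_ops`, `rk4_stepper`, and `integrate_decay` are hypothetical and chosen only for this example. The idea shown is that a time stepper written against a small set of vector operations can, in principle, be reused with a different vector backend (for instance a GPU vector type) by swapping the operations policy.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical backend policy for std::vector<double>. A GPU-backed vector
// library would provide the same operation for its own vector type.
struct host_ops {
    // out = a + s * b, element-wise (out may alias a).
    static void axpy(std::vector<double>& out, const std::vector<double>& a,
                     double s, const std::vector<double>& b) {
        assert(a.size() == b.size());
        out.resize(a.size());
        for (std::size_t i = 0; i < a.size(); ++i) {
            out[i] = a[i] + s * b[i];
        }
    }
};

// Classical fourth-order Runge-Kutta stepper that touches the state only
// through the Ops policy, so it does not depend on where the data lives.
template <class State, class Ops>
class rk4_stepper {
public:
    // sys(x, dxdt, t) must write the right-hand side f(x, t) into dxdt.
    template <class System>
    void do_step(System&& sys, State& x, double t, double dt) {
        sys(x, k1_, t);
        Ops::axpy(tmp_, x, dt / 2, k1_);
        sys(tmp_, k2_, t + dt / 2);
        Ops::axpy(tmp_, x, dt / 2, k2_);
        sys(tmp_, k3_, t + dt / 2);
        Ops::axpy(tmp_, x, dt, k3_);
        sys(tmp_, k4_, t + dt);
        // x += dt/6 * (k1 + 2*k2 + 2*k3 + k4)
        Ops::axpy(x, x, dt / 6, k1_);
        Ops::axpy(x, x, dt / 3, k2_);
        Ops::axpy(x, x, dt / 3, k3_);
        Ops::axpy(x, x, dt / 6, k4_);
    }

private:
    State k1_, k2_, k3_, k4_, tmp_;
};

// Illustrative driver: integrates dx/dt = -x from x(0) = 1 up to t_end
// using a fixed number of steps and returns the final value.
double integrate_decay(double t_end, int steps) {
    std::vector<double> x{1.0};
    rk4_stepper<std::vector<double>, host_ops> stepper;
    auto decay = [](const std::vector<double>& s, std::vector<double>& dsdt,
                    double /*t*/) {
        dsdt.resize(s.size());
        for (std::size_t i = 0; i < s.size(); ++i) dsdt[i] = -s[i];
    };
    const double dt = t_end / steps;
    for (int n = 0; n < steps; ++n) {
        stepper.do_step(decay, x, n * dt, dt);
    }
    return x[0];
}
```

Calling `integrate_decay(1.0, 100)` should return a value close to `std::exp(-1.0)`. Supporting another vector type in this sketch would mean writing a second policy with the same `axpy` signature; the stepper itself would stay unchanged.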
