
Lattice Boltzmann Method on GPUs: A Comparison Between OpenACC and CUDA

Fredrik Robertsén, Keijo Mattila, Jan Westerholm, Lattice Boltzmann Method on GPUs: A Comparison Between OpenACC and CUDA. TUCS Technical Reports 1191, TUCS, 2018.


As general-purpose graphics processors become more widely available in high-performance computing, the methods used to program GPUs are growing in importance. Languages such as CUDA offer great control, but at the expense of ease of programming. In this paper we discuss the use and the general techniques of the directive-based approach of OpenACC in creating a GPU-accelerated lattice Boltzmann code for fluid simulations. Above all, we assess the relevance of OpenACC in this context by comparing the computational performance of identical OpenACC and CUDA implementations. Our first finding is that in single-GPU computing the CUDA code outperforms the corresponding OpenACC code, but the difference is not significant. The performance of the CUDA code was improved by tuning configuration parameters, such as the number of threads per block, while the OpenACC code appeared unresponsive to such measures; however, the observed performances of the OpenACC and CUDA implementations were within 7% of each other in all cases. Our second major finding is that in multi-GPU computing the CUDA code was up to 22% faster. After further investigation, this more substantial difference was attributed partly to additional kernel-launch overhead with the OpenACC code and partly to the lower bandwidth achieved for PCI-e transfers between the host and the GPU with the OpenACC code than with the CUDA code. The observed performances, and the main causes of the relatively small performance losses, allow us to conclude that OpenACC is a viable option for implementing scientific computing on GPUs.

BibTeX entry:

  title = {Lattice Boltzmann Method on GPUs: A Comparison Between OpenACC and CUDA},
  author = {Robertsén, Fredrik and Mattila, Keijo and Westerholm, Jan},
  number = {1191},
  series = {TUCS Technical Reports},
  publisher = {TUCS},
  year = {2018},

Belongs to TUCS Research Unit(s): Embedded Systems Laboratory (ESLAB)
