Introduction

The work below describes a process that allows libxc to run on GPUs, using the natural parallelization over gridpoints. It was done with libxc 1.2.0, but we later also did it (with some relatively minor modifications) for libxc 2.0.0. We obtained speedups of 30x-100x (cpu+gpu vs. cpu).

We did this as an introductory project to learn how to port software to GPUs. The documentation below is "crude" because there is currently not much demand for libxc on GPUs. But perhaps in the future it may be useful for people.

Building libxc 1.2.0 adding additional GPU-compatible functionals

Running a GPAW Calculation with GPU libxc Functionals

For Developers: Porting libxc Functional to GPU

This requires that the functional use the common "work.c" mechanism. Some functionals (e.g. tpss_c seem to not do this yet).

For Developers: Porting General "Work" routine to GPU

This need to be done once for LDA/GGA/MGGA x/c for the common "work.c" file (total of 6 of these)

libxc on GPUs: Issues

CUDA Problems

(cuda-gdb)
200	/tmp/tmpxft_00001dc8_00000000-7_gga_c_pbe.cpp3.i: No such file or directory.
	in /tmp/tmpxft_00001dc8_00000000-7_gga_c_pbe.cpp3.i



(cuda-gdb)
func (xs=0xdddddddddddddddd, p=warning: Variable is not live at this point. Value is undetermined.
0x0, order=warning: Variable is not live at this point. Value is undetermined.
0, rs=warning: Variable is not live at this point. Value is undetermined.
0, zeta=-1.4568159901474629e+144, xt=warning: Variable is not live at this point. Value is undetermined.
0, f=warning: Variable is not live at this point. Value is undetermined.
0x0, dfdrs=0xdddddddddddddddd, dfdz=0xdddddddddddddddd, dfdxt=0xdddddddddddddddd, dfdxs=warning: Variable is not live at this point. Value is undetermined.
0x0, d2fdrs2=0xdddddddddddddddd, d2fdrsz=0xdddddddddddddddd, d2fdrsxt=0xdddddddddddddddd, d2fdrsxs=0xdddddddddddddddd, d2fdz2=0xdddddddddddddddd, d2fdzxt=0xdddddddddddddddd, d2fdzxs=0xdddddddddddddddd, d2fdxt2=0xdddddddddddddddd, d2fdxtxs=0xdddddddddddddddd, d2fdxs2=0xdddddddd) at gga_c_pbe.cu:271
271	}