You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 14 Next »

big steps:

  1. standalone main calls "work" kernel we call with GPU pointers (completed)
  2. standalone main calls exc_vxc (RPBE only) interface with GPU pointers
  3. gpaw CPU version uses exc_vxc on GPU (we give exc_vxc GPU pointers)
  4. gpaw GPU version uses exc_vxc on GPU

plan for step(2):

  • starting to work in libxc source
  • start with work_gga_x.c. make work_gga_x a "shell".
  • try using nvcc for everything

questions:

  • we may run out of memory when putting more stuff on GPU
  • can gga.c call a "kernel pointer" or does work_gga_x become a "shell" that calls kernel?

to make an XC(gga_type) pointer "p" on the device:

  • need the size of params
  • swap out the info/params pointers for device pointers
  • p gpu-initialization happens at func_init time

to make "work" functions into a kernel:

  • need a _global_ in the work
  • need a _device_ in the rpbe

Porting libxc: Lessons Learned

  • use nvcc for all mixed host/gpu code
  • originally had problems linking with nvcc. maybe messed up by "-x cu"? used gcc instead. but later nvcc linked OK for top level executable.
  • nvcc does C++ mangling. need extern "C" in some cases.

Things we need to deal with:

  • can't call external _device_ function with nvcc-compiled code? (workaround: use .cuh files with statically declared functions)
  • what to do about k functionals? (multiple includes of work_gga_x.c)
  • kludged local "static/global" variables (statics inside kernel don't work?)
  • make the copying of "p" beautiful
    • size of params problem (might be other similar problems here)
    • where do we do the copy?
  • gpu calling back to host code (and potentially vice-versa)
  • p sometimes needs to contain GPU function pointers and perhaps associated data (e.g. gga calling lda)
  • stride problem for spin indices
  • memset in gga.c etc.

Process for RPBE:

  • use nvcc for everything (./configure CC=nvcc CFLAGS="-arch=sm_20")
  • rename gga_x_rpbe.c to .cu, also in src/Makefile
  • added _device_ to gga_x_rpbe.c, and "extern C" to "info" struct
  • included work_gga_x.cu in the gga_x_rpbe.cu with _global_
  • removed the memset in gga.c

CUDA Problems

  • Unique symbol problem, including the same .cub in multiple files can't set breakpoint. Need cuda5 for linkers?
  • Slow startup, 30 sec wait.
  • sometimes, we can't step into a function (if it is a file included in a ".cuh"? (multi-layer include))
(cuda-gdb) 
200	/tmp/tmpxft_00001dc8_00000000-7_gga_c_pbe.cpp3.i: No such file or directory.
	in /tmp/tmpxft_00001dc8_00000000-7_gga_c_pbe.cpp3.i



(cuda-gdb) 
func (xs=0xdddddddddddddddd, p=warning: Variable is not live at this point. Value is undetermined.
0x0, order=warning: Variable is not live at this point. Value is undetermined.
0, rs=warning: Variable is not live at this point. Value is undetermined.
0, zeta=-1.4568159901474629e+144, xt=warning: Variable is not live at this point. Value is undetermined.
0, f=warning: Variable is not live at this point. Value is undetermined.
0x0, dfdrs=0xdddddddddddddddd, dfdz=0xdddddddddddddddd, dfdxt=0xdddddddddddddddd, dfdxs=warning: Variable is not live at this point. Value is undetermined.
0x0, d2fdrs2=0xdddddddddddddddd, d2fdrsz=0xdddddddddddddddd, d2fdrsxt=0xdddddddddddddddd, d2fdrsxs=0xdddddddddddddddd, d2fdz2=0xdddddddddddddddd, d2fdzxt=0xdddddddddddddddd, d2fdzxs=0xdddddddddddddddd, d2fdxt2=0xdddddddddddddddd, d2fdxtxs=0xdddddddddddddddd, d2fdxs2=0xdddddddd) at gga_c_pbe.cu:271
271	}

Recipe for Moving Functional to GPU

Recipe for Moving General "Work" routine to GPU

  • No labels