...

  • We were not 16-byte aligning cuDoubleComplex variables. The error showed up "much later", in cuGetVector (error 11) and cudaDeviceSynchronize (error 4), and we had to do a binary search to find its source. How do we program so that errors show up "immediately"? cuda_safe_call?
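  • If one number is used by many threads, should it go into shared memory or constant memory? We think constant memory is the right answer, since it is cached and broadcast to threads that read the same address. We suspected shared memory would give a bank conflict, although reads of the same shared-memory address within a warp are normally broadcast rather than serialized.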
  • cuda-gdb generates output for kernel launches. Does this slow the code down dramatically? It becomes unusable.
  • Understand the crash in rpa-gpu-expt when running rpa_only_Na_cuda.py under nvprof.
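
A minimal sketch of the two ideas raised above (failing "immediately" and keeping a single shared number in constant memory), not the project's actual code. The macro names `CUDA_SAFE_CALL` and `CUDA_CHECK_KERNEL`, the helper `is_aligned16`, the symbol `c_alpha`, and the kernel `scale` are all hypothetical names chosen for illustration. Only standard CUDA runtime calls are used.

```cuda
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cuComplex.h>

// Hypothetical helper: report file/line and abort as soon as a runtime call fails.
#define CUDA_SAFE_CALL(call)                                              \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__, __LINE__, \
                    #call, cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Hypothetical helper: catch launch errors (cudaGetLastError) and execution
// errors (cudaDeviceSynchronize) right after the kernel that caused them.
// The synchronize serializes execution, so this is meant for debug builds.
#define CUDA_CHECK_KERNEL()                      \
    do {                                         \
        CUDA_SAFE_CALL(cudaGetLastError());      \
        CUDA_SAFE_CALL(cudaDeviceSynchronize()); \
    } while (0)

// Hypothetical helper: true if p satisfies the 16-byte alignment that
// cuDoubleComplex (a double2) requires.
static bool is_aligned16(const void *p)
{
    return (reinterpret_cast<uintptr_t>(p) % 16) == 0;
}

// One number read by every thread: placed in constant memory, which is
// cached and broadcast when all threads read the same address.
__constant__ cuDoubleComplex c_alpha;

__global__ void scale(cuDoubleComplex *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = cuCmul(c_alpha, x[i]);
}

int main()
{
    const int n = 1024;
    const cuDoubleComplex h_alpha = make_cuDoubleComplex(2.0, 0.0);
    cuDoubleComplex *d_x = NULL;

    CUDA_SAFE_CALL(cudaMalloc((void **)&d_x, n * sizeof(cuDoubleComplex)));
    assert(is_aligned16(d_x));  // check alignment where the pointer is created
    CUDA_SAFE_CALL(cudaMemset(d_x, 0, n * sizeof(cuDoubleComplex)));
    CUDA_SAFE_CALL(cudaMemcpyToSymbol(c_alpha, &h_alpha, sizeof(h_alpha)));

    scale<<<(n + 255) / 256, 256>>>(d_x, n);
    CUDA_CHECK_KERNEL();

    CUDA_SAFE_CALL(cudaFree(d_x));
    printf("ok\n");
    return 0;
}
```

Wrapping every runtime call and checking after every launch means a failure is reported at the call that caused it instead of at a later, unrelated synchronization point. The alignment assert is most useful where a cuDoubleComplex pointer is carved out of a larger buffer at some byte offset, since `cudaMalloc` itself returns suitably aligned memory.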

...

3/5/2013
  • 2 slides for Samuli
  • run profiling on RPA (lin)
  • memory leak (perhaps related to crashes)
  • adding error check functions
  • base.py get_phi_agp kernel
  • rpa (jun)
    • manuscript
2/26/2013
  • 2 slides for Samuli
  • more structs for RPA (lin)
  • run nvvp on RPA (lin)
  • think about EXX
  • rpa (jun)
    • manuscript
2/19/2013
  • more structs for RPA (lin)
  • commit code
  • rpa (jun)
    • keep an eye on crashes
  • EXX on GPUs
    • fix MPI stuff in EXX
    • understand why it doesn't speed up
    • think about whether or not we tackle EXX yet

...