...

  • we were not 16-byte aligning cuDoubleComplex variables. The error showed up "much later", in cuGetVector (error 11) and cudaDeviceSynchronize (error 4). We did a binary search to find the source of the error. How do we program error checks so that run-time errors show up "immediately"? cuda_safe_call?
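  • if we have 1 number used by many threads, should it go into shared memory or constant memory? We would think constant memory is the right answer, since its cache broadcasts a single address to all threads of a warp. We expected shared memory to give a bank conflict, but a read of the same word by every thread in a warp is also broadcast there, so this needs checking.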
  • cuda-gdb generates output for every kernel launch. Does this slow down the code dramatically? It becomes unusable.
  • understand the crash in rpa-gpu-expt when running rpa_only_Na_cuda.py under nvprof
  • how does CUDA deal with memory fragmentation?
  • why does nvvp require more runs than nvprof to collect the same data?
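The alignment problem in the first item can be caught on the host before any data reaches the GPU. A minimal sketch, assuming a stand-in struct `Complex16` with the same layout as cuDoubleComplex (two doubles, 16-byte aligned); the names `Complex16`, `is_aligned16`, `align_up16` and `as_complex16` are hypothetical and not part of our code or of CUDA:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in with the layout of cuDoubleComplex:
// two doubles, 16-byte aligned.
struct alignas(16) Complex16 {
    double re;
    double im;
};

static_assert(sizeof(Complex16) == 16, "two packed doubles");
static_assert(alignof(Complex16) == 16, "16-byte alignment requested");

// True if p sits on a 16-byte boundary.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Round a byte offset up to the next multiple of 16, e.g. when packing
// several arrays into one buffer.
inline std::size_t align_up16(std::size_t offset) {
    return (offset + 15) & ~static_cast<std::size_t>(15);
}

// Reinterpret raw memory as complex data, failing at the call site
// (in debug builds) instead of "much later" if it is misaligned.
inline const Complex16* as_complex16(const void* p) {
    assert(is_aligned16(p));
    return static_cast<const Complex16*>(p);
}
```

Calling `as_complex16` wherever a raw buffer is turned into complex data would move the failure to the place where the misaligned pointer is created.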
3/5/2013
  • 2 slides for Samuli
  • run profiling on RPA (lin)
  • memory leak (perhaps related to crashes)
  • adding error check functions
  • base.py get_phi_agp kernel
  • rpa (jun)
    • manuscript
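For the error check functions item, one common pattern is a macro that wraps every API call and reports the failing expression with its file and line. The sketch below is plain C++ so that it runs without a GPU: `Status`, `check_status`, `SAFE_CALL`, `fake_copy_ok` and `fake_copy_bad` are hypothetical names. In real CUDA code the status type would be `cudaError_t`, the message would come from `cudaGetErrorString`, and since a kernel launch returns no status, one would check `cudaGetLastError()` and then `cudaDeviceSynchronize()` right after the launch, at least in debug builds.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical status type standing in for cudaError_t (0 means success).
using Status = int;

// Throw with the failing expression and its source location.
inline void check_status(Status s, const char* expr,
                         const char* file, int line) {
    if (s != 0) {
        std::ostringstream msg;
        msg << file << ":" << line << ": " << expr
            << " failed with error " << s;
        throw std::runtime_error(msg.str());
    }
}

// Wrap every call so that a failure is reported where it happens.
#define SAFE_CALL(expr) check_status((expr), #expr, __FILE__, __LINE__)

// Hypothetical API calls used only to exercise the macro.
inline Status fake_copy_ok() { return 0; }
inline Status fake_copy_bad() { return 11; }
```

With this in place, `SAFE_CALL(fake_copy_bad());` throws immediately with a message naming the call and error 11, rather than letting the error surface in a later, unrelated call.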

...