...

  • constant memory
  • texture memory
  • optimization tricks: pre-fetch etc.
  • what does a queued warp do? (does it pre-fetch the memory?)
  • reducing number of registers in kernel (does compiler typically do this optimally?)
  • how to tell with nvvp whether we're memory- or flops-limited
  • understanding the nvvp columns
  • ask about zher speedup numbers
12/

...

11/2012
  • understand nvidia zher speedup plot (including cuda5) (jun/cpo)
  • libxc on gpu (lin)
    • use CUDA5
    • use common functional file for CPU/GPU
    • use common work file for CPU/GPU
    • read samuli old talk
    • run 3x4x3 pt system
  • digest RPA timing measurements (lin)
  • multi-alpha zher at a lower priority
  • think about moving lambda calc to GPU (jun)
  • reduce registers? prefetch?
  • explore the parameter space: tile-size
  • try multiple surfaces with jacapo/gpaw-pw (aj)
  • paper (jun)
  • try calling dacapo density mixing from GPAW (cpo)
  • install GPAW on Keeneland (cpo)
  • make sure all libxc self-tests run
  • move suncatgpu01 to CUDA5 (cpo)
  • can the alphas for the nt_G really be used for the D's?
12/4/2012
  • understand nvidia zher speedup plot (jun/cpo)
  • libxc on gpu (lin)
    • use CUDA5
    • use common functional file for CPU/GPU
    • use common work file for CPU/GPU
    • read samuli old talk
    • run 3x4x3 pt system
  • RPA timing measurements (lin)
  • multi-alpha zher at a lower priority (jun)
    • reduce registers? prefetch?
    • explore the parameter space: tile-size
  • try multiple surfaces with jacapo/gpaw-pw (aj)
  • paper (jun)
  • try calling dacapo density mixing from GPAW (cpo)
  • install GPAW on Keeneland (cpo)
  • make sure all libxc self-tests run
  • move suncatgpu01 to CUDA5 (cpo)

...