
...

constant memory
texture memory
optimization tricks: pre-fetch etc.
what does a queued warp do? (does it pre-fetch the memory)
reducing number of registers in kernel (does compiler typically do this optimally?)
how to learn from nvvp whether we're memory- or flops-limited
understanding the nvvp columns
ask about zher speedup numbers

...

understand nvidia zher speedup plot (including cuda5) (jun/cpo)
libxc on gpu (lin)
- use CUDA5
- use common functional file for CPU/GPU
- use common work file for CPU/GPU
- read samuli old talk
- run 3x4x3 pt system
digest RPA timing measurements (lin)
multi-alpha zher at a lower priority
think about moving lambda calc to GPU (jun)
reduce registers? prefetch?
explore the parameter space: tile-size
paper (jun)
try calling dacapo density mixing from GPAW (cpo)
install GPAW on Keeneland (cpo)
make sure all libxc self-tests run
move suncatgpu01 to CUDA5 (cpo)
can the alphas for the nt_G really be used for the D's?

understand nvidia zher speedup plot (jun/cpo)
libxc on gpu (lin)
- use CUDA5
- use common functional file for CPU/GPU
- use common work file for CPU/GPU
- read samuli old talk
- run 3x4x3 pt system
RPA timing measurements (lin)
multi-alpha zher at a lower priority (jun)
- reduce registers? prefetch?
- explore the parameter space: tile-size
try multiple surfaces with jacapo/gpaw-pw (aj)
paper (jun)
try calling dacapo density mixing from GPAW (cpo)
install GPAW on Keeneland (cpo)
make sure all libxc self-tests run
move suncatgpu01 to CUDA5 (cpo)

...
