...

  • constant memory
  • texture memory
  • optimization tricks: pre-fetch etc.
  • what does a queued warp do? (does it pre-fetch the memory?)
  • reducing number of registers in kernel (does compiler typically do this optimally?)
  • how to tell with nvvp whether we're memory- or flops-limited
  • understanding the nvvp columns
  • best way to associate right GPU with right core (e.g. "taskset", "numactl")
  • ask about zher speedup numbers: for 4k x 4k, why does gemm improve by 30x but zher by only 6x?
  • using automake with CUDA and C in one library?
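
One possible angle on the gemm vs. zher question in the list above is arithmetic intensity (flops per byte of memory traffic). The sketch below is a rough roofline-style estimate, not a measurement. It assumes double-precision complex data, counts one complex multiply-add as 8 real flops, assumes each matrix element is moved to or from memory exactly once per read or write, and ignores the traffic for the vector x. The function names are hypothetical and are not part of any library.

```python
# Rough arithmetic-intensity estimate. All counts here are assumptions
# for illustration, not measured values.
BYTES_PER_COMPLEX = 16  # double-precision complex


def gemm_intensity(n):
    """Flops per byte for an n x n complex multiply C = A*B + C."""
    flops = 8 * n**3                          # n^3 complex multiply-adds
    traffic = 4 * n**2 * BYTES_PER_COMPLEX    # read A, B, C; write C
    return flops / traffic


def her_intensity(n):
    """Flops per byte for a Hermitian rank-1 update A = alpha*x*x^H + A."""
    flops = 8 * (n**2 / 2)                        # only one triangle is updated
    traffic = 2 * (n**2 / 2) * BYTES_PER_COMPLEX  # read and write that triangle
    return flops / traffic


n = 4096
print(gemm_intensity(n), her_intensity(n))
```

Under these assumptions the gemm intensity grows linearly with n while the rank-1 update stays constant, which would suggest that zher is limited by memory bandwidth rather than by flops. Whether that accounts for the observed numbers is still a question to ask.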

...

1/

...

8/

...

2013
  • libxc on gpu (lin)
    • work on automake stuff
    • get the cleaned-up ifdef version from Miguel
  • digest RPA timing measurements (lin)
  • AJ and cpo start meeting once per week (Friday) to work/strategize on convergence
  • paper (jun)
  • redo timing measurements (jun/lin)
  • understand new GPU box memory slowness (cpo)
12/18/2012
  • libxc on gpu (lin)
    • use common work file for CPU/GPU
  • digest RPA timing measurements (lin)
  • paper (jun)
  • redo timing measurements (jun)
  • understand timing measurements more fully (jun)
  • dacapo density mixing vs. GPAW (cpo)

...