...
- understand nvidia zgemm speedup plot (jun/cpo)
- ANSWER: without threads: x29 faster on GPU. With 6 OpenMP threads get x5, which agrees with nvidia
- understand why ZHER is x6 better on GPU but we see x24 with RPA (will put device sync in code) (jun/cpo)
- does cuda5 improve ZHER? (jun/cpo) ANSWER: no improvement
- libxc on gpu (lin)
- use common work file for CPU/GPU
- digest RPA timing measurements (lin)
- think about moving lambda calc to GPU (jun)
- try multiple surfaces with jacapo/gpaw-pw (aj)
- paper (jun)
- try calling dacapo density mixing from GPAW (cpo)
- install GPAW on Keeneland (cpo)
- make sure all libxc self-tests run
- can the alphas for the nt_G really be used for the D's?
...