To-Do List
2/5/2013
- profile GPU-GPAW with maxed out memory on the GPU (lin)
- GPU-GPAW profiling thoughts:
  - cpo thinks improving MPI performance may be difficult
  - lin thinks improving MPI performance may be important, since the number of k-points will decrease in the future
  - maybe we could pipeline other work while MPI is running?
  - why do we only get a 5x speedup for Pt 3x3x4? (samuli sees 8 to 11)
- see if the mask stuff is called every SCF step (aj)
- think about randomization idea (aj)
- evaluate effectiveness of tzp+PK on na2o4/pt (aj)
- freeze D_aps?
- rpa (jun)
- keep an eye on crashes
- rewrite code for the ZHERK
1/29/2013
- profile GPU-GPAW in grid mode (lin)
- GPU-GPAW profiling thoughts:
  - domain decomposition is especially inefficient on GPU: pack as much of the domain onto one GPU as possible (needs larger memory)
  - parallelization over k-points remains good
  - our current setup of 1 k-point on 8 cores is unrealistic for a 3x3x4
  - cpo thinks improving MPI performance may be difficult
  - lin thinks improving MPI performance may be important, since the number of k-points will decrease in the future
  - maybe we could pipeline other work while MPI is running?
  - why do we only get a 5x speedup for Pt 3x3x4? (samuli sees 8 to 11)
- email the list about the real-density mixer (aj)
- see if the mask stuff is applied every SCF step (aj)
- think about randomization idea (aj)
- evaluate effectiveness of tzp+PK (aj)
- rpa (jun)
- keep an eye on crashes
- rewrite code for the ZHERK
...