To-Do List
5/9/2012
- RPBE kernel (lin)
- try cudaMallocHost with cudaMemcpyAsync
- fix stream behaviour and try with 1, 2, 4, 8, 16 streams
- understand stream behaviour with nvvp
- zher streams (jun)
- in benchmark, make nstream and nw separately variable
- can we see whether we have 4 or 16 streams?
- understand stream behaviour with nvvp
- work on libxc (cpo)
- separate stream and n-omega parameters (jun)
5/2/2012
- looking at EXX bottleneck (rewriting) (jun)
- use cuda streams for small RPA systems (jun)
- libxc integration (cpo)
- understand MKL benchmark (jun/cpo)
- pycuda (cpo)
- understand RPBE kernel: (lin)
- understand "double" problem
- vary np, block_size, nstreams
- loop testfunc many times
- longer term: look at jussi/samuli kernel for ideas
...