...
- looking at EXX bottleneck (rewriting) (jun)
- use CUDA streams for small RPA systems (jun)
- libxc integration (cpo)
- understand MKL benchmark (jun/cpo)
- pycuda (cpo)
- understand RPBE kernel: (lin)
  - understand "double" problem
  - vary np, block_size, nstreams
  - loop testfunc many times
  - longer term: look at jussi/samuli kernel for ideas
...