...

  • try nvvp/transpose (or C60 with more grid points) for >5 minutes (lin)
  • send mail to NVIDIA or the mailing list to understand why the nvvp profile cuts off after 5 minutes (lin)
  • understand bottleneck in get_wfs (jun)
  • implement fft/gemv (cpo)
  • is there a CUDA library for trace like zgeev? (cpo)
  • run a 3x3x3 system to see if bottlenecks stay the same (cpo)
  • driver hang status (cpo)
  • understand how to fix gs.py bottlenecks in more detail (lin/cpo) using gpaw profiler:
    • pseudo density: density.py: self.calculate_pseudo_density(wfs) (cpo)
    • projections: overlap.py: wfs.pt.integrate(psit_nG, P_ani, kpt.q) (cpo)
    • RMM-DIIS: eigensolvers/rmm_diis.py: lots of lines (cpo)
    • projections: eigensolvers/rmm_diis.py: wfs.pt.integrate(dpsit_xG, P_axi, kpt.q) (lin)
    • calc_h_matrix: eigensolvers/eigensolver.py: H_nn = self.operator.calculate_matrix_elements, hamiltonian.xc.correct_hamiltonian_matrix (lin)
    • rotate_psi: eigensolvers/eigensolver.py (lin)
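
The per-function breakdown above could be cross-checked with Python's standard-library profiler. The following is only a minimal sketch: it uses `cProfile`, not the gpaw profiler named in the list, and `hot_loop`, `run_step` and `profile_call` are hypothetical stand-ins, not GPAW functions.

```python
import cProfile
import io
import pstats


def hot_loop(n):
    # Hypothetical stand-in for an expensive kernel (not a GPAW function).
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_step(n):
    # Hypothetical stand-in for one step of a script that calls the kernel.
    return hot_loop(n)


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so callers of slow kernels show up first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_call(run_step, 100000)
    print(report)
```

Sorting by cumulative time makes it easier to see which top-level call (for example a whole eigensolver step) dominates before drilling into individual lines.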

...