GPAW Convergence Behavior
A talk by Ansgar Schaefer on convergence behavior for rutiles is here (pdf).
General suggestions for helping GPAW convergence are here.
A discussion and suggestions for converging some simple systems can be found here.
Other convergence experience:
System | Who | Action
---|---|---
Graphene with vacancy | Felix Studt | Increase Fermi temperature from 0.1 to 0.2; use cg
Graphene with vacancy | Chris O'Grady | Change nbands from -10 to -20; MixerDif(beta=0.03, nmaxold=5, weight=50.0)
Nitrogenase FeVCo for CO2 reduction | Lars Grabow | Use the Davidson solver (faster as well?), although jvarley later said MixerSum
Several surfaces | Andy Peterson | Broyden mixer with beta=0.5
TiO2 | Monica Garcia-Mota | MixerSum(0.05, 6, 50.)
MnxOy | Monica Garcia-Mota | Broyden MixerSum
Co3O4 | Monica Garcia-Mota | Davidson eigensolver; MixerSum(beta=0.05, nmaxold=5, weight=50) or MixerSum(beta=0.05, nmaxold=6, weight=100)
MnO2 with DFT+U (U=+2 eV) | Monica Garcia-Mota | Marcin suggests disabling the DipoleCorrectionPoissonSolver (not yet tested)
MnO2 with DFT+U (U=+2 eV) | Monica Garcia-Mota | Henrik Kristofferson suggests: convergence is easier with high U (U=4 eV) and then
MnO2 with DFT+U (U=+2 eV) | Monica Garcia-Mota | (From Heine) increase U in steps of, say, 0.1 (or smaller) and reuse the density and/or wave functions from the previous calculation. This tends to reduce the problem of being trapped in metastable electronic states, and it also makes convergence easier. Monica later reported that this helped.
Cu | Ask Hjorth Larsen | The first mixer parameter should probably be 0.1 for faster convergence, because Cu has a low DOS at the Fermi level. (Other transition metals may require lower values.)
N on Co/Ni (with BEEF) | Tuhin | rmm-diis and MixerSum(beta=0.1, nmaxold=5, weight=50)
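Most of these suggestions translate directly into GPAW constructor keywords. A minimal sketch, assuming GPAW is installed (the slab, k-points, and XC functional are placeholders; the eigensolver, mixer, and smearing values are examples taken from the table above):

```python
# Sketch only: applies table suggestions to a placeholder Cu slab.
from ase.build import fcc111
from gpaw import GPAW, MixerSum, FermiDirac

atoms = fcc111('Cu', size=(2, 2, 4), vacuum=6.0)
calc = GPAW(mode='fd',
            xc='PBE',
            kpts=(4, 4, 1),
            eigensolver='dav',            # Davidson, as suggested for Co3O4
            mixer=MixerSum(beta=0.05, nmaxold=5, weight=50.0),
            occupations=FermiDirac(0.2))  # wider smearing, per the graphene entry
atoms.calc = calc
```

Which combination works is system-dependent; treat these values as starting points to vary, not defaults.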
Other Tricks:
- To speed up the Poisson solver, use something like gpts=h2gpts(h=0.18, atoms.get_cell(), idiv=8) to get a grid nicely divisible by a large power of 2. This helps the multigrid Poisson solver.
- experiment with Fermi smearing to help improve convergence
- experiment with number of empty bands to help improve convergence
- Be sure to specify nbands; otherwise GPAW adds "plenty" of bands, which is very expensive in FD calculations. nbands=6*[number of atoms] should be more than enough.
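The tricks above can be combined in a single calculator setup. A sketch, assuming GPAW is installed (the molecule and parameter values are placeholders):

```python
# Sketch only: combines the grid, smearing, and nbands tricks above.
from ase.build import molecule
from gpaw import GPAW, FermiDirac
from gpaw.utilities import h2gpts

atoms = molecule('H2O')
atoms.center(vacuum=5.0)
calc = GPAW(mode='fd',
            gpts=h2gpts(0.18, atoms.get_cell(), idiv=8),  # grid divisible by 8
            occupations=FermiDirac(0.1),  # smearing width to experiment with
            nbands=6 * len(atoms))        # explicit nbands, per the note above
atoms.calc = calc
```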
GPAW Planewave Mode
Jens Jurgen has a post here that discusses how to select plane wave mode in your script.
It looks like we have to manually turn off the real-space parallelization with the keyword:
parallel={'domain': 1}
In planewave mode, I believe we can also only parallelize over reduced k-points, spins, and bands. We have to set these numbers manually to match the number of CPUs.
- To get parallelization over bands, we can at the moment only use the rmm-diis eigensolver (cg and davidson don't work).
- The number of bands must be divisible by the number of CPUs, according to Jens Jurgen.
- At the moment there is no dipole correction in planewave mode.
- Density mixing is still done in real space.
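Putting the constraints above together, a plane-wave setup might look like the following sketch (assuming GPAW is installed; the cutoff, k-points, and band count are placeholders):

```python
# Sketch only: plane-wave mode with the parallelization constraints above.
from ase.build import bulk
from gpaw import GPAW, PW

atoms = bulk('Cu', 'fcc', a=3.6)
calc = GPAW(mode=PW(500),            # 500 eV plane-wave cutoff (placeholder)
            kpts=(8, 8, 8),
            nbands=64,               # must be divisible by the CPU count
            eigensolver='rmm-diis',  # required for band parallelization
            parallel={'domain': 1})  # turn off real-space domain decomposition
atoms.calc = calc
```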
GPAW Memory Estimation
To get a guess at the right number of nodes to run on for GPAW, run the
following line interactively:
gpaw-python <yourjob>.py --dry-run=<numberofnodes> (e.g. gpaw-python graphene.py --dry-run=16)
Number of nodes should be a multiple of 8 for the suncat farm,
multiples of 12 for the suncat2 farm. The above will run quickly
(because it doesn't do the calculation). Then check that the
following number is <3GB for the 8-core suncat farm, <4GB for the 12-core suncat2 farm:
Memory estimate --------------- Calculator 574.32 MiB
Tips for Running with BEEF
If you use the BEEF functional:
- use xc='BEEF-vdW'
- the code parallelizes over 20 cores (because of the way the VDW contribution is calculated)
- you typically need a lot of memory, even for small calculations. For larger calculations you sometimes have to use more than 20 cores, just to get enough memory (you don't get additional parallelization).
- the GPAW memory estimates are incorrect
- it is typically best to run on the suncat2 (more memory per core) or, even better, the suncat3 farm (more memory per core, plus a much faster infiniband node-node interconnect)
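A minimal BEEF-vdW setup, assuming GPAW is installed (the system and k-points are placeholders; run it on 20 cores, per the notes above):

```python
# Sketch only: BEEF-vdW on a placeholder Pt slab.
from ase.build import fcc111
from gpaw import GPAW

atoms = fcc111('Pt', size=(2, 2, 3), vacuum=7.0)
atoms.calc = GPAW(xc='BEEF-vdW', kpts=(4, 4, 1))
```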
Building a Private Version of GPAW
- Use svn to check out the version of GPAW that you want to use (described here).
- copy /afs/slac/g/suncat/share/scripts/privgpaw.csh into whatever directory you like
- edit the two variables GPAW_BASE ("base" GPAW release that you want to re-use for numpy, mpi etc.) and GPAW_HOME (directory where you checked out GPAW)
- Use the commands:
./privgpaw.csh build (build)
./privgpaw.csh test (run gpaw self-tests)
./privgpaw.csh gpaw-bsub <arguments> (run in batch)
./privgpaw.csh <cmd> (run <cmd> interactively, using the privgpaw.csh environment, e.g. "gpaw-python junk.py")
Some notes:
- You can see a list of available installed versions here (to use for GPAW_BASE).
- The syntax for the "bsub" option is identical to the gpaw-bsub command described here.
- The above assumes that the base release is compatible with your checked out GPAW version. Talk to cpo if you have questions about this.
- The above doesn't include support for a private version of ASE. When that becomes important we will add it.
Jacapo Parallel NEB Example
You can find a Jacapo parallel NEB example here. This same script can be used for a restart (the interpolated traj files are only recreated if they don't exist).
Some important notes:
- The number of processors selected for a parallel NEB must be an integer multiple of NumberOfImages - 2 (the two endpoint images are "fixed").
- Parallel NEB settings (e.g. number of cores) can be debugged running in the suncat-test queue with 1 processor per non-fixed image.
- Later on in the NEB (discussed here) it is good to set climbing=True and switch to the FIRE method, shown here. As best I can tell, this page doesn't provide good instructions for when to turn those on.
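The core-count rule in the first bullet can be checked with a small helper before submitting (a hypothetical function, not part of Jacapo or ASE):

```python
def valid_neb_cores(n_cores, n_images):
    """Check that n_cores suits a parallel NEB with n_images total images.

    The two endpoint images are fixed, so the core count must be an
    integer multiple of the number of moving images, n_images - 2.
    """
    n_moving = n_images - 2
    return n_moving > 0 and n_cores % n_moving == 0

# A 7-image band has 5 moving images, so multiples of 5 cores work:
print(valid_neb_cores(10, 7))  # True
print(valid_neb_cores(12, 7))  # False
```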
Perturbing Jacapo Calculations
When taking an existing Jacapo calculation and making changes (e.g. adding an external field), it is important not to instantiate a new calculator (to work around some Jacapo bugs) but instead to read in the previous atoms/calculator from the .nc file. Johannes Voss has kindly provided an example with some comments here.