
GPAW Convergence Behavior

A talk by Ansgar Schaefer on convergence behavior for rutiles is here (pdf).

General suggestions for helping GPAW convergence are here.

A discussion and suggestions for converging some simple systems can be found here.

Other convergence experience (a sketch of how to pass these settings to a GPAW calculator follows the list):

  • Graphene with vacancy (Felix Studt): increase the Fermi temperature from 0.1 to 0.2; use the cg eigensolver
  • Graphene with vacancy (Chris O'Grady): change nbands from -10 to -20; use MixerDif(beta=0.03, nmaxold=5, weight=50.0)
  • Nitrogenase FeVCo for CO2 reduction (Lars Grabow): use the Davidson eigensolver (also faster); jvarley later suggested MixerSum
  • Several surfaces (Andy Peterson): Broyden mixer with beta=0.5
  • TiO2 (Monica Garcia-Mota): MixerSum(0.05, 6, 50.0)
  • MnxOy (Monica Garcia-Mota): Broyden MixerSum
  • Co3O4 (Monica Garcia-Mota): Davidson eigensolver with MixerSum(beta=0.05, nmaxold=5, weight=50) or MixerSum(beta=0.05, nmaxold=6, weight=100)
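A minimal sketch of how the kinds of settings listed above (smearing width, extra empty bands, eigensolver, density mixer) are passed to a GPAW calculator. The slab, k-points, and parameter values are placeholders, not recommendations for any particular system:

    from ase.build import fcc111
    from gpaw import GPAW, FermiDirac, MixerSum

    atoms = fcc111('Pt', size=(2, 2, 3), vacuum=6.0)       # placeholder structure

    calc = GPAW(xc='PBE',
                kpts=(4, 4, 1),
                spinpol=True,                               # MixerSum mixes the spin-summed density
                nbands=-20,                                 # 20 extra empty bands
                occupations=FermiDirac(0.2),                # wider Fermi smearing
                eigensolver='dav',                          # Davidson ('cg' and 'rmm-diis' are the alternatives)
                mixer=MixerSum(beta=0.05, nmaxold=5, weight=50.0),
                txt='convergence_test.txt')
    atoms.set_calculator(calc)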

Other tricks:

  • To speed up the Poisson solver, use something like gpts=h2gpts(0.18, atoms.get_cell(), idiv=8) to get a grid whose dimensions are divisible by a large power of 2; this helps the multigrid Poisson solver (see the sketch after this list).
  • Experiment with the Fermi smearing width to help improve convergence.
  • Experiment with the number of empty bands (nbands) to help improve convergence.
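A minimal sketch of the gpts trick from the first bullet; the slab, grid spacing, and k-points are placeholders. h2gpts picks grid dimensions close to the requested spacing that are divisible by idiv:

    from ase.build import fcc111
    from gpaw import GPAW
    from gpaw.utilities import h2gpts

    atoms = fcc111('Pt', size=(2, 2, 3), vacuum=6.0)          # placeholder structure
    calc = GPAW(gpts=h2gpts(0.18, atoms.get_cell(), idiv=8),  # grid counts divisible by 8
                xc='PBE',
                kpts=(4, 4, 1),
                txt='poisson_friendly_grid.txt')
    atoms.set_calculator(calc)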

GPAW Planewave Mode

Jens Jurgen has a post here that discusses how to select plane wave mode in your script.

It looks like we have to manually turn off the real-space parallelization with the keyword:

parallel={'domain': 1}

In planewave mode I believe we can also only parallelize over reduced k-points, spins, and bands. We have to set these by hand so that their product matches the number of CPUs (a sketch follows the list below).

  • To get parallelization over bands, we can at the moment only use the rmm-diis eigensolver (cg and davidson don't work).
  • The number of bands must be divisible by the number of CPUs, according to Jens Jurgen.
  • At the moment there is no dipole correction in planewave mode.
  • Density mixing is still done in real space.
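A minimal sketch of a plane-wave setup along the lines above; the cutoff, k-points, band count, and band-group count are placeholders, and the product of the parallelization group sizes should match your CPU count:

    from gpaw import GPAW, PW

    calc = GPAW(mode=PW(500),                         # plane-wave cutoff in eV (placeholder)
                kpts=(8, 8, 1),
                nbands=120,                           # divisible by the number of band groups
                eigensolver='rmm-diis',               # required for band parallelization
                parallel={'domain': 1, 'band': 4},    # no real-space domains; 4 band groups
                txt='pw.txt')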

GPAW Memory Estimation

To get a guess at the right number of nodes to run on for GPAW, run the following line interactively:

gpaw-python <yourjob>.py --dry-run=<numberofnodes>
(e.g. gpaw-python graphene.py --dry-run=16)

The number of nodes should be a multiple of 8. The dry run finishes quickly because it does not actually do the calculation. Then check that the following number is below 3 GB for the 8-core farm and below 4 GB for the 12-core farm:

Memory estimate
---------------
Calculator  574.32 MiB

Tips for Running with BEEF

If you use the BEEF functional (a minimal setup sketch follows this list):

  • the code parallelizes over 20 cores (because of the way the VDW contribution is calculated)
  • you typically need a lot of memory, even for small calculations. For larger calculations you sometimes have to use more than 20 cores, just to get enough memory (you don't get additional parallelization).
  • the GPAW memory estimates are incorrect
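A minimal sketch of selecting the functional, assuming the 'BEEF-vdW' xc name in GPAW; the grid spacing and k-points are placeholders:

    from gpaw import GPAW

    calc = GPAW(xc='BEEF-vdW',        # BEEF exchange-correlation with nonlocal vdW term
                h=0.18,
                kpts=(4, 4, 1),
                txt='beef.txt')
    # Request at least 20 cores in the batch job; cores beyond 20 only add memory, not speed.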

Building a Private Version of GPAW

  • Use svn to check out the version of GPAW that you want to use (described here).
  • copy /afs/slac/g/suncat/bin/privgpaw.csh into whatever directory you like
  • edit the two variables GPAW_BASE ("base" GPAW release that you want to re-use for numpy, mpi etc.) and GPAW_HOME (directory where you checked out GPAW)
  • Use the commands:
    ./privgpaw.csh build (build your checked-out GPAW)
    ./privgpaw.csh test (run the GPAW self-tests)
    ./privgpaw.csh gpaw-bsub <arguments> (submit a batch run)
    ./privgpaw.csh <cmd> (run <cmd> interactively using the privgpaw.csh environment, e.g. "gpaw-python junk.py")

Some notes:

  • You can see a list of available installed versions here (to use for GPAW_BASE).
  • The syntax for the "bsub" option is identical to the gpaw-bsub command described here.
  • The above assumes that the base release is compatible with your checked out GPAW version. Talk to cpo if you have questions about this.
  • The above doesn't include support for a private version of ASE. When that becomes important we will add it.

Jacapo Parallel NEB Example

You can find a Jacapo parallel NEB example here. Some lines need to change for a restart. An example is here.

Some important notes:

  • The number of processors selected for a parallel NEB must be an integer multiple of NumberOfImages - 2 (the two endpoint images are fixed).
  • Parallel NEB settings (e.g. the number of cores) can be debugged by running in the suncat-test queue with 1 processor per non-fixed image.
  • This restart example turns on the "FIRE" and "climb" parameters; that is for later in the calculation (a sketch is shown below). The documentation here discusses the reasons for that, although it perhaps doesn't spell out clearly what the criteria are for turning those on.
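A minimal sketch of the restart stage from the last bullet (climbing image plus the FIRE optimizer), using generic ASE pieces; the trajectory name and image count are placeholders, and the Jacapo calculator setup from the linked examples is omitted:

    from ase.io import read
    from ase.neb import NEB
    from ase.optimize import FIRE

    images = read('neb_path.traj', index='-7:')   # images from a previous run (placeholder name and count)
    # Attach one Jacapo calculator per interior image here (site-specific, omitted).

    neb = NEB(images, climb=True)                 # turn on the climbing image
    opt = FIRE(neb, trajectory='neb_restart.traj', logfile='neb_restart.log')
    opt.run(fmax=0.05)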