Submitting Batch Jobs
These are the commands for ASE/python mode and for the "native" (no ASE/python) mode:
esp-ver-bsub <version> myscript.py
esp-ver-bsub-native <version> -q suncat-test -o my.log -n 8 pw.x -in pw.inp
Dealing With Swapping Jobs
For a "typical" espresso job (default planewave parallelization):
- if a job swaps on suncat (24GB nodes), run it on suncat2 (48GB nodes)
- if a job swaps on suncat2 (48GB nodes), run it on suncat3 (64GB nodes)
- if a job swaps on suncat3, use 2 suncat3 nodes. suncat3 (only!) has a fast interconnect, so jobs there should run reasonably well on multiple nodes
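The escalation rule above can be written down as a small lookup. This is only a sketch: the function name `escalate` and the `(queue, nodes)` tuple layout are made up for illustration, and the only facts used are the queue names and node sizes listed above.

```python
# Escalation steps taken from the list above, keyed by (queue, number of nodes).
# The helper name and data layout are hypothetical, not part of any SUNCAT tool.
ESCALATION = {
    ("suncat", 1): ("suncat2", 1),   # 24GB nodes -> 48GB nodes
    ("suncat2", 1): ("suncat3", 1),  # 48GB nodes -> 64GB nodes
    ("suncat3", 1): ("suncat3", 2),  # 64GB nodes -> two nodes, fast interconnect
}


def escalate(queue, nodes=1):
    """Return the (queue, nodes) to try next after a job has swapped."""
    try:
        return ESCALATION[(queue, nodes)]
    except KeyError:
        raise ValueError("no further step listed for %s with %d node(s)" % (queue, nodes))


if __name__ == "__main__":
    print(escalate("suncat"))      # ('suncat2', 1)
    print(escalate("suncat3", 1))  # ('suncat3', 2)
```

The list gives no advice beyond 2 suncat3 nodes, so the sketch raises an error there rather than guessing.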
In the longer term we would like to have a memory estimator that will allow you to choose the best queue in advance, although posts on the espresso mailing list suggest this may be difficult.
k-point Parallelization
- NOTE: typically one does NOT do k-point parallelization for large systems. Only the gamma-point is necessary.
- k-point parallelization across nodes will not be as cpu-efficient as planewave parallelization within one node, so use it judiciously
- k-point parallelization is not as memory efficient as planewave parallelization, but it is supposed to scale better to more nodes (ask cpo if you want a better explanation). In particular, my understanding is that k-point parallelization will not reduce the memory usage per node.
- vossj and cpo have not yet seen good scaling for k-point parallelization on small systems (2x2x3 system). lausche has reported good k-point scaling for 3x3x4 systems. There have been some unexplained hangs with npool=3 or 4 (see below).
- to turn on k-point parallelization:
  - for ase mode: add parameter "parflags='-npool 2'" to the espresso object. This is a general-purpose string for passing run-time options to espresso executables.
  - for native mode: add something like "-npool 2" at the end of the line
- an example for 16 cores (2 nodes) and npool=2: each of the 2 pools of 8 cores would parallelize over planewaves, but the 2 pools would process pairs of k-points in parallel. If one had 9 k-points, they would get processed in pairs, but the last one would only be processed on one node, leaving the other idle, which is not ideal.
- if you have done it correctly, you should see a line about "K-points division" in your espresso log file (the planewave parallelization produces a line like "R & G space division")
- there is a chicken-and-egg problem: to choose npool one needs to know the number of reduced k-points, but one has to run the job to learn what this number is. A workaround is to run the job first in the test queue to learn the reduced number of k-points.
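The idle-pool arithmetic in the 9 k-point example, and the test-queue workaround, can be sketched in a few lines of Python. Two assumptions are made here: that every k-point costs about the same, and that the espresso log reports the reduced count on a line containing "number of k points=" (check your own log, since the exact wording may differ between versions). Both function names are hypothetical.

```python
import math
import re


def reduced_kpoints_from_log(log_text):
    """Return the k-point count found in the log text, or None if absent.

    Assumes a line of the form 'number of k points=   9'.
    """
    match = re.search(r"number of k points=\s*(\d+)", log_text)
    return int(match.group(1)) if match else None


def pool_efficiency(n_kpoints, npool):
    """Return (rounds, efficiency) when n_kpoints are shared among npool pools.

    Each round processes up to npool k-points at once; efficiency is the
    fraction of pool slots that actually hold a k-point.
    """
    rounds = math.ceil(n_kpoints / npool)
    return rounds, n_kpoints / (rounds * npool)


if __name__ == "__main__":
    nk = reduced_kpoints_from_log("     number of k points=     9")
    for npool in (1, 2, 3):
        rounds, eff = pool_efficiency(nk, npool)
        print("npool=%d: %d rounds, %.0f%% of pool slots used" % (npool, rounds, 100 * eff))
```

With 9 k-points, npool=2 needs 5 rounds and leaves one pool idle in the last round, while npool=3 divides evenly. This only counts k-points; it says nothing about the hangs or the scaling reported above.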
Reducing Memory Usage for Large Systems
From http://www.democritos.it/pipermail/pw_forum/2008-January/008101.html
Excerpt (relevant for the "native" (non-ASE) mode):
- consider reducing the planewave cutoff, IF it won't affect your results too much
- is your system an isolated system? then use this keyword:
K_POINTS gamma
it will use k=0 only (which is all you need for an isolated system) and exploit various tricks to reduce memory usage
- setting option "diago_david_ndim" to the minimum (2) and "mixing_ndim" to a smaller value (4) reduces memory usage, but may increase CPU time
- using diagonalization='cg' will also reduce memory usage, but it will increase CPU time by a sizable amount
- do not calculate stress if you do not need to: it is expensive
In ASE-mode we are currently unable to set the "K_POINTS gamma" field and "diago_david_ndim". The other two can be specified with the convergence keyword. "mixing_ndim" is "mix", and "diagonalization" is "diag". For example:
convergence = {'energy':1e-6, 'mixing':0.7, 'maxsteps':100, 'mix':4, 'diag':'cg'},
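For native mode, the memory-saving settings from the excerpt could be collected into an input fragment as sketched below. The helper name is hypothetical, the output is not a complete pw.x input, and placing these variables in the &ELECTRONS namelist is my understanding of the pw.x input format rather than something stated in the excerpt, so check it against the input documentation for your espresso version.

```python
def memory_saving_fragment(isolated=True, use_cg=False):
    """Build a partial pw.x input fragment with the memory-saving settings.

    isolated: add 'K_POINTS gamma' (k=0 only, for isolated systems).
    use_cg:   use diagonalization='cg' (less memory, noticeably more CPU time)
              instead of Davidson with the minimum diago_david_ndim.
    """
    lines = ["&ELECTRONS"]
    if use_cg:
        lines.append("  diagonalization = 'cg'")
    else:
        lines.append("  diagonalization = 'david'")
        lines.append("  diago_david_ndim = 2")
    lines.append("  mixing_ndim = 4")
    lines.append("/")
    if isolated:
        lines.append("K_POINTS gamma")
    return "\n".join(lines)


if __name__ == "__main__":
    print(memory_saving_fragment())
```

The fragment would still need to be merged by hand with the rest of the input file (control, system, species, and positions).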
Example Scripts
A simple optimization: esp.py
Calculate density-of-states: espdos.py
Plot density-of-states: espdosplot.py
NEB: espneb.py
Versions
Version | Date | Comment
1 | 12/3/2012 | initial version
2 | 12/5/2012 | use mkl fftw
3 | 12/7/2012 | UNSTABLE version: developers allowed to change espresso.py. Users can override espresso.py by putting their own espresso.py in directory $HOME/espresso
4,4a | 12/10/2012 | update to the latest svn espresso-src and espresso python
5 | 2/14/2013 | Entropy corrections added and default parameters changed (smearing type and width)
6,6a | 3/7/2013 | Many changes: move to combination of dacapo/espresso pseudo potentials (previously just dacapo), add spin polarized BEEF
7,7a | 4/5/2013 | Update the python interface for bug fixes. Numbers shouldn't change from v6
8,8a | 4/5/2013 | Important bug fixes: no need for calc.stop(), support for kpoint parallelization with ASE, fix for rhel5 nfs auto mount problem. Numbers shouldn't change from v6/v7.
9,9a | 5/28/2013 | Add PDOS/NEB calculations. new libbeef interface allows for adding additional beef functionals in future.
10,10a | 6/19/2013 | Fix problem with pipe buffering that crashed NEB. Dump more information about python/fortran executables to output.
11 | 7/2/2013 | Bug fix for end of job race condition giving "broken pipe" error
SUNCAT Quantum Espresso Talks
Introduction/Usage (Johannes Voss): jvexternal.pdf
Accuracy (Jewe Wellendorff, Keld Lundgaard, NOTE: password protected because it contains VASP benchmark data): kelu.pdf
Speed/Convergence (AJ Medford): aj.pptx
Scaling behavior (Christopher O'Grady): espscaling.pptx
Private Espresso Builds
Copy this script, and then edit the appropriate lines at the top:
/afs/slac/g/suncat/share/scripts/privesp.csh
Espresso ASE To-Do List
- merge branch with espresso trunk
- become part of ASE svn (need to follow new ASE guidelines)
- dry-run mode to get memory estimate
- understand failing espresso tests
- record uspp and executable directory in output (and/or svn version, somehow?)
- neb (done)
- constraints interface to pass ASE constraints to espresso
- dos (done)
- bandgaps (done)
- separation of site-specific code from ASE code (including site-specific "scratch") (done)
- make beef errors accessible from ASE
- beef self-tests integrated with espresso self-tests
- support kpoint parallelization (done)
- look into other parallelization (openmp, scalapack)
- documentation/examples (including on ASE website)
- fix popen warnings on suncat3
- can we eliminate os-dependent stuff, like grep/egrep/sed?
- eliminate need for calc.stop() with multiple calculations (done)
- get work function without dumping out the electrostatic cube file? (chuan has tools for this)
- dipole correction goes in the middle of unit cell by default (in python, chuan makes sure it goes in the biggest gap) (done)