
Restrictions

VASP is restricted to licensed users. It may not be used for contractual research in cooperation with industry or the military.

Submitting Batch Jobs

See the bottom of this page for available versions.

Two styles of submission script are supported. The first is the ASE style. The second is the native VASP style, which assumes that all input files (POTCAR, INCAR, etc.) created by the user are in the current directory. POTCARs can be assembled from the directory /nfs/slac/g/suncatfs/sw/vasp/pseudo52.
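For the native style, building a POTCAR just means concatenating the per-element POTCAR files in the same order in which the species appear in the POSCAR. A minimal sketch in Python, assuming the usual <element>/POTCAR layout under the pseudo52 directory (the element list here is only an illustration):

    import os

    pseudo_dir = "/nfs/slac/g/suncatfs/sw/vasp/pseudo52"
    elements = ["Ni", "O"]  # hypothetical; must match the species order in your POSCAR

    # Concatenate the per-element POTCAR files into a single POTCAR.
    with open("POTCAR", "w") as out:
        for element in elements:
            with open(os.path.join(pseudo_dir, element, "POTCAR")) as pot:
                out.write(pot.read())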

Example script

Thanks to Joel Varley for providing this BEEF example.
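The attached script itself is not reproduced here. As a rough sketch only (not Joel's actual script), an ASE-style BEEF-vdW run might look like the following; it assumes a recent ASE with the Vasp calculator and BEEF-vdW support, and the test system, cutoff, and k-point mesh are illustrative assumptions:

    from ase.build import bulk
    from ase.calculators.vasp import Vasp

    atoms = bulk("Ni", "fcc", a=3.52)  # hypothetical test system

    calc = Vasp(
        xc="beef-vdw",   # BEEF-vdW exchange-correlation functional
        encut=400,       # plane-wave cutoff in eV (illustrative)
        kpts=(4, 4, 4),  # k-point mesh (illustrative)
        ispin=2,         # spin-polarized
    )
    atoms.calc = calc
    print(atoms.get_potential_energy())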

Versions

Version  Date       Comment
1        12/5/2012  initial version
2, 2a    1/31/2013  add BEEF support courtesy of Johannes Voss ("a" for the suncat3 farm)
3        1/31/2013  like v2, but with the pseudopotentials changed to the "52" versions; no suncat3 version yet
4        2/28/2013  like v3, but with VTST support added
5        3/15/2013  includes BEEF, VTST, and a 30% performance improvement from using fftmpiw.F
6        7/9/2013   copy of v5 but without the -DNGZhalf flag; requested by Junyan for a non-collinear calculation; not recommended for general use

Parallelization

This information comes from Yian Zhu, who ran several 3x3x4 spin-polarized Nickel calculations with the PBE functional.

  • VASP parallelizes over bands by default, but can also parallelize over plane waves, which can be more efficient. This is controlled by the NPAR parameter, which defaults to the total number of cores.
  • The number of cores used per band is totalNumberOfCores/NPAR.
  • Yian measured the best NPAR settings on the suncat 8-core farm for his 3x3x4 spin-polarized nickel calculation. For a single node, NPAR should be set to 1 (pure plane-wave parallelization). For multiple nodes, NPAR should be set to approximately sqrt(numberOfCores), at least for Yian's system. See Yian's plot below and the sketch after this list.
  • Band parallelization requires less interprocess communication than plane-wave parallelization, but still involves non-trivial inter-node communication.
  • For Yian's system, even with an optimal NPAR, the run time improves only from 3777 seconds to 3092 seconds in going from one node to two, so it is still significantly more CPU-efficient to run on one node.
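
As a rough rule of thumb only (generalized from Yian's single test case, so it should be rechecked for other systems), the NPAR choice described above can be expressed as:

    import math

    def suggested_npar(total_cores, nodes):
        # Single node: plane-wave parallelization (NPAR = 1) was fastest for Yian.
        if nodes == 1:
            return 1
        # Multiple nodes: NPAR of roughly sqrt(total cores) worked best for Yian's system.
        return max(1, int(round(math.sqrt(total_cores))))

    total_cores, nodes = 16, 2  # e.g. two 8-core nodes
    npar = suggested_npar(total_cores, nodes)
    print("NPAR = %d (cores per band = %d)" % (npar, total_cores // npar))

The resulting value is set with the NPAR tag in the INCAR.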
