Default esub wrapper

bsub is a script written by Neal Adams which calls the "real" executable:

/afs/slac/package/lsf/curr/bin/bsubx

Depending on the "-a" option (for suncat this is typically "openmpi"), bsubx calls an "esub" script (in /afs/slac/package/lsf/etc.slac). The esub script in turn points to another wrapper script in the bin.slac directory; for "-a mympi", for example, that script is "mympirun_wrapper". This wrapper is the one that executes the mpirun command. mpirun then uses "lsgrun" on the master node to direct the "res" daemons on the slave nodes to start the executables.

Customized esub wrapper

A command like the following allows use of a "custom" esub wrapper:

/afs/slac/package/lsf/curr/bin/bsubx -q suncat-test -o pt.log -e pt.err -n 16 pam -g /afs/slac/g/suncat/bin/suncat-tsmpirun gpaw-python pt.py

where suncat-tsmpirun has been copied from /afs/slac/package/lsf/bin.slac/mympirun_wrapper and modified.
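As a minimal sketch, the submission above can be wrapped in a small dry-run helper that only prints the command line, so it can be inspected before anything is submitted. The function name custom_bsub_cmd and its argument order are hypothetical; the paths, options, and log file names are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the bsubx command line, does not submit it.
# The custom wrapper is assumed to have been created beforehand by copying
# /afs/slac/package/lsf/bin.slac/mympirun_wrapper and editing the copy.
BSUBX=/afs/slac/package/lsf/curr/bin/bsubx

custom_bsub_cmd() {
    queue=$1      # LSF queue, e.g. suncat-test
    ncores=$2     # number of cores passed to -n
    wrapper=$3    # path to the custom wrapper handed to "pam -g"
    shift 3       # everything left is the program and its arguments
    echo "$BSUBX -q $queue -o pt.log -e pt.err -n $ncores pam -g $wrapper $*"
}

custom_bsub_cmd suncat-test 16 /afs/slac/g/suncat/bin/suncat-tsmpirun gpaw-python pt.py
```

Once the printed line looks right, it can be pasted into the shell to submit the job.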

The output of pstree looks like:

     |-sbatchd---res---1331246084.9012---pam---suncat-tsmpirun---mpirun---8*[TaskStarter---gpaw-python]

The pam executable is key, since all LSF control and monitoring of the subprocesses goes through that process.
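One rough way to confirm that a job was really started through pam is to check that pam appears upstream of mpirun in the relevant pstree line. The sketch below does this with a plain shell pattern match on the sample line shown above. The function name under_pam is hypothetical, and how the line is extracted from the pstree output on a compute node is left open.

```shell
#!/bin/sh
# Hypothetical check: succeeds only if "pam" appears before "mpirun"
# in a single line of pstree output.
under_pam() {
    case $1 in
        *---pam---*mpirun*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample line copied from the pstree output above.
line='|-sbatchd---res---1331246084.9012---pam---suncat-tsmpirun---mpirun---8*[TaskStarter---gpaw-python]'

if under_pam "$line"; then
    echo "mpirun runs under pam"
else
    echo "mpirun is NOT under pam"
fi
```

If the custom wrapper were started without pam, the pam link would be missing from the chain and the check would fail.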
