...
To the extent possible, the options supported by the PilotJobDaemon are the same as those supported by the SlurmJobDaemon. Details below:
Workflow XML | Option | Alias | Default | Meaning | Comments |
---|---|---|---|---|---|
maxCPU | | | 1 hour | Max CPU used by the job (in seconds). | Used for scheduling the job, but not currently enforced by the pilot. |
maxMemory | | | 1 GB | Max memory used by the job (in kB). | Used for scheduling the job, but not currently enforced by the pilot. |
batchOptions | -N | --nodes | 1 | The number of nodes on which the job will run. | Only for compatibility with SLURM. Option is ignored. |
batchOptions | -t | --time | 01:00:00 | The wallclock time allowed for the job. | Used for scheduling jobs in the pilot, but not currently enforced by the pilot. |
batchOptions | -L | --license | none | The list of licenses required by the job, separated by commas, e.g. -L SCRATCH | Pilot jobs will only accept a job if all licenses are available in the pilot job. |
batchOptions | -C | --constraint | none | The list of constraints required by the job, separated by commas, e.g. -C haswell | Pilot jobs will only accept a job if all constraints are satisfied by the pilot job. Accepted but not yet used. |
batchOptions | -p | --partition | none | The partition in which the job will be run. | Allows a pilot job to selectively run only jobs submitted for a particular partition. Partition names can be assigned by the user. Accepted but not yet used. |
batchOptions | -c | --cpus-per-task | 1 | The number of CPUs (threads) allocated to this job. | Used for scheduling jobs in the pilot, but not currently enforced by the pilot. Only for compatibility with SLURM. Option is ignored. |
batchOptions | | --ntasks-per-node | 1 | | Only for compatibility with SLURM. Option is ignored. |
batchOptions | -J | --job_name | | The name of the job. | Only for compatibility with SLURM. Option is ignored. |
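The license and constraint matching described in the table can be sketched as a simple subset test: a pilot accepts a job only if it provides every license the job requests and satisfies every constraint the job requires. This is a hypothetical illustration; the function and argument names below are not part of the actual implementation.

```python
# Hypothetical sketch of the -L/-C matching described above, not actual
# JobControlPilot code: a pilot accepts a job only if all requested
# licenses and constraints are covered by what the pilot offers.

def pilot_accepts(job_licenses, job_constraints,
                  pilot_licenses, pilot_constraints):
    """Return True if this pilot can run the job."""
    return (set(job_licenses) <= set(pilot_licenses)
            and set(job_constraints) <= set(pilot_constraints))

# A pilot started with -L SCRATCH -C haswell:
print(pilot_accepts(["SCRATCH"], ["haswell"],
                    ["SCRATCH"], ["haswell"]))          # True
print(pilot_accepts(["SCRATCH", "projecta"], [],
                    ["SCRATCH"], ["haswell"]))          # False: missing license
```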
...
Pilot Jobs
In the current implementation pilot jobs are not submitted automatically, although this may change in the future. Currently, to submit the default pilot job, simply log in as user "desc" (separate instructions needed?) and run the following:
...
The options supported by the JobControlPilot are:
-C (--constraint) VAL : Constraints satisfied by this pilot
-L (--license) VAL : Licenses provided by this pilot
-P N : The port that the pilot will attempt to pull jobs from (default: 0)
-c N : The total number of cores to share among all running jobs (default: 32)
-h VAL : The host from which this pilot will attempt to pull jobs (default: …)
-… N : The time after which the pilot will die if no work is provided (seconds) (default: 300)
-m N : The total memory (in kB) of this machine to share among all running jobs (default: 64000000)
-o : True if OK to overwrite existing files (default: false)
-p (--partition) VAL : If specified, only jobs requesting this partition will be run by this pilot
-r N : The maximum runtime for the job (seconds) (default: 172800)
-s VAL : The service name of the pilot service (default: PilotJobProvider)
-u VAL : The user name under which the pilot service is running (default: desc)
Any number of pilot jobs can be submitted.
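The -c and -m options give each pilot a fixed core and memory budget to share among its running jobs. The bookkeeping this implies can be sketched as follows; the class and method names are illustrative assumptions, not the actual JobControlPilot code.

```python
# Hypothetical sketch of a pilot sharing a fixed core/memory budget
# (the -c and -m options above) among running jobs. Illustrative only.

class PilotBudget:
    def __init__(self, cores=32, memory_kb=64_000_000):  # documented defaults
        self.free_cores = cores
        self.free_memory_kb = memory_kb

    def try_start(self, cores, memory_kb):
        """Reserve resources for a job; return False if it does not fit."""
        if cores > self.free_cores or memory_kb > self.free_memory_kb:
            return False
        self.free_cores -= cores
        self.free_memory_kb -= memory_kb
        return True

    def finish(self, cores, memory_kb):
        """Release a completed job's resources back to the pilot."""
        self.free_cores += cores
        self.free_memory_kb += memory_kb

pilot = PilotBudget()
print(pilot.try_start(16, 1_000_000))  # True: fits within the budget
print(pilot.try_start(32, 1_000_000))  # False: only 16 cores remain free
```

A job that does not fit is simply left in the queue until a running job finishes and releases its share.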
...
- Currently, while jobs are running in the JobControlPilot, the memory and CPU time used are always reported as zero, although when a job completes its CPU time is reported normally. This will be fixed soon.
- Currently, if the PilotJobDaemon is stopped, all information about running jobs is lost. This will be fixed soon. There is no command line for killing running jobs, although jobs can perhaps be killed via the workflow engine.
- There is currently no support for checkpointing jobs running in the JobControlPilot, although there are plans to develop such a feature in the future and most of the required infrastructure is already in place.
...