...

Under the Workflow dropdown, select Definitions. The Workflow Definitions tab is where scripts are defined; each script is registered under a unique name, and any number of scripts can be registered. To register a new script, click the + button in the top right-hand corner.


1.1.1) Name/Hash

...

The absolute path to the batch script. This script must contain the batch job submission command (bsub for LSF / sbatch for SLURM). It gives the user the ability to customize the batch submission. Overall, it can act as a wrapper around the code that will do the analysis on the data, in addition to submitting the job.
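The document's LSF example appears in section 2.1; since sbatch is named above as the SLURM equivalent, a hypothetical SLURM version of such a wrapper might look like the sketch below. The partition name, paths, and the use of --wrap are assumptions carried over from the LSF example, not the facility's actual configuration.

```shell
#!/bin/bash
# Hypothetical SLURM counterpart to the LSF arp_submit.sh wrapper;
# partition name and paths are placeholders.
ABS_PATH=/reg/g/psdm/tutorials/batchprocessing

submit_analysis () {
    # --wrap lets sbatch submit a one-line command without a separate job script
    local cmd=(sbatch -p psdebugq -o "logs/%j.log" --wrap "python $ABS_PATH/jidarp_actual.py $*")
    if command -v sbatch >/dev/null 2>&1; then
        "${cmd[@]}"
    else
        # No SLURM on this host; print what would have been submitted
        echo "${cmd[*]}"
    fi
}

submit_analysis "$@"
```

As with the LSF wrapper, any parameters from the workflow definition are forwarded to the analysis script via "$@".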

...

This defines where the analysis is done. While many experiments use the SLAC psana cluster (SLAC) to perform their analysis, others prefer HPC facilities like NERSC. For now, only SLAC is supported.

1.1.4) Trigger

This defines the event in the data management system that kicks off the job submission.

...

If the job is automatically triggered, it will be executed as this user. If the job is manually triggered, it will be executed as the user triggering it manually. This is set when creating the job definition and cannot be changed.

1.1.6) Edit/delete job

Use the edit/trash icons to edit or delete a job definition.

...

Code Block
import os
import requests
requests.post(os.environ["JID_UPDATE_COUNTERS"], json=[ {"key": "<b>LoopCount</b>", "value": "75" } ])

2) Examples

2.1) arp_submit.sh

The executable script used in the workflow definition should primarily set up the environment and submit the analysis script to the HPC workload management infrastructure. For example, a simple executable script that uses LSF's bsub to submit the analysis script to the psdebugq queue is available here - /reg/g/psdm/tutorials/batchprocessing/jidarp_submit.sh

Code Block
#!/bin/bash
source /reg/g/psdm/etc/psconda.sh
ABS_PATH=/reg/g/psdm/tutorials/batchprocessing
bsub -q psdebugq -o "logs/%J.log" python $ABS_PATH/jidarp_actual.py "$@"

This script will submit /reg/g/psdm/tutorials/batchprocessing/jidarp_actual.py on psdebugq and store the log files in /reg/d/psdm/dia/diadaq13/scratch/logs/<lsf_id>. /reg/g/psdm/tutorials/batchprocessing/jidarp_actual.py will be passed the parameters as command line arguments and will inherit the EXPERIMENT, RUN_NUM and JID_UPDATE_COUNTERS environment variables.
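As a minimal sketch, the environment variables and parameters described above could be picked up in the analysis script as follows. The default values here are placeholders; the real values come from the job environment set up by the system.

```python
import os
import sys

# Sketch of how a submitted analysis script can read what is passed in.
# Defaults below are placeholders used only when run outside a real job.
experiment = os.environ.get("EXPERIMENT", "unknown")
run_num = int(os.environ.get("RUN_NUM", "0"))
update_url = os.environ.get("JID_UPDATE_COUNTERS")  # None outside a real job
params = sys.argv[1:]  # the parameters configured in the workflow definition
print("experiment=%s run=%d params=%s" % (experiment, run_num, params))
```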

2.2) arp_actual.py

This Python script is the code that will do the analysis, and whatever else is necessary, on the run data. Since this is just an example, the Python script, jidarp_actual.py, doesn't get that involved. It is shown below.

Code Block
#!/usr/bin/env python
import os
import sys
import requests
import time
import datetime
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

logger.debug("In the jid_actual script - current time is %s", datetime.datetime.now().strftime("%c"))

for k, v in sorted(os.environ.items()):
    logger.debug("%s=%s", k, v)

## Fetch the URL to post progress updates
update_url = os.environ.get('JID_UPDATE_COUNTERS')
logger.debug("The URL to post updates is %s", update_url)

# These are the parameters that are passed in
logger.debug("The parameters passed into the script are %s", " ".join(sys.argv))
 
loop_count = 20
try:
    loop_count = int(sys.argv[1])
except (IndexError, ValueError):
    # No argument passed, or not an integer; keep the default loop_count
    pass

## Run a loop, sleep a second, then POST
for i in range(loop_count):
    time.sleep(1)
    logger.debug("Posting for step %s", i)
    requests.post(update_url, json=[
        {"key": "<strong>Counter</strong>",
         "value": "<span style='color: red'>{0}</span>".format(i+1)},
        {"key": "<strong>Current time</strong>",
         "value": "<span style='color: blue'>{0}</span>".format(datetime.datetime.now().strftime("%c"))}
    ])


logger.debug("Done with job execution")
 

...

The logger.debug statements are sent to the job's log file. Note that one can form bsub/sbatch commands where the log output is not written to a logfile and is instead sent as an email. Part of an example log file output is shown below.
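For the email option just mentioned, a hedged sketch is shown below; the address, queue/partition names, and script names are illustrative. The run_or_show guard only exists so the sketch is safe to run on a machine without LSF or SLURM installed.

```shell
#!/bin/bash
# Illustrative only: addresses and script names are placeholders.
run_or_show () {
    # Execute the scheduler command if present; otherwise just print it.
    if command -v "$1" >/dev/null 2>&1; then "$@"; else echo "$@"; fi
}

# LSF: with no -o option, bsub mails the job report (including stdout)
# to the submitting user; -u redirects that mail to a given address.
run_or_show bsub -q psdebugq -u someone@example.com python jidarp_actual.py 100

# SLURM: request a mail notification when the job ends.
run_or_show sbatch -p psdebugq --mail-type=END --mail-user=someone@example.com myjob.sh
```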

No Format
DEBUG:__main__:In the jid_actual script - current time is Thu Apr 16 11:12:40 2020
...
DEBUG:__main__:EXPERIMENT=diadaq13
...
DEBUG:__main__:JID_UPDATE_COUNTERS=https://pswww.slac.stanford.edu/ws/jid_slac/jid/ws/replace_counters/5e98a01143a11e512cb7c8ca
...
DEBUG:__main__:RUN_NUM=26
...
DEBUG:__main__:The parameters passed into the script are /reg/g/psdm/tutorials/batchprocessing/jid_actual.py 100
DEBUG:__main__:Posting for step 0
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pswww.slac.stanford.edu:443
DEBUG:urllib3.connectionpool:https://pswww.slac.stanford.edu:443 "POST /ws/jid_slac/jid/ws/replace_counters/5e98a01143a11e512cb7c8ca HTTP/1.1" 200 195
DEBUG:__main__:Posting for step 1
...
DEBUG:__main__:Done with job execution

------------------------------------------------------------
Sender: LSF System 
Subject: Job 427001:  in cluster  Done
...

...