For long-running jobs, you can submit work to our batch farm to make use of the high-performance compute and storage at SLAC.
Before submitting a job to the batch queues, it is recommended that you test the job(s) on the interactive node at
ssh ocio-gpu01.slac.stanford.edu
When you're ready and have a command line or series of commands that you need to run, create a batch script (using vi, emacs, nano, etc.) with something like the following:
#!/bin/bash -l
#BSUB -a mympi
#BSUB -P cryoem
#BSUB -J my_batch_job_name
#BSUB -q slacgpu
#BSUB -n 4
#BSUB -R "span[hosts=1]"
#BSUB -W 72:00
#BSUB -e run.err
#BSUB -o run.out
#BSUB -B

# setup env
source /etc/profile.d/modules.sh
export MODULEPATH=/usr/share/Modules/modulefiles:/opt/modulefiles:/afs/slac/package/singularity/modulefiles
module purge
module load PrgEnv-gcc/4.8.5
module load relion/3.0

# change working directory
cd <datadir>

# run the command
relion_reconstruct --i Refine3D/job103/run_ct31_data.star --o Reconstruct/ewald1/half1_class001_unfil.mrc --subset 1 --angpix 2.14 --ctf --ewald --mask_diameter 696 --sectors 2 --width_mask_edge 5 --sym I1 > Reconstruct/ewald1/reconstruct_half1_ewald.log
The precise content of the batch script depends on what you wish to do. In the above, we've requested 4 'slots' (each of which maps roughly 1:1 to a CPU core) with the -n flag, and we've also requested that they all be on the same host (span[hosts=1]).
We are submitting into the slacgpu queue, have requested no GPUs, and have said with -W 72:00 that we don't expect this job to run longer than 72 hours.
You should change the final two commands in the script (the cd into the working directory and the relion_reconstruct command) to suit the job that you want to run.
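If you submit many similar jobs, it can help to generate the batch script rather than edit it by hand. The following is a minimal sketch, not a site-provided tool: make_batch_script is a hypothetical helper name, and it reuses the project, queue and host-span directives from the example above while taking the job name, slot count, wall-clock limit and command as arguments.

```shell
#!/bin/bash
# Hypothetical helper (not part of LSF or the SLAC environment):
# print an LSF batch script to stdout.
# Arguments: job name, number of slots, wall-clock limit (HH:MM), then the command to run.
make_batch_script() {
    local job_name="$1" slots="$2" walltime="$3"
    shift 3
    # Unquoted heredoc: the ${...} variables and $* are expanded, everything else is literal.
    cat <<EOF
#!/bin/bash -l
#BSUB -a mympi
#BSUB -P cryoem
#BSUB -J ${job_name}
#BSUB -q slacgpu
#BSUB -n ${slots}
#BSUB -R "span[hosts=1]"
#BSUB -W ${walltime}
#BSUB -e ${job_name}.err
#BSUB -o ${job_name}.out
$*
EOF
}
```

For example, make_batch_script my_test 8 24:00 hostname > batch_file would write a script requesting 8 slots for up to 24 hours. The environment setup lines (module load and so on) are left out of this sketch and would need to be added for a real job.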
If you are running a singularity container, you should use something like this:
singularity exec -B /gpfs,/scratch <path_to_container_file> python <python script> <args>
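Because a batch job may sit in the queue for some time before it starts, a mistyped container path is only discovered late. As a rough sketch, the command above could be wrapped in a small function that checks the image file exists first. The name run_in_container is hypothetical; the singularity exec invocation and bind paths are the ones shown above.

```shell
#!/bin/bash
# Hypothetical wrapper: run a Python script inside a Singularity container,
# failing early with a clear message if the image file is missing.
# Arguments: path to the container file, then the Python script and its arguments.
run_in_container() {
    local image="$1"
    shift
    if [ ! -f "$image" ]; then
        echo "container image not found: $image" >&2
        return 1
    fi
    # Same invocation as in the text: bind /gpfs and /scratch into the container.
    singularity exec -B /gpfs,/scratch "$image" python "$@"
}
```

Inside a batch script this would be called as run_in_container <path_to_container_file> <python script> <args>.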
When you're ready to run, use the following (note the Linux redirection of the batch file into the bsub command, which is what lets bsub read the #BSUB lines):
bsub < batch_file
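On submission, bsub normally prints a confirmation line of the form "Job <12345> is submitted to queue <slacgpu>.". If you want to keep the job ID for later use, a small filter can pull it out. This is a sketch that assumes that confirmation format; extract_job_id is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read bsub's confirmation line on stdin and print the numeric job ID.
# Assumes the usual "Job <N> is submitted to ..." message; prints nothing if it does not match.
extract_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Usage on the cluster (commented out so the block has no side effects):
#   job_id="$(bsub < batch_file | extract_job_id)"
#   echo "submitted as job ${job_id}"
```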
To monitor your jobs, you can run
bjobs
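If a later step depends on a job finishing, you can poll bjobs from a script instead of checking by hand. The sketch below assumes an LSF version whose bjobs supports the -noheader and -o options for customized output; wait_for_job is a hypothetical name, and the 60 second default interval is an arbitrary choice.

```shell
#!/bin/bash
# Hypothetical helper: poll a job's state until it finishes.
# Arguments: job ID, optional polling interval in seconds (default 60).
# Returns 0 if the job reaches DONE, 1 if it reaches EXIT or bjobs reports no state.
wait_for_job() {
    local job_id="$1" interval="${2:-60}" state
    while true; do
        # Assumption: bjobs accepts -noheader and -o stat; strip any padding whitespace.
        state="$(bjobs -noheader -o stat "$job_id" 2>/dev/null | tr -d '[:space:]')"
        case "$state" in
            DONE) return 0 ;;
            EXIT|"") return 1 ;;
        esac
        # Any other state (PEND, RUN, suspended, ...) means the job is still in the system.
        sleep "$interval"
    done
}
```

For a one-off look at a single job, bjobs -l <job_id> prints the detailed record, and bpeek <job_id> shows the output a running job has produced so far.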
Because the batch script includes the -B argument, LSF will email your SLAC unix account when the job is dispatched and begins running.