Introduction

We want to improve the robustness and reliability of the batch compute environment by applying tighter resource controls. The goal is to isolate jobs from each other and prevent them from consuming all the resources on a machine. LSF version 9.1.2 makes use of Linux Control Groups (cgroups) to limit the CPU cores and memory that a job can use. These cgroup-based restrictions are not currently in our production LSF configuration. We want to understand the potential impact to users and get feedback from stakeholders. I have outlined some examples below using our test cluster.

CPU core affinity

Take a look at this simple script named mploadtest.csh. The script explicitly forks several CPU-bound tasks. In reality many users submit "single-slot" jobs that may behave in a similar manner or call API functions that spawn multiple child threads or processes:

...
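The script itself is not reproduced here. As a rough illustration only (the body below is an assumption, not the actual mploadtest.csh), a script of this kind might simply fork a few timed CPU-bound loops and wait for them:

  #!/bin/csh -f
  # Illustrative sketch only; the real mploadtest.csh is not shown above.
  # Fork several CPU-bound child processes, then wait for them all.
  set ntasks = 4
  set i = 1
  while ($i <= $ntasks)
      # Each child spins the CPU for roughly 60 seconds.
      ( perl -e 'my $end = time + 60; 1 while time < $end;' ) &
      @ i++
  end
  wait

Any equivalent CPU-spinning children would do; the point is that a nominally "single-slot" job ends up driving several cores.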

Wait for the job to finish, then resubmit it, but this time to a host that has cgroups enabled. This time we also request CPU affinity in the job submission command: "bsub -q mpitest -m bullet0019 -R 'affinity[core:membind=localprefer]' ./mploadtest.csh". Observe the job again using the per-core load view in top. This time you should see that all of the load is associated with a single core. The number of assigned cores will match the number of job slots, so submitting the job with "-n 3" will result in the job using 3 CPU cores.
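To confirm which cores the job is actually allowed to run on, a couple of standard Linux commands can be used on the execution host. The <pid> placeholder below must be replaced with a process belonging to the job, and taskset (from util-linux) may not be installed on every host:

  # Inside top, press "1" to toggle the per-core load view.
  top -u <your_uid>

  # Show the CPU affinity list for one of the job's processes:
  taskset -cp <pid>

  # Equivalent information straight from /proc:
  grep Cpus_allowed_list /proc/<pid>/status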

Memory limit enforcement

This C program, named memeater, consumes physical memory over a fairly short time period. It allocates and references 100MB every 2 seconds for 10 iterations, roughly 1GB in total:

...
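The source is not included above. A minimal sketch of a program that behaves as described (allocate and touch 100MB every 2 seconds for 10 iterations) could look like the following; this is an illustrative approximation, not the actual memeater source:

  /* Illustrative sketch only -- not the actual memeater source.
   * Allocates and touches 100 MB every 2 seconds, 10 times (~1 GB held in total). */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  #define CHUNK_MB   100
  #define ITERATIONS 10

  int main(void)
  {
      size_t chunk = (size_t)CHUNK_MB * 1024 * 1024;
      for (int i = 0; i < ITERATIONS; i++) {
          char *p = malloc(chunk);
          if (p == NULL) {
              perror("malloc");
              return 1;
          }
          /* Touch every byte so the allocation is actually made resident. */
          memset(p, 1, chunk);
          printf("iteration %d: ~%d MB resident\n", i + 1, (i + 1) * CHUNK_MB);
          fflush(stdout);
          sleep(2);
      }
      return 0;
  }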

Compile the executable and submit it as a job to a specific host that does not have cgroups enabled, specifying a physical memory limit of 350MB: "bsub -q systems -m pinto21 -M 350 ./memoryeater". Open a terminal session on the target batch host and run "top -a -u <your_uid>" to monitor processes ordered by memory usage. The job should run for ~20 seconds, and while it is running you should see the amount of physical resident memory it is using ("RES") increase. The job will run beyond the requested 350MB limit and will probably complete and exit normally. This is because LSF only checks memory use periodically, so a job may exceed its memory limit between check intervals. Next, submit the same job with the 350MB limit, but specify a host with cgroups enabled: "bsub -q systems -m bullet0020 -M 350 ./memoryeater". Observe the memory use on your chosen host as before using "top -a -u <your_uid>". This time you should see that LSF kills the job as soon as the memory limit is exceeded.
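For the cgroup-enabled case, you can also inspect the limit that LSF wrote into the job's memory cgroup. The exact mount point and cgroup path vary by host and LSF configuration, so treat the paths below as placeholders:

  # Find the memory cgroup that the job's process was placed in
  # (replace <pid> with a PID belonging to the job):
  grep memory /proc/<pid>/cgroup

  # Read the hard limit set for that cgroup; substitute the path reported
  # above and your host's actual cgroup mount point:
  cat /sys/fs/cgroup/memory/<cgroup_path>/memory.limit_in_bytes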

...