Success is expected to be incremental and to depend on the user. Job statistics can be prepared from the LSF logs, and throughput and latency can be charted for the various classes of parallel jobs. Interpreting these charts will require input from the parallel job users, and information is now being gathered for this purpose. These statistics can be reviewed in early December.

Strawman success metrics:

  • Throughput: ~1000 parallel slots in use when demand is sufficient.
  • Latency: 1-2 hours for <=128 cores; <1 day for <=512 cores; <2 days for ~1024 cores (TBD whether special arrangements are needed).
  • Runtime limits: a job that waits 2 days to run and then gets only a short run time has a poor duty cycle, so this needs further thought.
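
The latency and duty-cycle items above could be checked per job with something like the following minimal sketch. It is only an illustration: the `Job` record and its field names are hypothetical and do not reflect the LSF log format, "~1024 cores" is treated as "<=1024", the upper end of each latency range is treated as an inclusive limit, and duty cycle is assumed to mean run time divided by wait time plus run time.

```python
from dataclasses import dataclass

# Strawman latency targets as (max_cores, max_wait_hours).
# Assumption: "~1024 cores" is read as "<= 1024 cores".
LATENCY_TARGETS = [(128, 2.0), (512, 24.0), (1024, 48.0)]


@dataclass
class Job:
    """Hypothetical per-job record; field names are illustrative only."""
    cores: int
    wait_hours: float  # time spent pending before the job started
    run_hours: float   # time the job actually ran


def latency_target(cores):
    """Return the strawman wait-time target in hours, or None if undefined."""
    for max_cores, max_wait in LATENCY_TARGETS:
        if cores <= max_cores:
            return max_wait
    return None  # the strawman sets no target above ~1024 cores


def meets_latency(job):
    """True if the job waited no longer than the target for its size."""
    target = latency_target(job.cores)
    return target is not None and job.wait_hours <= target


def duty_cycle(job):
    """Assumed definition: fraction of elapsed time spent running."""
    total = job.wait_hours + job.run_hours
    return job.run_hours / total if total > 0 else 0.0
```

Under these assumptions, a 1024-core job that waits 48 hours and then runs for 2 hours meets the latency target but has a duty cycle of only 4%, which is the concern raised in the runtime-limits item.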