Below are the questions posed by the main SLAC batch system users and administrators:

User Group 1
  • Automatic job preemption/suspend/resume?
  • Support for multiple-levels of job preemption (e.g. 3-queue hierarchy)?
  • Job environment propagation (including limits like "stacksize")?
  • Subgroup-specific priority calculation (queue-specific priority formula)?
  • Capability to delegate subgroup administration privileges (adjust job priorities, suspend, resume, kill) to subgroup administrators?
  • Cross-queue fairshare (with cpu-speed weighting)?
  • CPU advanced reservations for MPI?
  • GPU support?
  • Ability to submit jobs to hosts where we don't have accounts/home-directories?
  • Avoid bad behavior when the MPI head node reboots (slave node processes get "forgotten")?
User Group 2
  • Please list supported operating systems (for submission hosts and for execution hosts)
  • Explicit resource specification using site-specific names (e.g., rhel5-64, amount of /scratch space, etc.) at job submission time?
  • API for "time remaining" query (to save state near end of job)
  • Do submission / management hosts require a license?
    • Ability to submit/monitor jobs from any machine at SLAC (not just those with licenses, as is the case with LSF)
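The "time remaining" question above can be illustrated with a minimal sketch of how a job might use such a query. The `seconds_remaining` callable is a hypothetical stand-in for whatever API the batch system would provide; the function name, the `checkpoint` callback, and the 60-second margin are assumptions made for illustration only.

```python
from typing import Callable, List

WorkItem = Callable[[], None]


def run_until_deadline(
    work_items: List[WorkItem],
    seconds_remaining: Callable[[], float],  # hypothetical batch-system query
    checkpoint: Callable[[List[WorkItem]], None],
    margin_s: float = 60.0,
) -> List[WorkItem]:
    """Run work items in order; save state once the time left drops below margin_s.

    Returns the items that were not processed (empty if everything finished).
    """
    pending = list(work_items)
    while pending:
        # Ask the (assumed) batch API how much run time the job has left.
        if seconds_remaining() < margin_s:
            # Not enough time for another item: hand the leftovers to the caller.
            checkpoint(pending)
            break
        item = pending.pop(0)
        item()
    return pending
```

A job would pass in a function that wraps the batch system's real query and a `checkpoint` callback that writes the pending work to disk.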
User Group 3
  • Resource arbitration (i.e. ability to easily control the number of running jobs to prevent overloading of disks and other resources, such as memory)
  • Reliable notification to the job when CPU time or other resources are about to be exhausted
  • Built-in support for virtualization
    • Ability to suspend/resume/move running jobs (MPI and single-core)
  • API (preferably rest/xml/json) for submitting, monitoring, controlling jobs, in addition to command-line control
  • Can resource allocation within a subgroup be delegated to subgroup administrators?
    • e.g. today I want 90% of resources dedicated to one subgroup within my allocation, tomorrow only 10%
  • Hooks for adding extensions (e.g. Kerberos ticket or AFS token support) without introducing maintenance problems
User Group 4
  • Knowledge of cluster topology to assign large jobs across nodes in an optimized way. For example, an MPI job with 256 cores would best be assigned to as few nodes as possible, and those nodes should span as few Infiniband/ethernet switches as possible.
  • Ability to request exclusive access to nodes: this is to support using the full node's memory on a subset of the node's cores
  • Ability to schedule using node properties and resources: memory, gpu cores, cpu cores, i/o connectivity, other user defined properties.
  • Job priority schemes that support coexistence of small and large jobs.  This likely includes issues of pre-emption, backfill, checkpointing, migration and restarting.  Also includes aggregation of single core jobs onto nodes with other similar jobs.
  • Ability to partition the cluster / support multiple clusters (by host capabilities, such as IB-capable, GPU nodes, other per-node attributes)?
  • Ability to automatically enable preemption at certain times of day?
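To make the topology question above concrete, here is a minimal sketch of a greedy placement heuristic that tries to keep a job on few switches and few nodes. It is only an illustration of the desired behavior, not how any particular batch system works; the `Node` tuple layout and the `place_job` name are assumptions, and a real scheduler would likely use a more careful search.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Node = Tuple[str, str, int]  # (node name, switch name, free cores): assumed layout


def place_job(nodes: List[Node], cores_needed: int) -> Optional[Dict[str, int]]:
    """Greedily assign cores, preferring few switches and then few nodes.

    Returns {node name: cores taken}, or None if the cluster cannot fit the job.
    """
    if cores_needed <= 0:
        raise ValueError("cores_needed must be positive")

    by_switch: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for name, switch, free in nodes:
        if free > 0:
            by_switch[switch].append((name, free))

    # Visit the switches with the most free cores first, so that the job
    # tends to span as few switches as possible.
    switches = sorted(
        by_switch, key=lambda s: sum(f for _, f in by_switch[s]), reverse=True
    )

    allocation: Dict[str, int] = {}
    remaining = cores_needed
    for switch in switches:
        # Within a switch, fill the nodes with the most free cores first,
        # so that the job tends to use as few nodes as possible.
        for name, free in sorted(by_switch[switch], key=lambda n: n[1], reverse=True):
            take = min(free, remaining)
            allocation[name] = take
            remaining -= take
            if remaining == 0:
                return allocation
    return None  # not enough free cores anywhere
```

Because the heuristic is greedy, it does not guarantee the minimum number of switches in every case; it only shows the kind of preference a topology-aware scheduler would be expected to apply.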

User Group 5

  • Is the batch system supported by Open Science Grid?

User Group 6

  • Software should be installed via RPM or similar, both in terms of reducing the reliance on a shared file system, and in terms of simplification of system install process
  • Per-node configuration should be fairly simple, and easily manageable via a configuration management tool (e.g. chef).
  • Servers must support semi-high-availability, at least as much as our existing LSF setup (at least 2 redundant batch-manager nodes).
  • Authorization information (e.g. list of authorized users) can be specified from multiple simultaneous sources (e.g. NIS, LDAP, some flat files) in a fairly logical manner (for example, a local flat file can override a central LDAP list).
  • Where possible, we want to be able to perform meaningful queries regarding current queue and job states and get pretty-and-useful reports back.  This is true both for command-line operations *and* web operations.
  • From the Unix side, interaction should occur over well-documented and consistent ports (so that we can handle firewalls properly).
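The override behavior asked for in the authorization item above (a local flat file taking precedence over a central LDAP list) could look roughly like the following sketch. The sources are modeled as plain dictionaries for illustration; the `is_authorized` name, the dictionary representation, and the deny-by-default policy are assumptions, not a description of any existing batch system.

```python
from typing import Dict, List


def is_authorized(user: str, sources: List[Dict[str, bool]]) -> bool:
    """Check sources in priority order; the first one that mentions the user wins.

    Each source maps a user name to True (allowed) or False (denied).
    Users not listed in any source are denied (an assumed default).
    """
    for source in sources:
        if user in source:
            return source[user]
    return False


# Example data: a local flat file listed first overrides the central LDAP list.
local_file = {"alice": False, "carol": True}
central_ldap = {"alice": True, "bob": True}
sources_in_priority_order = [local_file, central_ldap]
```

With this ordering, `alice` is denied by the local file even though LDAP allows her, while `bob` falls through to LDAP and is allowed.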

User Group 7

  • Integration with AFS (i.e. it must be possible for single-node and MPI jobs to acquire the user's kerberos and AFS identities)?
    • Does the system (at a minimum) provide hooks to implement such a feature?
  • Real-time and historical monitoring, accounting and analytics (performance, utilization, resource consumption, etc.) available through web interface, command line, other APIs (e.g. for embedding in other web applications).
    • It is fine if this is an add-on capability
  • Short list of resource allocation / scheduling schemes (such as fairshare, backfill, etc.) with brief explanations
  • Job submission and management through a web portal?

User Group 8

  • How are licenses managed (i.e. is there a requirement for a license server, if yes, does it allow for redundancy)?
  • Ability to schedule jobs based on file server load?