h2. Introduction

This Pipeline facility has five major functions:

  • automatically process Level 0 data through reconstruction (Level 1)
  • provide near real-time feedback to IOC
  • facilitate the verification and generation of new calibration constants
  • produce bulk Monte Carlo simulations
  • back up all data that passes through

The pipeline database, the pipeline server, and the diagnostics database have been specified.

The Pipeline can be expressed as five components:

  1. database access layer
  2. execution layer
  3. scheduler
  4. user interface
  5. relational database (management system): RDBMS

The scheduler is the main loop of the Pipeline. This long-running process polls the database for new tasks and dispatches them to the execution layer.
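The polling loop can be sketched as follows. This is a minimal illustration, not the Pipeline's actual code: `run_scheduler`, `fetch_new_tasks`, `dispatch`, `poll_interval`, and `max_polls` are hypothetical names, and the task names in the demo are made up. The real scheduler would run without a poll limit.

```python
import time
from typing import Callable, Iterable, List, Optional


def run_scheduler(
    fetch_new_tasks: Callable[[], Iterable[str]],
    dispatch: Callable[[str], None],
    poll_interval: float = 5.0,
    max_polls: Optional[int] = None,
) -> int:
    """Poll for new tasks and hand each one to the execution layer.

    fetch_new_tasks stands in for a database access layer query and
    dispatch stands in for the execution layer (both hypothetical).
    max_polls=None means run forever, as a long-running process would.
    Returns the number of tasks dispatched.
    """
    dispatched = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        for task in fetch_new_tasks():
            dispatch(task)
            dispatched += 1
        polls += 1
        time.sleep(poll_interval)
    return dispatched


# Demo with in-memory stand-ins: three polls, the second finds nothing new.
batches = [["recon-001", "recon-002"], [], ["recon-003"]]
launched: List[str] = []
count = run_scheduler(
    fetch_new_tasks=lambda: batches.pop(0) if batches else [],
    dispatch=launched.append,
    poll_interval=0.0,
    max_polls=3,
)
```

Passing the query and the dispatcher in as callables keeps the loop independent of both the database and the batch system, which matches the separation of components listed above.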

The execution layer exists to abstract site-specific details about how computing resources are invoked. It handles launching jobs and collecting output. At SLAC, this will be a thin wrapper around the LSF batch system toolchain. Other implementations will support simple clusters of machines using SSH for remote invocation, and single-machine use, where jobs are launched on the same machine as the scheduler.
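One way to picture this abstraction is a common interface with one implementation per site. The sketch below is an assumption about how such a layer might look, not the Pipeline's design: the class names `Executor`, `LocalExecutor`, `SshExecutor`, and `JobResult` are hypothetical, and the host name is invented. An LSF implementation would follow the same pattern by wrapping the site's batch submission command.

```python
import shlex
import subprocess
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class JobResult:
    returncode: int
    output: str


class Executor(ABC):
    """Hypothetical interface: the scheduler only ever calls run()."""

    @abstractmethod
    def build_argv(self, command: List[str]) -> List[str]:
        """Translate a job command into the site-specific invocation."""

    def run(self, command: List[str]) -> JobResult:
        # Launch the job, wait for it, and collect its standard output.
        proc = subprocess.run(
            self.build_argv(command), capture_output=True, text=True
        )
        return JobResult(proc.returncode, proc.stdout)


class LocalExecutor(Executor):
    """Single-machine use: run the job on the scheduler's own host."""

    def build_argv(self, command: List[str]) -> List[str]:
        return list(command)


class SshExecutor(Executor):
    """Simple cluster use: run the job on a remote host over SSH."""

    def __init__(self, host: str) -> None:
        self.host = host

    def build_argv(self, command: List[str]) -> List[str]:
        # Quote the command so the remote shell sees the original arguments.
        return ["ssh", self.host, shlex.join(command)]


# Demo: run a trivial job locally; only build (not run) the SSH invocation.
result = LocalExecutor().run([sys.executable, "-c", "print('ok')"])
remote_argv = SshExecutor("node01").build_argv(["echo", "hi there"])
```

Because only `build_argv` differs between sites, the scheduler never needs to know which kind of resource it is dispatching to.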

The database access layer contains all SQL queries and statements required by other parts of the system. Because the rest of the system knows nothing about the database, it is isolated from changes to both the schema and the database engine.
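The isolation described here can be illustrated with a small class that owns every SQL statement. This is only a sketch under stated assumptions: the `TaskStore` class, the `task` table, and its `name` and `status` columns are hypothetical, and Python's built-in `sqlite3` is used as a stand-in so the example runs anywhere (the production RDBMS is Oracle).

```python
import sqlite3
from typing import List


class TaskStore:
    """All SQL lives in this class; callers see only Python methods."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        # Hypothetical schema, created here only to keep the demo runnable.
        with self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS task ("
                "name TEXT PRIMARY KEY, status TEXT NOT NULL)"
            )

    def add_task(self, name: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO task (name, status) VALUES (?, 'new')", (name,)
            )

    def new_tasks(self) -> List[str]:
        rows = self._conn.execute(
            "SELECT name FROM task WHERE status = 'new' ORDER BY name"
        )
        return [row[0] for row in rows]

    def mark_dispatched(self, name: str) -> None:
        with self._conn:
            self._conn.execute(
                "UPDATE task SET status = 'dispatched' WHERE name = ?", (name,)
            )


# Demo against an in-memory database.
store = TaskStore(sqlite3.connect(":memory:"))
store.add_task("recon-001")
store.add_task("recon-002")
store.mark_dispatched("recon-001")
remaining = store.new_tasks()
```

If the schema or the engine changes, only the statements inside this class need to change; a scheduler that calls `new_tasks()` and `mark_dispatched()` is unaffected.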

The RDBMS is an Oracle instance, hosted by SCS.