Pipeline II
Pipeline II is an upgrade of the existing pipeline server. It is likely to be a somewhat scaled-back version of our earlier plans for Pipeline NG. It should use a database schema similar to that of the current pipeline, but extended to handle new requirements.
Tasks
- Gather requirements
- Design enhanced database requirements
- Design new XML input format
Requirements
- Task scheduling should be more flexible than the current linear chain
- Should support parallel execution of tasks
- Should allow the dependency chain to be more general than file requirements
- Should support parallel sub-tasks, with number of sub-tasks defined at runtime
- Perhaps support conditions based on external dependencies
- Should allow for remote submission of jobs
- Perhaps using a GRID batch submission component, or a Glast-specific batch submission system
- Will need to generalize current system (e.g. get rid of absolute paths)
- Support reprocessing of data without redefining task
- Need a way to mark a Done task as "ReRunnable"
- Need to support multiple versions of output files
- Ability to prioritize tasks
- Ability to work with "disk space allocator"
- Would be nice to set parameters (env vars) in task description
- Would be nice to be able to pass in parameters in "createJob"
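As a rough illustration of the scheduling requirements above (a general dependency graph instead of a linear chain, parallel sub-tasks whose number is fixed at runtime, and task priorities), here is a minimal Python sketch. It is not a proposed design. The task names (digitize, recon_N, merge) and the helpers build_graph and execution_waves are hypothetical, invented only for this example, and it assumes Python 3.9+ for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter


def build_graph(n_subtasks):
    """Build a hypothetical task graph: task name -> set of tasks it depends on.

    The number of parallel sub-tasks is only known at runtime.
    """
    subtasks = [f"recon_{i}" for i in range(n_subtasks)]
    graph = {"digitize": set()}
    for name in subtasks:
        graph[name] = {"digitize"}      # each sub-task waits for digitize
    graph["merge"] = set(subtasks)      # merge waits for every sub-task
    return graph


def execution_waves(graph, priority=None):
    """Group tasks into waves; tasks within a wave could run in parallel.

    Within a wave, higher-priority tasks come first (default priority 0),
    with ties broken by name. Raises graphlib.CycleError on a cyclic graph.
    """
    priority = priority or {}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(),
                       key=lambda task: (-priority.get(task, 0), task))
        waves.append(ready)
        sorter.done(*ready)             # mark the whole wave as finished
    return waves


if __name__ == "__main__":
    for wave in execution_waves(build_graph(3), {"recon_2": 5}):
        print(wave)
```

A real scheduler would dispatch each ready task to the batch system as soon as its own dependencies finish, rather than waiting for a whole wave. The wave grouping is used here only to keep the sketch short.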
Things Tony doesn't understand about current system
- What is the purpose of batch-configuration? (It seems to muddle several concepts.)
Current Action Items
- Get dump of current SQL schema
- Tony needs to learn to use DbVisualizer