Processing Detail View by Task
Last changed: Mar 31, 2005 18:07 by Daniel Flath
This post was created to provide details on the Processing Detail View (PFE-40) and access to PDF docs on the subject. A PDF file showing details of the feature request is available here: http://confluence.slac.stanford.edu/download/attachments/680/processing-view.pdf?version=1
Thoughts on pipeline post Data Handling Meeting
Last changed: Jan 16, 2005 17:01 by Richard Dubois
Here are what I take to be the primary issues that need addressing in the next iteration of the pipeline:
High level
Mid Level
Details
More to come
svac pipelines doc
protecting output files
archiving strategy
This is another item we will need very soon. Navid has made a perl interface to the archive system, but the open issue is how and when to archive. Since the I&T pipelines are parallel, several differently named tasks operate on the same run and write to the same directory. In addition, not all files are reported to the pipeline, yet people still want them archived, so we have to archive the entire directory. Ideally we would prevail on everyone to identify every file they want archived, but we seem to be on the losing end of that one! I think this means the archiving has to happen asynchronously to the pipeline. I'd be curious to see a comment on this blog item from Dan with his thoughts on the algorithm for figuring out what to archive and when.
Note that SCS asks that we keep the archived files larger than 500 MB to use the tapes efficiently, so we had been thinking of making tar files. Navid keeps track in his archiver db of the file content inside each tar file, so he can ask the archive system for the right tar file when someone asks for an individual file.
I'm at a bit of a loss at the moment to divine a general way to know when one can archive. It would be nice not to need custom archiving per group of tasks.
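To make the tar-file idea concrete, here is a minimal sketch of grouping a directory's files into bundles that each reach the 500 MB floor, plus an index from file name to bundle so an individual file can be located later. The function names (`plan_bundles`, `build_index`) and the policy of merging an undersized remainder into the last bundle are assumptions for illustration, not a description of Navid's archiver. It also does not address the harder question of when a directory is ready to archive.

```python
from typing import Dict, List, Tuple

# The 500 MB floor requested by SCS, expressed in bytes.
MIN_BUNDLE_BYTES = 500 * 1024 * 1024


def plan_bundles(files: List[Tuple[str, int]],
                 min_bytes: int = MIN_BUNDLE_BYTES) -> List[List[str]]:
    """Group (path, size) pairs into bundles totalling at least min_bytes each.

    Hypothetical policy: close a bundle as soon as it reaches the floor, and
    merge any undersized remainder into the last bundle.
    """
    bundles: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in files:
        current.append(path)
        current_size += size
        if current_size >= min_bytes:
            bundles.append(current)
            current, current_size = [], 0
    if current:
        if bundles:
            bundles[-1].extend(current)
        else:
            # Everything together is still under the floor: one small bundle.
            bundles.append(current)
    return bundles


def build_index(bundles: List[List[str]]) -> Dict[str, int]:
    """Map each file path to its bundle number, for single-file retrieval."""
    return {path: i for i, bundle in enumerate(bundles) for path in bundle}
```

Each bundle would then become one tar file, and the index plays the role of the per-tar content record kept in the archiver db.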
wrappers and code versions
It can be hard to find the actual code that does the work. In my recent allGamma-GR-v5r0p2 task, I have one task process configured as:
I realize that gleam.pl is not necessary; GleamWrapper.pl could easily have done the work. I had based my task on Warren's recon task, where his 'gleam.pl' builds the shell script, and slavishly kept his structure. But I realized that nowhere do we record the version of the underlying code that is run: nowhere in the database do we actually record the version of GlastRelease. We do have a spot to record the version of GleamWrapper.pl (the only executable Gino knows about per task process), though the xml configurator does not allow setting this version. For some executables, and GlastRelease is an important one, we could use the version number to access the code. The Release Manager builds the releases and maintains a database giving access to them. It would be good both to record the important version number and to allow the code to be found automatically, rather than (e.g.) hardwired into my shell script as I'm doing now.
more Gino use cases
There are 2 more use cases I expect we will need to handle:
Gino as a server
At the moment Gino is run from cron. When it wakes up, it checks to see if an instance of itself is already running, and exits if it is, so as not to step on itself. Gino is also fairly verbose in generating a log file (which filled the glast04 /pipeline/ partition last week and has been moved to u12/pipeline/). The log is rather hard to parse, since Gino spawns processes that write to it asynchronously, and it is huge. If we want to check aliveness with the resource checker, the best option currently is to check the last touched date on the log file. The Gino process itself cannot respond to queries.
Matt has suggested we move towards a java server, initially wrapping the scheduler perl script. He says the java wrapper can (out of the box, more or less, I think):
I imagine there are other features I am forgetting. It would be nice if Matt could elaborate, giving a fuller feature list, a pointer to further reading and perhaps a simple demonstration example that wraps a perl script issuing a print statement or two?
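For reference, the two mechanisms described above (exit if another instance is running, and judge aliveness from the log file's last-touched time) can be sketched as follows. This is an illustration only: Gino itself is a perl script, and the lock-file approach, the function names, and the paths are assumptions rather than how Gino actually detects a running instance.

```python
import os
import time
from typing import Optional


def acquire_lock(lock_path: str) -> bool:
    """Try to become the single running instance.

    Returns True if the lock file was created by this call, False if it
    already existed (another instance is presumed to be running).
    """
    try:
        # O_EXCL makes the create fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return True


def release_lock(lock_path: str) -> None:
    """Remove the lock file when the run finishes."""
    os.remove(lock_path)


def log_is_fresh(log_path: str, max_age_seconds: float,
                 now: Optional[float] = None) -> bool:
    """Aliveness check: was the log modified within max_age_seconds?"""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(log_path)) <= max_age_seconds
```

One weakness of a plain lock file is that a crashed run leaves it behind, so a real version would need some way to detect a stale lock. A server process that can answer queries directly would avoid both workarounds.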
task description and user supplied batch options
I notice there is no place in the task table for a description of the task; one would be nice. Also, it could be useful to allow the user to add options to the bsub command. One that comes to mind immediately is the -R option: one might be willing to trade time waiting for a job to start for the factor of 2-3 gain in CPU between the barb and noma batch workers. Of course both would have to be configurable from the web front end.
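As a rough sketch of what user-supplied batch options might look like on the submitting side, the function below splices an option string into a bsub argument list, accepting only flags on an allow-list. The function name, the allow-list policy, and the placeholder resource string are assumptions for illustration; the actual resource requirement that selects a class of batch worker is site-specific and is not given here.

```python
import shlex
from typing import List, Sequence


def build_bsub_command(job_command: List[str],
                       user_options: str = "",
                       allowed_flags: Sequence[str] = ("-R",)) -> List[str]:
    """Build the argument list for a bsub submission.

    user_options is a shell-style string of flag/value pairs, for example
    '-R "<resource requirement>"'. Flags outside allowed_flags are rejected.
    """
    tokens = shlex.split(user_options)
    if len(tokens) % 2 != 0:
        raise ValueError("expected flag/value pairs in user options")
    for flag in tokens[0::2]:
        if flag not in allowed_flags:
            raise ValueError("option not permitted: " + flag)
    return ["bsub", *tokens, *job_command]


# Example with a placeholder resource string (hypothetical value).
command = build_bsub_command(["GleamWrapper.pl"], '-R "RESOURCE_STRING"')
```

Returning a list rather than a single string lets the caller pass it to a process launcher without a second round of shell quoting.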
New status codes and front end filters requests
Last changed: Dec 18, 2004 14:10 by Richard Dubois
Posted from Warren's pipelinelist entry http://www-glast.stanford.edu/protected/mail/pipeline/0140.html
A new run status, OldFail, or AcknowledgedFailure or something, which I could manually set runs/processes in the current Fail state to after investigating the failure. This would simplify debugging as I wouldn't have to wade through piles of old failures to find the (hopefully) few new ones.
The ability to filter the run list on run id (a list of ranges would be
The ability to filter the run list based on run status. Ideally I'd be able to OR and NOT them as well.
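The filter requests could be sketched roughly as below: a run is kept if its id falls in any of the given ranges, its status is one of an included set (the OR), and its status is not in an excluded set (the NOT). The `Run` type, the `filter_runs` name, and the "Done" status are hypothetical; only Fail and the proposed OldFail come from the request above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Run:
    run_id: int
    status: str


def filter_runs(runs: Iterable[Run],
                id_ranges: Optional[List[Tuple[int, int]]] = None,
                include: Optional[Set[str]] = None,
                exclude: Optional[Set[str]] = None) -> List[Run]:
    """Select runs by id range and status.

    id_ranges: inclusive (low, high) pairs; a run matches if it is in any one.
    include:   keep only runs whose status is in this set (OR of statuses).
    exclude:   drop runs whose status is in this set (NOT).
    A criterion left as None is not applied.
    """
    selected = []
    for run in runs:
        if id_ranges is not None and not any(
                low <= run.run_id <= high for low, high in id_ranges):
            continue
        if include is not None and run.status not in include:
            continue
        if exclude is not None and run.status in exclude:
            continue
        selected.append(run)
    return selected


runs = [Run(1, "Fail"), Run(2, "OldFail"), Run(5, "Done"), Run(9, "Fail")]
# Runs 1 through 5, hiding failures that were already acknowledged.
recent = filter_runs(runs, id_ranges=[(1, 5)], exclude={"OldFail"})
```

With an OldFail status in place, the common debugging view is simply `include={"Fail"}`, which shows only failures nobody has looked at yet.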
