The pipeline can now be scheduled for shutdown by creating a file in the pipeline installation directory. The file must be named "shutdown_schedule" and must contain exactly 2 lines, both of which are dates in a form understood by the Unix date command. Once the first date has passed, the monitor (which runs every 5 minutes from cron) will shut down the pipeline and will not restart it until the second date has passed or the file has been removed. (The second date can also be changed to the current time to force the monitor to restart the pipeline on its next execution.)
As an example, the following file will be used to turn off the pipeline during the Sept 30th computing center 1st-floor power outage:
[dflath@glastlnx13 prod]$ pwd
/afs/slac.stanford.edu/u/gl/glast/pipeline-II/prod
[dflath@glastlnx13 prod]$ cat shutdown_schedule
Wed Sep 30 04:25:00 PDT 2009
Wed Sep 30 17:00:00 PDT 2009
[dflath@glastlnx13 prod]$
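The decision the monitor makes on each cron run can be sketched roughly as follows. This is only an illustration of the rule described above, not the monitor's actual code: the function names are hypothetical, blank lines are ignored as a convenience, the window is assumed to include the first date and exclude the second, and date parsing is delegated to GNU `date -d`, which is assumed to be available.

```python
import subprocess
from pathlib import Path


def parse_with_date(text):
    """Convert a date string to epoch seconds via GNU `date -d` (assumed available)."""
    out = subprocess.run(
        ["date", "-d", text, "+%s"],
        check=True, capture_output=True, text=True,
    )
    return int(out.stdout.strip())


def scheduled_down(schedule_lines, now, parse=parse_with_date):
    """Return True if the pipeline should be down at `now` (epoch seconds)."""
    # Ignore blank lines so a trailing newline does not count as a third line.
    lines = [ln.strip() for ln in schedule_lines if ln.strip()]
    if len(lines) != 2:
        raise ValueError("shutdown_schedule must contain exactly 2 date lines")
    start, end = parse(lines[0]), parse(lines[1])
    # Down once the first date has passed, up again once the second has passed.
    return start <= now < end


def pipeline_should_be_down(install_dir, now):
    """Hypothetical monitor hook: no schedule file means no scheduled shutdown."""
    path = Path(install_dir) / "shutdown_schedule"
    if not path.exists():
        return False
    return scheduled_down(path.read_text().splitlines(), now)
```

Passing a different `parse` callable makes the rule easy to exercise without relying on the system `date` command.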
See:
https://jira.slac.stanford.edu/browse/PII-398
And:
We would like to upgrade the xrootd server version for the Fermi xrootd cluster from 20090202-1402 to 20090721-0636.
The main reasons for the change are an improvement in the xrootd server and a configuration change:
As with every xrootd version, basic tests were done: reading from and writing to xrootd, and testing the client admin interface (rm, stat, checksum, ...).
The new version has been installed as a test version on the Fermi xrootd cluster which allows access to the glast data. Tests were performed to read and write to the new version. Reprocessing test jobs were successfully run against the server and the new version was also used for L1 tests.
The test xrootd has been set up for directory removal (rmdir). It has been successfully used for some production testing.
To switch the servers back to the old version, the production link has to be set to the old version and all xrootd servers have to be restarted.
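As a rough sketch of the rollback step, assuming the "production link" is a symbolic link that points at a version directory (the source does not spell this out), the link can be repointed atomically by creating a temporary link and renaming it over the old one. The function name and the `.tmp` suffix are illustrative only.

```python
import os


def point_link(link_path, target):
    """Repoint the symlink at `link_path` to `target` in one atomic step (POSIX)."""
    tmp = link_path + ".tmp"
    # Clear any leftover temporary link from an interrupted earlier attempt.
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    # rename(2) replaces the existing link itself, so readers never see it missing.
    os.replace(tmp, link_path)
```

Repointing the link only changes which version is started next; the xrootd servers still have to be restarted for the switch to take effect.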
https://jira.slac.stanford.edu/browse/SSC-227
To allow production accounts (glastraw, glastxrw, glastmc and glast) to remove directory trees, the xrootd forward method is used. The redirector will be configured to forward rmdir requests to all data servers. On receiving such a request, each data server will execute a script that first checks whether the directory is eligible for removal and then removes all files and directories below the specified directory. The xrootd configuration changes are:
To deploy a new xrootd version the following steps are required:
The restart should take less than five minutes. Stopping the redirectors first prevents clients from being redirected and avoids the chance that a file is not found because a data server is being restarted. The clients will wait while the xrootds are down and reconnect once the data servers and redirectors are up.
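The ordering rationale above can be illustrated with a small planning function. This is a sketch only: the host names are placeholders, and bringing the redirectors back up last is an assumption inferred from the reasoning above rather than a documented step.

```python
def restart_plan(redirectors, data_servers):
    """Return (action, host) pairs in the order they would be carried out."""
    # Redirectors go down first so no client is sent to a restarting data server.
    plan = [("stop", host) for host in redirectors]
    # Data servers restart while no redirections are being handed out.
    plan += [("restart", host) for host in data_servers]
    # Assumption: redirectors come back only after the data servers are up.
    plan += [("start", host) for host in redirectors]
    return plan


if __name__ == "__main__":
    # Placeholder host names, for illustration only.
    for action, host in restart_plan(["rdr1", "rdr2"], ["ds1", "ds2", "ds3"]):
        print(action, host)
```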
1.3.5 is built against a patched version of the Data-Handling-Common library which allows database connections to be removed from the connection pool as they age (and replaced with freshly created connections). It also contains monitoring and run-time configuration capabilities.
The patch has been tested in DEV and works as expected.
This is intended to address the memory leak we see on the Oracle server, which slows down the pipeline software when the application has been running for some time. Since Oracle memory usage goes back down when the pipeline application is restarted, we believe the problem is probably in the long-lived, cached connections.
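The connection-aging idea can be illustrated with a toy pool that retires connections once they exceed a maximum age. This is not the Data-Handling-Common implementation; the class name, the `max_age` parameter and the injectable clock are all assumptions made for the sketch.

```python
import time
from collections import deque


class AgingPool:
    """Toy connection pool that retires connections older than `max_age` seconds."""

    def __init__(self, factory, max_age, clock=time.monotonic):
        self._factory = factory   # callable that opens a new connection
        self._max_age = max_age
        self._clock = clock       # injectable so the aging rule can be tested
        self._idle = deque()      # (created_at, connection) pairs

    def acquire(self):
        """Return (created_at, connection), skipping idle connections that aged out."""
        now = self._clock()
        while self._idle:
            created, conn = self._idle.popleft()
            if now - created < self._max_age:
                return created, conn
            self._close(conn)     # too old: drop it and try the next one
        return now, self._factory()

    def release(self, created, conn):
        """Return a connection to the pool unless it has aged out in the meantime."""
        if self._clock() - created < self._max_age:
            self._idle.append((created, conn))
        else:
            self._close(conn)

    @staticmethod
    def _close(conn):
        close = getattr(conn, "close", None)
        if callable(close):
            close()
```

Because old connections are closed rather than cached indefinitely, any server-side memory tied to a session is released periodically instead of only when the application restarts.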
Jira CCB Request: https://jira.slac.stanford.edu/browse/SSC-224