We would like to upgrade the xrootd server version for the Fermi xrootd cluster from version
20080828-1632 to 20090202-1402. The main changes between these two versions are:
Because of this, the crawler currently does not use the production xrootd version but the test xrootd, which runs the new version.
The first time an xrootd client connects to a cluster, it tries FirstConnectMaxCnt times to connect before it fails. The default for this number is 150, but for xrd.pl it is overridden and set to 10. With a wait of 20 s between connection attempts, xrd.pl therefore fails after about 3.3 min, whereas with the default setting a client fails only after 50 min. This is important because during an outage, which typically lasts 5-30 min, we stop the redirector to prevent clients from being redirected, and with the short wait time xrd.pl might fail.
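The wait times quoted above follow from multiplying the number of attempts by the wait between attempts. A minimal sketch of that arithmetic, assuming the time spent in each attempt itself is negligible; the function and variable names are illustrative and not part of xrootd or xrd.pl:

```python
def time_until_failure_s(max_attempts: int, wait_s: float = 20.0) -> float:
    """Approximate seconds before a client gives up on its first connection.

    Assumes each attempt costs only the wait between attempts (20 s here).
    """
    return max_attempts * wait_s


xrd_pl_s = time_until_failure_s(10)    # xrd.pl setting of FirstConnectMaxCnt
default_s = time_until_failure_s(150)  # default FirstConnectMaxCnt

longest_outage_s = 30 * 60             # upper end of a typical outage

print(f"xrd.pl:  {xrd_pl_s / 60:.1f} min, survives 30 min outage: {xrd_pl_s >= longest_outage_s}")
print(f"default: {default_s / 60:.1f} min, survives 30 min outage: {default_s >= longest_outage_s}")
```

With 10 attempts the client gives up well before even a 5 min outage ends, while the default of 150 attempts covers the full 5-30 min range.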
As with every xrootd version, basic tests were done: reading from and writing to xrootd, and testing the client admin interface (rm, stat, checksum, ...).
The new version has been installed as a test version on the Fermi xrootd cluster, which allows access to the glast data. The production crawler has been using this version for more than a month.
Skimmer jobs were also run successfully against this version.
The fix to the xrd.pl timeout has been tested, and it was verified that xrd.pl waits the expected time if an xrootd server is not available.
To switch the servers back to the old version, the xrootd configuration has to be reverted to the old one and the servers then restarted with the old version.
The client version is rolled back by recreating the link to the old version.
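As an illustration of the client rollback, the sketch below repoints a link from the new version to the old one. It assumes the link is a symbolic link on a POSIX file system; the directory names and the link name `current` are hypothetical and are not taken from the actual installation.

```python
import os
import tempfile


def repoint_link(link_path: str, target: str) -> None:
    """Recreate link_path so that it points at target.

    The new link is created under a temporary name and then renamed over
    the old one, so clients never see a missing link (POSIX rename).
    """
    tmp = link_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link_path)


# Demonstration in a scratch directory; all paths here are hypothetical.
with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "xrootd-20080828-1632")
    new = os.path.join(root, "xrootd-20090202-1402")
    os.mkdir(old)
    os.mkdir(new)

    link = os.path.join(root, "current")
    os.symlink(new, link)          # state after the upgrade
    repoint_link(link, old)        # rollback to the old version
    rolled_back_target = os.readlink(link)

print(os.path.basename(rolled_back_target))
```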
requires to change in StartXrd.cf.glast the name to:
The restart should take less than five minutes. Stopping the redirectors first prevents clients from being redirected and avoids the chance that a file is not found because a data server is being restarted. The clients will wait during the restart and then reconnect to the data servers and redirectors.