News from May, 2008

  2008/05/14
Request to install a new Xrootd production version
Last changed: May 16, 2008 13:29 by Wilko Kroeger
Labels: sassoccb

First draft

Reason for change

The current production version is 20071101-0808p2. We would like to install the new production candidate version 20080513-1222.

The new xrootd version makes it possible to restrict access to the GLAST data, which is a requirement. The current production version does not support access control. The new version also includes various improvements and fixes that make the system more reliable:

  • access control for the data, which is required.
  • a fix for checksum problems that could crash the server.
  • better handling of redirection for existing files.
  • other small bug fixes.

Testing

The production candidate version 20080513-1222 and its predecessor version 20080410-0747 have been deployed on the glast xrootd cluster, running in parallel with the production xrootd daemons. This allows the new version to be used on the glast xrootd cluster without interfering with the production version.

An update from 20080410-0747 to 20080513-1222 was needed because a critical bug was found (https://jira.slac.stanford.edu/browse/GXR-33) that caused the server to crash when re-reading the updated authorization file. Otherwise there are only small bug fixes between 20080410-0747 and 20080513-1222, none of which affect GLAST.
The test setup on the glast xrootd cluster used the configuration that would be used in production; in particular, authorization was turned on. Only glast users are allowed to read data, and certain production accounts are allowed to write to xrootd.
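
The access rule described above can be illustrated with a minimal Python sketch. It is not the xrootd authorization file format and not the actual configuration: the account names are hypothetical, and it assumes that the production accounts with write permission are themselves glast users.

```python
# Hypothetical account names, for illustration only.
GLAST_USERS = {"alice", "bob", "glastprod"}   # assumed: accounts allowed to read
WRITE_ACCOUNTS = {"glastprod"}                # assumed: production accounts allowed to write


def is_allowed(user, operation):
    """Return True if `user` may perform `operation` ("read" or "write")."""
    if operation == "read":
        return user in GLAST_USERS
    if operation == "write":
        # Assumption: a writer must also be a glast user.
        return user in GLAST_USERS and user in WRITE_ACCOUNTS
    # Any operation not listed is denied.
    return False


if __name__ == "__main__":
    for user, op in [("alice", "read"), ("alice", "write"), ("glastprod", "write")]:
        print(user, op, is_allowed(user, op))
```

Denying anything not explicitly listed keeps the sketch close to the stated intent that access is restricted by default.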

The basic functionality of the new xrootd version has been examined in a test cluster. The testing included reading/writing with xrdcp, stating files and checksumming of files.

All other tests were run against version 20080410-0747 and repeated for production candidate version 20080513-1222 on the glast xrootd cluster.

The main tests were:

  • running the crawler
  • MC-simulation
  • skimmer jobs
  • reading and writing with xrdcp
  • reading from xrootd with root version 5.18.00(b)

All tests passed.

The tests that required the xrootd client tools (xrdcp to transfer files; xrd.pl to stat, checksum and remove files) used the tools from the new version.

Rollback

It is possible to re-activate the old xrootd version; no data will be lost.

CCB Jira

SSC-43@JIRA

Details

Deploying the new xrootd version requires restarting xrootd on all glast xrootd servers, including the redirectors (glast-rdr). In addition, the client admin tools in /afs/slac/g/glast/applications/xrootd/PROD have to be updated, as the current version does not support authorization.

The steps are:

  1. Install new client admin tools:
    mv /afs/slac/g/glast/applications/xrootd/TEST to /afs/slac/g/glast/applications/xrootd/PROD
    The new version works with the old xrootd servers and this move doesn't cause any disruption to clients.
  2. Stop xrootd on all data servers
  3. Stop xrootd on the redirectors (glastlnx04/05)
  4. Activate the new version on the redirectors and restart them.
  5. Activate the new version on the data servers and restart them.

The whole procedure should take less than five minutes; clients will retry their access to xrootd during that time and will not fail. Users will be informed when the restart will happen so that they can reduce the load on xrootd, just as a precaution.
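
The ordering of the steps above can be written down as a small Python sketch. It only builds the ordered list of actions and does not run anything; the action labels and the data server names are hypothetical, while the redirector names and the PROD path are taken from the text.

```python
def build_rollout_plan(data_servers, redirectors):
    """Return the (action, target) steps in the order described in the request."""
    # Step 1: the new client admin tools go in first; they work with the old servers.
    plan = [("install-client-tools", "/afs/slac/g/glast/applications/xrootd/PROD")]
    # Steps 2 and 3: stop the data servers, then the redirectors.
    plan += [("stop", host) for host in data_servers]
    plan += [("stop", host) for host in redirectors]
    # Steps 4 and 5: bring up the redirectors first, then the data servers.
    plan += [("start-new-version", host) for host in redirectors]
    plan += [("start-new-version", host) for host in data_servers]
    return plan


if __name__ == "__main__":
    # "data01" and "data02" are placeholder data server names.
    for action, target in build_rollout_plan(["data01", "data02"],
                                             ["glastlnx04", "glastlnx05"]):
        print(action, target)
```

Keeping the plan as data makes it easy to review the order before anything is restarted.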

Posted at 14 May @ 9:13 AM by Wilko Kroeger | 0 Comments
  2008/05/15
CCB Request for GroupManager 1.8
Last changed: May 15, 2008 23:58 by Tony Johnson
Labels: sassoccb

Reason for change

The group manager has been updated with very minor changes to support better interaction with the Shift Schedule application. Since the Shift Schedule app is to be announced tomorrow, these changes have already been released, although they can be trivially backed out if need be.

Test Procedure

These changes have all been tested on the dev server: http://glast-tomcat03.slac.stanford.edu:8080/GroupManager

Rollback procedure

It will be easy to roll back these changes should any problem occur.

Related JIRA

SSC-45@JIRA

Details

Posted at 15 May @ 11:56 PM by Tony Johnson | 0 Comments
  2008/05/17
CCB Request to release Job Control 1.7
Last changed: May 17, 2008 16:53 by Tony Johnson

Reason for change

The Job Control daemon is used by Pipeline II to submit jobs to LSF and BQS (at Lyon). This minor change adds a management interface to support monitoring the daemons, and adds improved error reporting when an error occurs during job submission. Neither of these changes will have any effect on operation, but both will make tracking down problems easier.

Test Procedure

These changes have been tested on a DEV version of the server.

Rollback procedure

It will be easy to roll back these changes should any problem occur.

Related JIRA

SSC-46@JIRA

Details

Posted at 17 May @ 4:50 PM by Tony Johnson | 0 Comments
  2008/05/27
CCB Request to release new Pipeline II, Data Catalog
Last changed: May 28, 2008 13:12 by Tony Johnson
Labels: sassoccb

Reason for change

This new release incorporates changes required to support multiple "versions" of files as required by Level 1 processing. Currently L1 uses an arbitrary naming scheme which is not understood by the data catalog or tools like the data skimmer. This results in multiple copies of the same event being skimmed. This release fixes this problem, and also simplifies the bookkeeping required by L1Proc.

The database will be converted to the new structure. Based on the time taken to convert the DEV database, and the number of datasets in PROD, we estimate that it will take about 8 hours to do the conversion on PROD. Database views have been created so that tools which care only about the most recent version of a file, and which do not update the database, will not see any change.
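
The idea of a view that exposes only the most recent version of each file can be sketched as follows. This is an illustration using Python's built-in sqlite3 module, not the actual data catalog schema: the table, view, column names and file paths are all hypothetical.

```python
import sqlite3


def build_demo_catalog():
    """Create an in-memory catalog with a versioned table and a 'latest' view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical schema: one row per (dataset, version).
        CREATE TABLE dataset_version (
            dataset_name TEXT    NOT NULL,
            version      INTEGER NOT NULL,
            path         TEXT    NOT NULL,
            PRIMARY KEY (dataset_name, version)
        );
        -- Read-only tools query this view and see one row per dataset.
        CREATE VIEW latest_dataset AS
            SELECT dv.dataset_name, dv.version, dv.path
            FROM dataset_version AS dv
            WHERE dv.version = (
                SELECT MAX(v2.version)
                FROM dataset_version AS v2
                WHERE v2.dataset_name = dv.dataset_name
            );
    """)
    conn.executemany(
        "INSERT INTO dataset_version VALUES (?, ?, ?)",
        [
            ("run0001", 0, "/demo/run0001_v0.fit"),
            ("run0001", 1, "/demo/run0001_v1.fit"),
            ("run0002", 0, "/demo/run0002_v0.fit"),
        ],
    )
    conn.commit()
    return conn


def latest_paths(conn):
    """Map each dataset name to the path of its most recent version."""
    rows = conn.execute(
        "SELECT dataset_name, path FROM latest_dataset ORDER BY dataset_name"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    print(latest_paths(build_demo_catalog()))
```

A tool that reads through such a view never sees the older versions, which is how duplicate copies of the same event could be kept out of a skim.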

Test Procedure

These changes have been tested on a DEV version of the server. We have tested ASP, MC and L1 tasks using the new release.

Rollback procedure

This update requires changes to the data catalog database. They can be rolled back if a problem is found immediately, but rolling back after a substantial number of datasets have been registered in the new schema will be difficult.

Related JIRA

SSC-51@JIRA

Details

DataCatalog 2.0

Pipeline 1.2

This release also incorporates the performance improvements made in release 1.1.1 and minor changes to the job control daemon documented separately.

Data Crawler 1.3

Posted at 27 May @ 4:04 PM by Tony Johnson | 0 Comments
  2008/05/30
CCB Request to release Pipeline II v1.2.1, Data Catalog v2.0.1
Last changed: May 30, 2008 20:20 by Daniel Flath
Labels: sassoccb

Reason for Change:

Database connections are being left open when creating DataCatClient objects. The open connections are exhausting the supply of sockets on the server machine.
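
The general shape of this kind of leak, and of its fix, can be shown with a short Python analogue. This is not the DataCatClient code: `CatalogClient` is a hypothetical stand-in that owns one sqlite3 connection, and the class-wide counter exists only to make open connections visible.

```python
import sqlite3
from contextlib import closing


class CatalogClient:
    """Hypothetical client that opens one database connection per instance."""

    open_connections = 0  # demonstration-only counter of unclosed connections

    def __init__(self, dsn=":memory:"):
        self._conn = sqlite3.connect(dsn)
        CatalogClient.open_connections += 1

    def ping(self):
        """Run a trivial query to show the connection is usable."""
        return self._conn.execute("SELECT 1").fetchone()[0]

    def close(self):
        """Release the connection; safe to call more than once."""
        if self._conn is not None:
            self._conn.close()
            self._conn = None
            CatalogClient.open_connections -= 1


def run_queries(n):
    """Create n clients, closing each one even if the query raises."""
    for _ in range(n):
        with closing(CatalogClient()) as client:
            client.ping()
    return CatalogClient.open_connections


if __name__ == "__main__":
    print("connections still open:", run_queries(50))
```

Wrapping each client in `closing` ties the lifetime of the connection to the block that uses it, so repeated client creation no longer accumulates open sockets.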

Test Procedure:

Install to TEST and DEV, then have Maria Elana test L1Proc there to confirm that the connections are being closed correctly.

Rollback Procedure:

Just restart the server using org-glast-pipeline-server-1.2.jar

Related JIRA

SSC-57@JIRA

Details:

Posted at 30 May @ 8:06 PM by Daniel Flath | 0 Comments
