Blog from October, 2010

Request to deploy L1Proc 2.0

Reason for change

Move L1 buffers from AFS to xroot.

Test Procedure

We have processed runs in the DEV pipeline with this version of L1Proc.

Rollback procedure

We can easily switch back to the previous version of L1Proc.

CCB Jira

SSC-267@JIRA

Details

L1Pipeline: L1Pipeline v2r0

  • Buffer chunk files on xroot instead of AFS. One of the 7 servers is still used for handoff from the HalfPipe.
  • doInc and doIncLci subtasks are eliminated. The forceL1Merge task is broken and more likely to be removed than fixed. The functions of all of these can be achieved by getting ACQSUMMARY to say the right thing and rolling back the last checkRun.

Complete set of tags for L1Proc 2.0

GlastRelease (sim/recon): GlastRelease-v15r47p12gr13

  • Disable autosave in digitization

ScienceTools (Level 2): v9r16p1

svac/L1Pipeline: L1Pipeline v2r0*

calibTkrUtil v2r9p1
calibGenTKR v4r5

dataMonitoring/FastMonCfg: FastMonCfg-02-01-01
dataMonitoring/DigiReconCalMeritCfg: DigiReconCalMeritCfg-01-04-06

dataMonitoring/Common: Common-06-08-00
dataMonitoring/FastMon: FastMon-05-02-00
dataMonitoring/IGRF: IGRF-02-01-00

svac/Monitor: Monitor-01-06-01
svac/EngineeringModelRoot: v4r4
svac/TestReport: TestReport-11-00-00

users/richard/pipelineDatasets: v0r6

ft2Util: v1r2p31

evtClassDefs v0r14p0

GPLtools: GPLtools-01-15-01-fo06

Reasons for Change

We need to run Spread in the MSR so that real-time telemetry monitoring tools will work there.

Test Procedure

  • Run all the real-time tools on isoc-ops7, for example.

Rollback Procedure

  • Restore the old Spread configuration that includes only glastlnx06,11.
  • Take down all Spread daemons in the MSR.
  • Bounce the Spread daemons on glastlnx06,11 so that they see the restored configuration.

CCB Jira

ssc-266@JIRA

Changes:

  • First complete ssc-265@JIRA.
  • Change the central Spread configuration file to include the MSR machines (and glastlnx24).
  • Bring down the Spread daemons on glastlnx06,11.
  • Bring up Spread on glastlnx06,11,24 and the MSR machines.

Reasons for Change

The rules currently in place are out of date. In particular, they restrict Spread use in the MSR to machines that no longer exist, which has caused Spread to eventually lock up whenever its configuration says those machines should be accessible. It would also be more robust to express the rules in terms of subnets rather than individual machines.
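As a rough illustration of why subnet-based rules are more robust, the sketch below matches hosts against a subnet and builds one iptables ACCEPT rule per subnet. The subnet (taken from the RFC 5737 documentation range), the port number, and the function names are all assumptions made for the example; the real MSR subnets and the actual rule set are not given here.

```python
import ipaddress

# Hypothetical subnet from the RFC 5737 documentation range;
# the real MSR subnets are not stated in this post.
ALLOWED_SUBNETS = [ipaddress.ip_network("192.0.2.0/24")]


def is_allowed(address, subnets=ALLOWED_SUBNETS):
    """Return True if the address falls inside any allowed subnet."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in subnets)


def subnet_rules(subnets, port, chain="INPUT"):
    """Build one iptables ACCEPT rule string per subnet for a TCP port."""
    return [
        f"iptables -A {chain} -p tcp -s {net} --dport {port} -j ACCEPT"
        for net in subnets
    ]


if __name__ == "__main__":
    # 4803 is used purely as an example port (an assumption).
    for rule in subnet_rules(ALLOWED_SUBNETS, 4803):
        print(rule)
```

With rules of this shape, replacing or retiring a machine inside the subnet needs no rule change, which is the robustness argument made above.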

Test Procedure

  • I've installed the new set of rules on my desktop machine; it doesn't crash and remains accessible via SSH.
  • Log in to glastlnx11 and glastlnx06 via SSH from outside the SLAC internal net, e.g., from the visitor's net.
  • Ask the MOC and GSSC to send us some trial FASTCopy packages.
  • Run the Telemetry Trending web app with the data source set to XML-RPC.
  • Run the Telemetry Monitor and the Telemetry Table web apps for PROD.
  • Use telnet to try to connect to the FASTCopy, web Trending and web telemetry services from outside of SLAC; these should all be forbidden.
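The last check in the list above could also be scripted instead of done by hand with telnet. The following is a minimal sketch, assuming a plain TCP connection attempt is an adequate stand-in for the telnet test; the host name and port numbers in the usage example are placeholders, not the real service endpoints.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_forbidden(host, ports, timeout=3.0):
    """Return the ports that are unexpectedly reachable."""
    return [p for p in ports if port_is_reachable(host, p, timeout)]


if __name__ == "__main__":
    # Placeholder host and ports (assumptions); run this from outside SLAC.
    leaks = check_forbidden("host.example.org", [8080, 8443])
    print("unexpectedly open ports:", leaks)
```

An empty result from outside SLAC is what the test expects; any port listed means the new rules are letting through traffic they should block.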

Rollback Procedure

At first we can install the new rules temporarily, so that a reboot will reinstate the old rules. In the worst case the machine will drop off the network, and the reboot will have to be done from the local console. The script that installs the new rules saves the old ones in a restorable format, so in milder cases of malfunction the old rules can be reinstated from an interactive terminal session. Eventually we will make the new rules permanent so that they survive reboots. Note that you need superuser privileges to modify iptables rules.

CCB Jira

ssc-265@JIRA

Changes:

  • Update ISOC.MocTicker.GenFilter, which generates a bash script that
    installs the new rules.
  • Run the bash script on the machine whose rules we wish to change.
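To make the generate-then-run flow concrete, here is a minimal sketch of a generator that emits a bash script which first saves the current rules with iptables-save and then applies the new ones. It is not the actual interface of ISOC.MocTicker.GenFilter; the function name, the backup path, and the sample rule are all hypothetical.

```python
def render_install_script(rules, backup_path="/tmp/iptables.backup"):
    """Return a bash script that saves the current rules, then applies new ones.

    `rules` is a list of complete iptables command lines. The backup path
    is a hypothetical default chosen for this example.
    """
    lines = [
        "#!/bin/bash",
        "set -e",
        "# Save the current rules so they can be reinstated with:",
        f"#   iptables-restore < {backup_path}",
        f"iptables-save > {backup_path}",
    ]
    lines.extend(rules)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample rule using a documentation-range subnet (an assumption).
    sample = ["iptables -A INPUT -p tcp -s 192.0.2.0/24 --dport 4803 -j ACCEPT"]
    print(render_install_script(sample), end="")
```

Rules applied this way change only the running tables, so a reboot brings back whatever the system loads at startup. The generated script has to be run with superuser privileges.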