http://www-glast.stanford.edu/protected/mail/opsprob/

Friday April 18

| Area | Problem | Comments | Resolution |
|---|---|---|---|
| Monitoring | Problem loading alarm files : javax.servlet.jsp.el.ELException: An error occurred while evaluating function "dmapp:getAlarms" | ? | ? |
| Crawler | Slow crawl? No data files for stress test runs. | Crawler had been reporting all files as missing all day. All files were recrawled and then appeared to be OK. | ? |
| Monitoring | Monitoring data was being sent to DEV | Application was poked and then seemed to work | ? |
| L1 | Recon EOR timed out in 80418003.257755648.15606093 with /afs/slac/g/glast/isoc/flightOps/rhel3_gcc32/ISOC_PROD/bin/isoc: line 33: /afs/slac/g/glast/isoc/flightOps/volumes/vol5/isoc_rpm/rhel3_gcc32_install_20080227/etc/rpmenv.sh: Connection timed out. Are we overwhelming isoc/flightOps afs? | ? | ? |
| Monitoring | DQM stuck | ? | ? |
| Pipeline | Mail backlog | Processing of e-mail from L1Proc is slow. This seems to be caused by contention for locking the top level stream (a problem not seen in the much simpler MC task we used for testing). | ? |

Saturday April 19

Note: At 16:15 Dan installed a new version of the pipeline II stored procedures on DEV.

| Area | Problem | Comments | Resolution |
|---|---|---|---|
| Oracle | GLASTTREND space full | | The space was expanded |
| LSF | Only 100-200 jobs running, when 500+ in queue. | LSF reported 467 jobs in "RSV" status. Neal reports that this is a problem that they have seen before and are investigating with Platform. He requests we contact him if we see it again, but it has not reoccurred since 14:20 on Saturday. | ? |
| Xrootd | Xrootd slow | | |
| Pipeline | Some DEV jobs failing in strange way | A race condition was discovered where the mail message from the batch job could be received before the stream had been transitioned to "QUEUED" state. | Workaround installed in DEV (PII-319@JIRA) |
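
The race described in the Pipeline row can be illustrated with a small sketch. This is not the workaround recorded under PII-319, whose details are not given here; it only shows one common way to tolerate a status message that overtakes a state transition, namely holding the early message and replaying it once the transition has happened. The class name `StreamTracker`, the `SUBMITTING` and `RUNNING` states, and all method names are hypothetical; only the `QUEUED` state comes from the log.

```python
from collections import defaultdict


class StreamTracker:
    """Toy model: hold status mail that arrives before a stream is QUEUED."""

    def __init__(self):
        self.state = {}                    # stream id -> current state
        self.deferred = defaultdict(list)  # stream id -> mail held back

    def submit(self, stream_id):
        # Hypothetical pre-queue state: the batch job may already be
        # running (and sending mail) before we record the QUEUED state.
        self.state[stream_id] = "SUBMITTING"

    def mark_queued(self, stream_id):
        self.state[stream_id] = "QUEUED"
        # Replay, in arrival order, any mail that overtook the transition.
        for status in self.deferred.pop(stream_id, []):
            self._apply(stream_id, status)

    def on_mail(self, stream_id, status):
        """Return True if the mail was applied, False if it was deferred."""
        if self.state.get(stream_id) in (None, "SUBMITTING"):
            self.deferred[stream_id].append(status)
            return False
        self._apply(stream_id, status)
        return True

    def _apply(self, stream_id, status):
        self.state[stream_id] = status
```

In this sketch, a call such as `on_mail("s1", "RUNNING")` made between `submit("s1")` and `mark_queued("s1")` is held rather than rejected, and takes effect as soon as `mark_queued("s1")` runs.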

Sunday April 20

| Area | Problem | Comments | Resolution |
|---|---|---|---|
| Oracle | GLASTTREND space full again | ? | Ian added 32 GB of space and changed the critical threshold to 90% |
| Pipeline | 2 streams on DEV are waiting, even though all their PIs are finished | Dan is investigating; probably a result of the patch he put into DEV on Saturday | ? |

Monday April 21

| Area | Problem | Comments | Resolution |
|---|---|---|---|
| Monitoring | When trying to go to the DQM page (from Glast ground) it just hangs. | The application itself was running when accessed from glast-tomcat07, but probe was hung. tomcat07 was restarted. | ? |
