You can run a command immediately on any batch host with lsrun. Use this sparingly. uptime is a good, cheap check: it shows whether the host is responsive and what its load is. You can also use lsrun to check whether a particular filesystem is causing trouble, or whether some command works on that host.
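As a rough sketch of the uptime check, the Python below wraps lsrun so that a hung host does not hang the caller. It assumes lsrun is on the PATH and that -m selects the target host (check man lsrun on your cluster). The helper names, the host name batch001, and the 30 second timeout are illustrative only.

```python
import subprocess


def build_lsrun_command(host, command):
    """Return the argv list for running `command` on `host` via lsrun."""
    # Assumption: -m selects the execution host; verify with `man lsrun`.
    return ["lsrun", "-m", host] + list(command)


def check_host(host, timeout_seconds=30):
    """Run uptime on a batch host.

    Returns the uptime output, or None if lsrun is missing, the command
    fails, or the host does not answer within the timeout.
    """
    try:
        result = subprocess.run(
            build_lsrun_command(host, ["uptime"]),
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    # "batch001" is a hypothetical host name.
    print(check_host("batch001"))
```

The same wrapper can run other cheap probes, for example listing a directory on a suspect filesystem, by passing a different command list to build_lsrun_command.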

 

Failure to execute java script

(Need to update this title to exact phrase.  The system doesn't keep old messages and I forgot to write it down)

The symptom is a script that gets terminated, where viewing its messages shows the error message "failure when executing java...(Need to get exact text next time it happens)". This typically indicates a bad variable in the environment/database that the process is running in. The bad value is usually left behind by the upstream process having run in some strange manner and mangled the variable. In every case I have seen so far, the upstream process ran twice simultaneously on two different hosts, so the two processes were overwriting each other. This is fairly easy to identify: the output log contains all the execution output (except possibly the environment output) twice, and there are two different LSF summary blocks at the end of the log.

The solution is to simply roll back the upstream process so it gets a clean execution.
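The double-execution check described above (two LSF summary blocks in one log) could be automated roughly as follows. This is a minimal sketch: it assumes each LSF summary block contains a line with the text "Resource usage summary:", which should be confirmed against a real log from your cluster. The function names and the example log path are hypothetical.

```python
# Assumption: every LSF summary block contains this marker line.
SUMMARY_MARKER = "Resource usage summary:"


def count_lsf_summaries(log_text, marker=SUMMARY_MARKER):
    """Count the lines in the log that contain the summary marker."""
    return sum(1 for line in log_text.splitlines() if marker in line)


def ran_twice(log_text, marker=SUMMARY_MARKER):
    """Return True if the log holds two or more LSF summary blocks."""
    return count_lsf_summaries(log_text, marker) >= 2


if __name__ == "__main__":
    # Hypothetical path to the upstream process's output log.
    with open("upstream_process.log", errors="replace") as handle:
        if ran_twice(handle.read()):
            print("Upstream process appears to have run more than once.")
```

If the check reports a double run, roll back the upstream process as described above.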