Q: Where are the source programs for PingERLOD?

A: You will find them in:

/afs/slac/package/pinger/lod/PingERLOD/src

The trscrontab is under pinger@pinger.slac.stanford.edu. Its entries call shell scripts that in turn call the programs.

The log files are at:

/afs/slac/package/pinger/lod/PingERLOD/data/log/

Q: How does one add information to the "Click for Help" button?

A: Edit /afs/slac.stanford.edu/package/pinger/lod/PingERLODServer/js/schools_results.js and change the function getHelpMessage.

Q: How does one add information to what is displayed in the window when one clicks on a node?

A: Edit /afs/slac.stanford.edu/package/pinger/lod/PingERLODServer/js/schools_results.js and change the function attachMessage.

Q: I got an error message from LSF for /afs/slac.stanford.edu/package/pinger/lod/PingERLOD/cmd/linux/schools.sh. The message said:

7/8/2013 11:22:15 -- PID=142350036 -- ERROR: Tried to start SchoolInstantiator but one or more of the EndPoints needed are taking too long to answer. Check log files for more details.

A: This means that the web service is unable to reach the remote data sets, which include DBpedia, GeoNames and FactForge. The job runs once a day, so the next run may well succeed. Continued failures need investigating. The log files are at:

 /afs/slac/package/pinger/lod/PingERLOD/data/log/

Look at the most recent file of the form logn.txt. Open the file in vim, search for the PID (e.g. 142350036) and look at the error messages, e.g.

7/8/2013 11:6:30 -- PID=142350036 -- DBPedia does not seem to be up. -- HttpException: -404 Failed to connect to remote server
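As an alternative to searching by hand in vim, the PID lookup described above can be sketched in Python. This is a minimal sketch that assumes every log line carries the PID in the form `-- PID=<number> --`, as in the two messages quoted above. The function name `lines_for_pid` and the third sample line are hypothetical.

```python
import re


def lines_for_pid(log_lines, pid):
    """Return the log lines whose PID field matches the given PID."""
    # Assumed line layout: "<date> <time> -- PID=<pid> -- <message>"
    pattern = re.compile(r"-- PID=" + re.escape(str(pid)) + r" --")
    return [line.rstrip("\n") for line in log_lines if pattern.search(line)]


sample = [
    "7/8/2013 11:6:30 -- PID=142350036 -- DBPedia does not seem to be up. "
    "-- HttpException: -404 Failed to connect to remote server",
    "7/8/2013 11:22:15 -- PID=142350036 -- ERROR: Tried to start "
    "SchoolInstantiator but one or more of the EndPoints needed are taking "
    "too long to answer. Check log files for more details.",
    # Hypothetical line from a different run, to show that it is filtered out.
    "7/8/2013 11:30:00 -- PID=999 -- hypothetical message from another run",
]

for line in lines_for_pid(sample, 142350036):
    print(line)
```

To use it on a real log, pass an open file instead of the sample list, for example `lines_for_pid(open(path), pid)` where `path` points at the most recent log file in the directory above.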

Q: I get an error message from the following batch trscron job:

lnxcron 00 11 *  * * /usr/local/bin/bsub -q medium /afs/slac.stanford.edu/package/pinger/lod/PingERLOD/cmd/linux/schools.sh # <100mins on dole, non interfering

The error message sent by email is:

ERROR:  The output file name you specified

relative to the submit-time current working directory of
     /u/sf/pinger/
in the bsub -o option is incorrect or not useable for output.
This job was running on hequ0098 and attempted to write the data to that location but failed.

To prevent losing the output, an attempt has been made to temporarily store it in
     /nfs/farm/knackery/pinger.job.123084.output.

Failure to write to the specified output file location would typically be caused by the following kinds of things:
  o The output file location is out of space or over the user's quota;
  o The user does not have write privileges to the file;
  o The directory specified for the file does not exist;
  o The output file specification did not specify a full path and the
    current working directory at the time of job submission was not
    what was intended.
  o Some system failure occured such that the output file could not
    be opened or mounted.

The file shown above should be readable from any interactive server.
Please erase it after you have looked at it.
The file will be erased automatically in 72 hours.

A: There are several possibilities: a file server problem (e.g. one of the Surrey NFS file servers), an expired token, a full pinger disk space quota, or a full LSF spool. First check the pinger file space quota utilization (fs lq), then ensure that trscrontab has a long enough token time. If both are OK, notify unix-admin with all the information.

The contents of the file mentioned in the diagnostic above are:

Job was executed on host(s) <hequ0098>, in queue <short>, as user <pinger> in cluster <slac>.
</u/sf/pinger> was used as the home directory.
</u/sf/pinger> was used as the working directory.
Started at Wed Aug 14 08:00:09 2013
Results reported at Wed Aug 14 08:00:46 2013

Your job looked like:

------------------------------------------------------------
# LSBATCH: User input
------------------------------------------------------------

Successfully completed.

Resource usage summary:

    CPU time :               1.57 sec.
    Max Memory :             46 MB
    Average Memory :         24.50 MB
    Total Requested Memory : -
    Delta Memory :           -
    (Delta: the difference between total requested memory and actual max usage.)
    Max Swap :               1370 MB

    Max Processes :          4
    Max Threads :            22

PS:

Unable to read output data from the stdout buffer file </nfs/farm/lsb_spool/1376492402.123084.out>: your job was probably aborted prematurely.

Since the job ran for only 37 seconds (08:00:09 to 08:00:46), it does not appear to be a token problem.

I do not think it is a space problem or a privileges problem:

118cottrell@hequ0098:~pinger$fs la
Access list for . is
Normal rights:
  system:slac rl
  system:administrators rlidwka
  system:authuser rl
  pinger rlidwka
119cottrell@hequ0098:~pinger$fs lq
Volume Name                    Quota       Used %Used   Partition
u.pinger                     2000000     629541   31%         11%

123cottrell@hequ0098:~pinger$ls -ld /nfs/farm/lsb_spool/
drwxrwxrwt 4 lsf lsf 21233664 Aug 14 12:06 /nfs/farm/lsb_spool//
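The quota check above can also be scripted. The following is a minimal sketch that parses the text printed by `fs lq` and reports how full each volume is. It assumes the five-column layout shown above and that the Quota and Used columns are in kilobytes. The function names `parse_fs_lq` and `volumes_over` and the 90% threshold are hypothetical choices.

```python
def parse_fs_lq(output):
    """Parse `fs lq` text into a list of per-volume dictionaries."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows: volume name, quota, used, %used, partition %used.
        # The header row splits into six words, so it is skipped here.
        if len(fields) != 5 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        quota, used = int(fields[1]), int(fields[2])
        if quota == 0:
            continue  # avoid dividing by zero on an unlimited quota
        rows.append({
            "volume": fields[0],
            "quota_kb": quota,   # assumed to be kilobytes
            "used_kb": used,
            "percent_used": 100.0 * used / quota,
        })
    return rows


def volumes_over(rows, threshold_percent=90.0):
    """Return the names of volumes at or above the given usage threshold."""
    return [r["volume"] for r in rows if r["percent_used"] >= threshold_percent]


sample = """Volume Name                    Quota       Used %Used   Partition
u.pinger                     2000000     629541   31%         11%
"""

rows = parse_fs_lq(sample)
print(rows[0]["volume"], round(rows[0]["percent_used"]), volumes_over(rows))
```

An empty list from `volumes_over` suggests that quota is not the cause, which matches the conclusion drawn above for u.pinger.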

Q: Below is another one from a later time, and it is not a token issue.
The trscron job runs on pinger@pinger.slac.stanford.edu. The command is:

lnxcron;30 00 13 *  * * /usr/local/bin/bsub -q short /afs/slac.stanford.edu/package/pinger/lod/PingERLOD/cmd/linux/nodes.sh #  <12 mins non interfering

The email giving the error is below:

ERROR:  The output file name you specified

relative to the submit-time current working directory of
/u/sf/pinger/
in the bsub -o option is incorrect or not useable for output.
This job was running on kiso0054 and attempted to write the data to that location but failed.

To prevent losing the output, an attempt has been made to temporarily store it in
/nfs/farm/knackery/pinger.job.173446.output.

Failure to write to the specified output file location would typically be caused by the following kinds of things:
o The output file location is out of space or over the user's quota;
o The user does not have write privileges to the file;
o The directory specified for the file does not exist;
o The output file specification did not specify a full path and the
current working directory at the time of job submission was not
what was intended.
o Some system failure occured such that the output file could not
be opened or mounted.

The file shown above should be readable from any interactive server.
Please erase it after you have looked at it.
The file will be erased automatically in 72 hours.
------------------------------------------------------------------------------

A: I am researching this one.
Here are the contents of /nfs/farm/knackery/pinger.job.173446.output:

Job </afs/slac.stanford.edu/package/pinger/lod/PingERLOD/cmd/linux/nodes.sh> was submitted from host <lnxcron> by user <pinger> in cluster <slac>.
Job was executed on host(s) <kiso0054>, in queue <short>, as user <pinger> in cluster <slac>.
</u/sf/pinger> was used as the home directory.
</u/sf/pinger> was used as the working directory.
Started at Wed Aug 14 13:00:08 2013
Results reported at Wed Aug 14 13:07:07 2013
Cannot open your job file: /u/sf/pinger/.lsbatch/1376510402.173446
Successfully completed.

Resource usage summary:

CPU time :               9.39 sec.
Max Memory :             56 MB
Average Memory :         45.92 MB
Total Requested Memory : -
Delta Memory :           -
(Delta: the difference between total requested memory and actual max usage.)
Max Swap :               1382 MB

Max Processes :          4
Max Threads :            34


PS:

Unable to read output data from the stdout buffer file </u/sf/pinger/.lsbatch/1376510402.173446.out> your job was probably aborted prematurely.

I do not think it is a space or privilege problem:

128cottrell@kiso0054:~/pinger$fs la
Access list for . is
Normal rights:
system:slac rl
system:administrators rlidwka
system:authuser rl
cottrell rlidwka
129cottrell@kiso0054:~/pinger$fs lq
Volume Name                    Quota       Used %Used   Partition
u.cottrell                   1000000     894217   89%         11%
130cottrell@kiso0054:~/pinger$ls -ld /nfs/farm/lsb_spool
lrwxrwxrwx 1 root root 31 Aug 12 09:14 /nfs/farm/lsb_spool -> /a/surrey04a/vol/vol1/lsb_spool/
 

The job only ran for 7 mins and the default token for trscrontab is 15 mins. So it does not seem to be a token problem.
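The run-time comparison above can be checked mechanically. Here is a minimal sketch that reads the "Started at" and "Results reported at" lines from an LSF job report and compares the elapsed time with the token lifetime. The 15 minute default is the trscrontab figure quoted above. The function names are hypothetical, and the timestamp parsing assumes an English locale.

```python
import re
from datetime import datetime

# Timestamp layout used in the LSF report, e.g. "Wed Aug 14 13:00:08 2013".
STAMP = "%a %b %d %H:%M:%S %Y"


def job_runtime_seconds(report):
    """Return elapsed seconds between start and report time, or None."""
    started = re.search(r"^Started at (.+)$", report, re.MULTILINE)
    finished = re.search(r"^Results reported at (.+)$", report, re.MULTILINE)
    if not (started and finished):
        return None
    t0 = datetime.strptime(started.group(1).strip(), STAMP)
    t1 = datetime.strptime(finished.group(1).strip(), STAMP)
    return (t1 - t0).total_seconds()


def fits_in_token(report, token_minutes=15):
    """True if the job finished within the assumed token lifetime."""
    runtime = job_runtime_seconds(report)
    return runtime is not None and runtime < token_minutes * 60


report = """Started at Wed Aug 14 13:00:08 2013
Results reported at Wed Aug 14 13:07:07 2013
"""

print(job_runtime_seconds(report), fits_in_token(report))
```

A job that finishes well inside the token lifetime, as this one does, points away from an expired token and toward the file server or spool.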

Q: When I run a SPARQL query at pingerlod.slac.stanford.edu/sparql, I get a "Socket Close" error. What is it?
A: You are probably running more than one query at the same time and the server has become too busy. Wait a little and try again.
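For scripted queries, the "wait a little and try again" advice can be sketched as a small retry helper. This is only an illustration: `run_query` stands for whatever hypothetical function sends your query, and the sketch assumes that a closed socket surfaces as an `OSError` (which covers Python's connection errors). The attempt count and wait time are arbitrary.

```python
import time


def run_with_retry(run_query, attempts=3, wait_seconds=30.0, sleep=time.sleep):
    """Call run_query(), waiting longer after each failed attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return run_query()
        except OSError as err:  # assumed error type for a closed socket
            last_error = err
            if attempt < attempts - 1:
                # Linear back-off: wait 1x, then 2x, ... the base delay.
                sleep(wait_seconds * (attempt + 1))
    raise last_error
```

Running queries one at a time through such a helper also avoids the concurrent load that the answer above identifies as the likely cause.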
