...

  • To review reflector.log-disabled on pinger@slac.stanford.edu you may want to use

    Code Block
    >grep transmitted /tmp/reflector.log-disabled
    Landmark(2)=http://206.117.37.4:3355, Client=134.79.104.80, target=134.79.18.188,\ 
    ability=0, 10 packets transmitted, 10 received, 0% packet loss, rtt min/avg/max = 9.485/9.6527/10.007<br>
    Landmark(2)=http://pinger.cern.ch/cgi-bin/traceroute.pl?target=134.79.18.188&function=ping, Client=134.79.104.80, ability=0,\
     5 packets transmitted, 5 received, 0% packet loss, rtt min/avg/max = 171.147/171.627/172.808<br>
    Landmark(2)=http://204.178.4.164:3355, Client=134.79.104.80, target=134.79.18.188,  ability=0, 10 packets transmitted,\
     10 received, 0% packet loss, rtt min/avg/max = 86.906/106.1068/154.922<br>
    
  • To review reflector.log-enabled on pinger@slac.stanford.edu you may want to use

    Code Block
    >grep failed /tmp/reflector.log-enabled
    Landmark(1)=http://192.42.83.252:3355, Client=134.79.104.80, target=134.79.18.188,  ability=1,\
     failed to connect response code 200 <br>
    Landmark(1)=http://138.238.250.157:3355, Client=134.79.104.80, target=134.79.18.188,  ability=1,\
     failed to connect response code 200 <br>
    
    >grep transmitted /tmp/reflector.log-enabled
    Landmark(2)=http://pinger-ncp.ncp.edu.pk/cgi-bin/traceroute.pl?target=134.79.18.188&function=ping,\
     Client=134.79.104.80,  ability=1, 5 packets transmitted, 5 received, 0% packet loss,\
     rtt min/avg/max = 316.461/316.685/316.896<br>
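
The grep output above can also be turned into structured records for further analysis. The following is a minimal Python sketch, not part of the Tulip tools: it assumes each log record sits on a single line (the trailing backslashes in the listings are taken to be display wrapping), and the function and field names are hypothetical.

```python
import re

# Landmark URL: everything after "Landmark(n)=" up to the next comma or space.
LANDMARK_RE = re.compile(r"Landmark\(\d+\)=(?P<url>[^,\s]+)")

# Ping summary as it appears in the "transmitted" records shown above.
SUMMARY_RE = re.compile(
    r"ability=(?P<ability>[01]),\s*"
    r"(?P<sent>\d+) packets transmitted,\s*"
    r"(?P<received>\d+) received,\s*"
    r"(?P<loss>\d+(?:\.\d+)?)% packet loss,\s*"
    r"rtt min/avg/max = (?P<rtt_min>[\d.]+)/(?P<rtt_avg>[\d.]+)/(?P<rtt_max>[\d.]+)"
)


def parse_line(line):
    """Return a dict of fields for one record, or None if it has no ping summary."""
    landmark = LANDMARK_RE.search(line)
    summary = SUMMARY_RE.search(line)
    if not (landmark and summary):
        return None
    return {
        "landmark": landmark.group("url"),
        "ability": int(summary.group("ability")),
        "sent": int(summary.group("sent")),
        "received": int(summary.group("received")),
        "loss_pct": float(summary.group("loss")),
        "rtt_avg": float(summary.group("rtt_avg")),
    }


# One record from the listing above, joined onto a single line.
sample = (
    "Landmark(2)=http://206.117.37.4:3355, Client=134.79.104.80, "
    "target=134.79.18.188, ability=0, 10 packets transmitted, 10 received, "
    "0% packet loss, rtt min/avg/max = 9.485/9.6527/10.007<br>"
)
record = parse_line(sample)
print(record)
```

Records without a ping summary, such as the "failed to connect" lines, make parse_line return None, so they can be counted separately as failures.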
    

tulip-tuning2.pl

Using the Tulip log analysis script's results for the last day, for both enabled and disabled landmarks, we can identify the hosts and their success percentages. We opted to disable all the enabled hosts with a success rate below 20%, and to enable the disabled ones with a success rate above 35%.
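
The enable/disable rule can be written down as a small decision function. This is an illustrative Python sketch rather than the logic of the Perl script itself: it takes 20% as the disable threshold and 35% as the re-enable threshold, and the constant and function names are hypothetical.

```python
DISABLE_BELOW_PCT = 20.0  # enabled landmarks below this success rate are disabled
ENABLE_ABOVE_PCT = 35.0   # disabled landmarks above this success rate are re-enabled


def next_state(enabled, success_pct):
    """Return True if the landmark should be enabled after tuning."""
    if enabled:
        # An enabled landmark stays enabled unless it has fallen below 20%.
        return success_pct >= DISABLE_BELOW_PCT
    # A disabled landmark comes back only once it is above 35%.
    return success_pct > ENABLE_ABOVE_PCT


print(next_state(True, 12.5))   # enabled, low success rate
print(next_state(False, 30.0))  # disabled, between the two thresholds
```

With the two thresholds apart, a landmark whose success rate sits between 20% and 35% keeps whatever state it already has, so it does not flip on every run.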

The tulip-tuning2.pl script is in /afs/slac.stanford.edu/package/pinger/tulip. It uses the Perl LWP package to call reflector.cgi?function=analyze&days=1&ability=[1|0] to access the Tulip analyzed log data for the last day. It downloads the analyzed Tulip log file by requesting it from the reflector, using a URL of the form http://www-wanmon.slac.stanford.edu/cgi-wrap/reflector.cgi?function=analyze&days=3, saves it in a file, and then parses the output. With option ability=1 it finds the faulty landmarks (i.e. the enabled ones with a success rate below 20% by default) and updates the Tulip database to disable them; with option ability=0 it finds the disabled landmarks with a success rate greater than 20% and re-enables them in the database.
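
As an illustration of the request described above, the query string can be assembled as follows. This is a Python sketch only (the real script uses Perl LWP): it builds the URL from the parameters named in the text but does not fetch it, since the format of the analyzed output is not described here. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Reflector CGI endpoint as given in the text above.
REFLECTOR_URL = "http://www-wanmon.slac.stanford.edu/cgi-wrap/reflector.cgi"


def analyze_url(days, ability):
    """Build the analyze request for enabled (1) or disabled (0) landmarks."""
    if ability not in (0, 1):
        raise ValueError("ability must be 0 (disabled) or 1 (enabled)")
    query = urlencode({"function": "analyze", "days": days, "ability": ability})
    return f"{REFLECTOR_URL}?{query}"


url = analyze_url(1, 1)
print(url)
```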

It (tulip-tuning2.pl) is run twice nightly (see the trscrontab): once to disable non-working landmarks (those enabled landmarks that have fallen below 20% success), and once to enable landmarks that are now working again (those disabled landmarks that now have success above 35%).

Code Block
/afs/slac/package/pinger/tulip/tulip-tuning2.pl -d 1 -a disabled

It (tulip-tuning2.pl) must be run on a host in the 134.79/16 address space (i.e. a machine on the SLAC network), and is run before sites.xml or sites-disabled.xml are created by http://www-dev.slac.stanford.edu/cgi-wrap/scriptdoc.pl?name=create_sites-xml.pl.
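
To check beforehand whether a given host address falls inside 134.79/16, a short test with Python's standard ipaddress module is enough. This is a convenience sketch, not something the script itself is known to do; the function name is hypothetical.

```python
import ipaddress

# 134.79/16 written out in full CIDR notation.
SLAC_NET = ipaddress.ip_network("134.79.0.0/16")


def in_slac_space(addr):
    """Return True if the dotted-quad address lies inside 134.79/16."""
    return ipaddress.ip_address(addr) in SLAC_NET


print(in_slac_space("134.79.104.80"))  # client address from the logs above
print(in_slac_space("206.117.37.4"))   # a landmark address from the logs above
```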

...

You can run 

Code Block
tulip-tuning2.pl -d 1 -a disabled --debug 0

from the command line to see how it matches the landmarks in the TULIP database with the log to find ones above the threshold and enable them.

tier0-tuning.pl

After Vtrace, it was observed that all working tier0 landmarks were being disabled by tulip-tuning2.pl. This was due to routers that don't respond to pings: according to the reflector logs (tulip-log-analyze.pl) these landmarks appeared to be down, even though in reality the targets were causing the pings to fail.

To overcome this issue, tier0-tuning.pl was written. It uses only slac.stanford.edu as the target when deciding which tier0 landmarks should be enabled or disabled. Instead of relying on the Tulip log, the script calls the reflector directly and parses the output to make its decision. This rules out targets as the cause of ping failures.

The script is at:

/afs/slac/package/pinger/tulip/tier0-tuning.pl

tulip-dup.pl

There are several cases where we have more than one landmark at the same geographic location. Having more than one active landmark at any location just results in additional geolocation time without improving accuracy. This script finds all the enabled landmarks that have the same geographic location and disables all except one. This script runs every night around 2am from trscrontab.
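
The de-duplication idea can be sketched as follows. This Python example is illustrative only: the tuple layout, the host names and the coordinates are hypothetical, it treats two landmarks as co-located only when their coordinates match exactly, and it keeps the alphabetically first name, which may not be how tulip-dup.pl chooses the survivor.

```python
from collections import defaultdict


def duplicates_to_disable(landmarks):
    """landmarks: iterable of (name, lat, lon, enabled). Return names to disable."""
    by_location = defaultdict(list)
    for name, lat, lon, enabled in landmarks:
        if enabled:  # only enabled landmarks compete for a location
            by_location[(lat, lon)].append(name)
    to_disable = []
    for names in by_location.values():
        # Keep one landmark per location (here: the first by name), drop the rest.
        to_disable.extend(sorted(names)[1:])
    return sorted(to_disable)


# Hypothetical landmark records for illustration.
landmarks = [
    ("lm-a.example.org", 37.42, -122.20, True),
    ("lm-b.example.org", 37.42, -122.20, True),
    ("lm-c.example.org", 46.23, 6.05, True),
    ("lm-d.example.org", 46.23, 6.05, False),
]
result = duplicates_to_disable(landmarks)
print(result)
```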

...

Code Block
/afs/slac/package/pinger/tulip/tulip-dup.pl

After running the above scripts

To generate the sites xml files it is necessary to run:

...