After extensive testing we concluded that Planet Lab nodes are highly unpredictable in terms of availability. For every access by TULIP we log the response from each landmark. From this log we generate a list of nodes with their success and failure percentages and the reasons for their failures. These percentages are generated by /afs/slac/package/netmon/tulip/tulip-log-analyze.pl and the results can be seen here
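The per-landmark percentages described above can be sketched as follows. This is a minimal Python illustration, not the logic of tulip-log-analyze.pl: the record format `(landmark, succeeded, reason)` and the function name `summarize` are assumptions made for the example.

```python
from collections import Counter, defaultdict


def summarize(records):
    """Summarize success/failure percentages per landmark.

    `records` is an iterable of (landmark, succeeded, reason) tuples.
    This record shape is hypothetical; the real TULIP log format may differ.
    """
    totals = Counter()
    successes = Counter()
    reasons = defaultdict(Counter)

    for landmark, succeeded, reason in records:
        totals[landmark] += 1
        if succeeded:
            successes[landmark] += 1
        else:
            # Count how often each failure reason occurred for this landmark.
            reasons[landmark][reason] += 1

    summary = {}
    for landmark, total in totals.items():
        success_pct = 100.0 * successes[landmark] / total
        summary[landmark] = {
            "success_pct": success_pct,
            "failure_pct": 100.0 - success_pct,
            "failure_reasons": dict(reasons[landmark]),
        }
    return summary
```

Called with a list of such records, `summarize` returns one entry per landmark holding both percentages and a count of each failure reason.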

 tulip-tuning.pl

Using the results from the analysis script, we can identify the hosts and their success percentages. We opted to disable all hosts with a success rate below 20%. This script is in $tulipdir and performs these functions: it uses the LWP package to access the web page, download the file, and then parse the output to find the faulty landmarks.
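The 20% cut-off can be illustrated with a short Python sketch. It assumes the analysis output has already been downloaded and parsed into a mapping from host to success percentage. That mapping and the name `landmarks_to_disable` are hypothetical, and the real script does its fetching and parsing in Perl with LWP.

```python
def landmarks_to_disable(success_pct, threshold=20.0):
    """Return the hosts whose success percentage is strictly below `threshold`.

    `success_pct` maps host name -> success percentage (0 to 100).
    The default threshold of 20.0 mirrors the 20% cut-off chosen above.
    """
    return sorted(host for host, pct in success_pct.items() if pct < threshold)
```

A host at exactly 20% is kept, since the rule is "less than 20%".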

  landmark-laundering.pl

After solving the clean-up problem, we faced another question: would any of those hosts come back, and if so, how would we know? Should we disable them forever, or should we build a mechanism to bring them back? To solve this problem we devised a fairly straightforward notification process that helps us identify the landmarks which are up.

This script performs the following actions to determine whether each landmark is up or down.

  1. It pings the host name of each disabled landmark.
    1. If the ping is successful, it marks the landmark as up.
  2. Otherwise it checks whether the IP address corresponding to that host is pingable.
    1. If the IP address replies, it marks the landmark as up and notes that the IP address responded.
    2. If the ping fails, it tries to connect to a port. We added this step because many hosts block ping requests.
  3. If all of the above steps fail, it marks the host as not responding.
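The fallback order in the steps above can be sketched in Python. This is an illustration under assumptions, not the code of landmark-laundering.pl: the function names, the status strings, the use of the `ping` command on a Unix-like system, and probing the IP address (rather than the host name) for the port check are all choices made for the example. The port number is not stated on this page, so it is left as a required parameter.

```python
import socket
import subprocess


def ping_once(target, timeout=5):
    """Send one echo request with the system `ping` (assumes a Unix-like OS)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", target],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def tcp_connect(target, port, timeout=5):
    """Return True if a TCP connection to (target, port) can be opened."""
    try:
        with socket.create_connection((target, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_landmark(hostname, ip_addr, ping, connect):
    """Apply the probes in order and return (status, detail).

    `ping` and `connect` are callables taking a target and returning a bool,
    so the decision logic can be exercised without touching the network.
    """
    if ping(hostname):
        return ("up", "host name answered ping")
    if ping(ip_addr):
        return ("up", "IP address answered ping")
    if connect(ip_addr):
        return ("up", "port connection succeeded")
    return ("not responding", "all probes failed")
```

A real run might call `check_landmark(host, ip, ping_once, lambda t: tcp_connect(t, port))` with whatever port the landmark is expected to listen on.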

To support these steps we modified the tulip database by creating a new table named maintenance. The description of the table is as follows.

+----------+-------------+------+-----+---------+-------+
| Field    | Type        | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| ipv4Addr | varchar(15) |      | PRI |         |       |
| downDays | int(11)     | YES  |     | NULL    |       |
| upDays   | int(11)     | YES  |     | NULL    |       |
| comments | varchar(11) | YES  |     | NULL    |       |
+----------+-------------+------+-----+---------+-------+
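To make the table layout concrete, here is a small Python sketch that recreates the same columns in an in-memory SQLite database and records one check result per call. SQLite stands in for the tulip database only so the example runs on its own. The page does not say how downDays and upDays are updated, so treating them as counters incremented once per check is an assumption, as are the helper names.

```python
import sqlite3


def open_maintenance_db():
    """Create an in-memory stand-in for the maintenance table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE maintenance ("
        " ipv4Addr VARCHAR(15) NOT NULL PRIMARY KEY,"
        " downDays INTEGER,"
        " upDays INTEGER,"
        " comments VARCHAR(11))"
    )
    return conn


def record_check(conn, ipv4_addr, is_up, comment=None):
    """Record one check result for a landmark.

    Assumption: upDays/downDays count checks on which the landmark was
    found up or down; the actual semantics are not documented here.
    """
    # Make sure a row exists before incrementing a counter.
    conn.execute(
        "INSERT OR IGNORE INTO maintenance (ipv4Addr, downDays, upDays)"
        " VALUES (?, 0, 0)",
        (ipv4_addr,),
    )
    # The column name comes from a fixed pair, never from user input.
    column = "upDays" if is_up else "downDays"
    conn.execute(
        f"UPDATE maintenance SET {column} = {column} + 1, comments = ?"
        " WHERE ipv4Addr = ?",
        (comment, ipv4_addr),
    )
    conn.commit()
```

Keying on ipv4Addr matches the primary key shown in the table description, so each landmark address has at most one row.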