Purpose
The main purpose of the TULIP Central Reflector is to proxy TULIP queries to PlanetLab's Scriptroute service. It may also be extended to issue all queries; this decision will depend on speed of execution and security, among other things. The PlanetLab Scriptroute service provides a cookie that works for a single IP address only, so all requests are issued from the Central Reflector and the responses are sent back to the TULIP JNLP client. Here is a map of PlanetLab servers.
Implementation
The TULIP Central Reflector will be a CGI script (reflector.cgi) deployed at SLAC. The TULIP client will issue a single request and the Reflector will probe all the landmarks in that region and return the results to the TULIP client. Probing the target site from more vantage points may give us a better estimate of its location.
Requirements
- Should it fetch sites.txt or have a local copy of sites.txt? What changes should be made to sites.txt?
- A new parameter should be added to sites.txt to indicate tier0 or tier1. Also the region of tier1 sites needs to be specified in sites.txt.
- A separate thread should be used for each landmark, and semaphores should be used for locking so that data from different threads do not intermix.
- There should be a limit on the number of threads that can be launched at a time (say 10).
- Should there be extra logging on the reflector, or can we rely on the standard web logs, which record each query including the time stamp and the client name? What else they log depends on whether the request is a GET or a POST.
- Where are the results parsed? It could be in the reflector or in the Java client. Parsing in the client distributes the parsing load, reduces the load on the reflector, and simplifies the CGI script.
- What should happen if a landmark responds with bad data? Should the reflector process the error or send the raw data back? Since there will be some anomalies, I suspect the reflector will need to return the full response and in any case inform the user, so initially the client will process the response and spot errors. Also, if the client parses the result it will probably be able to spot problems easily.
- Special consideration for security, as the script ultimately has to be deployed at SLAC (Perl taint option, warning option, the 3-parameter open method, etc.).
- Need to agree on a common format for the exchange of data.
- Needs a blacklisting mechanism for malicious hosts.
After discussing with Yee and Booker it was clear that forks may be too complicated. The version of Perl at SLAC did not support threading, and the security people will not allow forks running inside a CGI script. So I had to come up with an alternative: asynchronous I/O. A batch of requests can be sent to the landmarks without waiting for the responses. The LWP::Parallel library provides all this functionality; it supports asynchronous I/O. Currently it is not installed, so I am using a local copy in my home directory. Ultimately this module has to be installed on the production server.
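A minimal sketch of the asynchronous I/O approach with LWP::Parallel::UserAgent (the URL list and the limits shown are illustrative, not the production values):

use strict;
use LWP::Parallel::UserAgent;
use HTTP::Request;

# Sample landmark IPs taken from newsites.txt; the real list is read from file
my @landmark_urls = map { "http://$_/" } qw(128.6.192.158 137.226.138.154);

my $pua = LWP::Parallel::UserAgent->new();
$pua->max_hosts(20);   # talk to up to 20 landmarks at once
$pua->max_req(5);      # parallel requests per landmark

foreach my $url (@landmark_urls) {
    $pua->register(HTTP::Request->new(GET => $url));
}

# Block until every request has completed or the timeout (2 seconds) expires
my $entries = $pua->wait(2);
foreach my $entry (values %$entries) {
    my $response = $entry->response;
    print $response->request->url, ' => ', $response->code, "\n";
}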
I have implemented most of the functionality and the script is running fine. I will have to take measures to make the script more secure, so that it cannot be used as a platform to launch DDoS attacks, by limiting the number of concurrent reflector.cgi processes to 10. The script also produces customized messages (such as "request timed out" or "connection failed") so that the TULIP client can differentiate between the various kinds of error conditions. There is also a blacklisting mechanism so that particular IP addresses can be blocked.
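As a sketch, the blacklist check could look like the following (the blacklist file name and location are hypothetical; the actual mechanism in reflector.cgi may differ):

use strict;

# Hypothetical blacklist file: one blocked IP address per line
my $blacklist = '/afs/slac/www/comp/net/wan-mon/tulip/blacklist.txt';
my $client    = $ENV{REMOTE_ADDR} || '';

open(my $bl, '<', $blacklist) or die "cannot open $blacklist: $!";
while (my $ip = <$bl>) {
    chomp $ip;
    if ($ip eq $client) {
        print "Content-type: text/plain\n\n";
        print "Access denied\n";
        exit;
    }
}
close($bl) or die "cannot close $blacklist: $!";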
Implementation
There are two scripts: reflector.cgi and EventHandler.pm. Both use taint mode (-T), warnings (-w), use strict, and the 3-parameter version of open, and all opens and closes have a die or its equivalent. EventHandler.pm is called by reflector.cgi. The scripts are deployed in /afs/slac.stanford.edu/g/www/cgi-wrap-bin/net/shahryar/smokeping/.
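In skeleton form these conventions look as follows (the file read here is just for illustration):

#!/usr/bin/perl -wT
use strict;

# 3-parameter open; every open and close checked with die
open(my $fh, '<', '/afs/slac/www/comp/net/wan-mon/tulip/sites.txt')
    or die "cannot open sites.txt: $!";
while (my $line = <$fh>) {
    # process $line ...
}
close($fh) or die "cannot close sites.txt: $!";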
Invocation
The reflector script is called by a URL of the form:
http://www-wanmon.slac.stanford.edu/cgi-wrap/reflector.cgi?region=northamerica&target=134.79.16.9&tier=0&type=planetlabs
Leaving out the region will assume all regions, leaving out the tier will assume all tiers, and leaving out the type will assume both PlanetLab and SLAC type landmarks. If the region is included then only landmarks in that region will be used; if the tier is specified then only that tier's landmarks will be used; if the type is specified then only that type of landmark will be used. Any or all of the tier, region and type may be specified as "all".
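A minimal sketch of how these parameter defaults could be applied with the standard CGI module (not necessarily how reflector.cgi does it):

use strict;
use CGI;

my $q = CGI->new;
my $target = $q->param('target') or die "target is required";
my $region = $q->param('region') || 'all';
my $type   = $q->param('type')   || 'all';
# tier may legitimately be 0, so test for definedness rather than truth
my $tier   = defined $q->param('tier') ? $q->param('tier') : 'all';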
The script uses asynchronous I/O to talk simultaneously with up to 20 landmarks. Up to 5 copies of reflector.cgi can be running simultaneously.
For the PlanetLab landmarks an interpretive script (also see here for the original) is supplied with the target ($target) and the number of pings ($ping) to make. For the SLAC ping servers and the Looking Glass sites the landmark is accessed by a URL provided in the sites.txt file (see below) in the PingSites token (e.g. http://www.slac.stanford.edu/cgi-wrap/nph-traceroute.pl?choice=yes&function=trace&target=$target).
Files
The list of SLAC and Looking Glass landmarks is read from /afs/slac/www/comp/net/wan-mon/tulip/sites.txt. The format is space-separated tokens:
SNo Site_name PingSite TraceSite Lat Long Reference_Site Alpha
For example (where \ means the line is broken for viewing):
1 SLAC,Stanford_US http://www.slac.stanford.edu/cgi-wrap/traceroute.pl?function=ping&target= \ http://www.slac.stanford.edu/cgi-wrap/nph-traceroute.pl?choice=yes&function=trace&target= \ 39.32 -122.04 www.slac.stanford.edu 73
The list of PlanetLab landmarks is read from: /afs/slac.stanford.edu/www/comp/net/wan-mon/tulip/TULIP/newsites.txt. It appears as:
Piscataway_UnitedStates_PL 128.6.192.158 40.5516 -74.4637 orbpl1.rutgers.edu northamerica
Aachen_Germany_PL 137.226.138.154 50.7708 6.1053 freedom.informatik.RWTH-Aachen.DE europe
Winnipeg_Canada_PL 198.163.152.230 49.8833 -97.1668 planetlab2.win.trlabs.ca northamerica 0
Anything following a # sign is ignored (it is a comment).
The tokens are space delimited; they are:
- Location: in the form City_Country_Type (the only Type currently is PL=PlanetLab)
- IP Address
- Latitude
- Longitude
- IP Name
- Region (possible regions with PlanetLab hosts are northamerica, eastasia, europe)
- Tier: currently may be 0 or 1; if not provided, tier 1 is implied
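A sketch of parsing this format (the field handling follows the token list above; variable names are illustrative):

use strict;

my $newsites = '/afs/slac.stanford.edu/www/comp/net/wan-mon/tulip/TULIP/newsites.txt';
open(my $fh, '<', $newsites) or die "cannot open $newsites: $!";
while (my $line = <$fh>) {
    $line =~ s/#.*//;            # anything following a # sign is a comment
    next unless $line =~ /\S/;   # skip blank lines
    my ($location, $ip, $lat, $long, $name, $region, $tier) = split ' ', $line;
    $tier = 1 unless defined $tier;   # tier 1 is implied when not provided
    # ... store the landmark for later selection by region/tier ...
}
close($fh) or die "cannot close $newsites: $!";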
Next version
This is still in design/discussion. We want to remove, where possible, the overloading of tokens, make the file more amenable to adding new tokens, and tie in better to the PingER NodeDetails database. Our first stab at the template for the new XML file can be found here. An initial version will be created from the original PlanetLab and SLAC sites.txt files:
my @files = ("/afs/slac.stanford.edu/www/comp/net/wan-mon/tulip/newsites.txt", # PlanetLabs
             "/afs/slac.stanford.edu/www/comp/net/wan-mon/tulip/sites.txt");   # SLAC
We will also take this opportunity to clean up the data, for example making sure that the City and Country appear, that elements are in the right order, etc.
The intent is that the tier and alpha values will be added to the NodeDetails database later, and the database will then become the authoritative source from which the XML is created.
It is unclear whether we need to supply alpha or whether it should be determined by the program itself, so we may need a value that indicates it is to be left up to the program.
When we have the new XML file we can modify reflector.cgi, using the Perl XML::LibXML module, to read in the XML file. The second step will be to upgrade the NodeDetails database to provide the new elements.
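Since the XML template is still in design, the file name and element names below are assumptions; the sketch only illustrates how XML::LibXML could read such a file:

use strict;
use XML::LibXML;

my $parser = XML::LibXML->new;
# hypothetical file name, pending the final template
my $doc = $parser->parse_file('/afs/slac.stanford.edu/www/comp/net/wan-mon/tulip/landmarks.xml');

# 'landmark', 'ip' and 'region' are assumed element names, not the final template
foreach my $node ($doc->findnodes('//landmark')) {
    my $ip     = $node->findvalue('./ip');
    my $region = $node->findvalue('./region');
    # ... select landmarks by region/tier as before ...
}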
Conventions
We use the following conventions:
- Country names are defined by the Mapland database since this is used to produce our maps and we cannot modify it. Usually (but not always) it is in agreement with UN standards.
- The country names (and regions) in the PingER database can be found here.
- Regions are defined as given in the PingER NodeDetails Oracle database.
Deployment of Landmarks
There are about 60 SLAC/Looking Glass landmarks and about 156 PlanetLab landmarks. We are working on filtering the latter to remove non-responding or poorly responding landmarks. The PlanetLab landmarks send 10 pings very quickly, whereas the SLAC/Looking Glass landmarks send five 56-byte pings with one second between them; they will also wait a deadline of 30 seconds for pings to be replied to.
You can see a Google map of the PlanetLab and SLAC/Looking Glass landmarks by clicking below or by going to: http://www.slac.stanford.edu/comp/net/wan-mon/viper/tulip_googlemap.htm
Tiering
To reduce the network impact and the time to an initial rough estimate, we also break the landmarks into two tiers. Tier0 landmarks are used to identify the region of the target; the tier1 hosts for that region can then be used to locate the target more exactly. Tier0 hosts are chosen as being at the edges of the region, well connected, highly reliable and quick to respond. We currently only define tier0 hosts for North America and Europe; in other regions all the landmarks are regarded as tier0. There are about 8 tier0 hosts for North America and 4 for Europe. This greatly reduces the number of landmarks measured from in a tier0 request, since over 100 landmarks are in North America or Europe.
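A sketch of the two-phase flow as seen from the client (issue_request and estimate_region are hypothetical helpers, shown only to make the sequence concrete):

# Phase 1: tier0 landmarks everywhere give a rough position and a region
my $rough  = issue_request(target => $target, region => 'all', tier => 0);
my $region = estimate_region($rough);   # hypothetical: triangulate from tier0 RTTs

# Phase 2: tier1 refinement only exists for North America and Europe
if ($region eq 'northamerica' or $region eq 'europe') {
    my $detail = issue_request(target => $target, region => $region, tier => 1);
    # ... refine the location estimate with the tier1 results ...
}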
Responses
The responses appear as:
Landmark=http://128.6.192.158, Client=134.79.117.29, failed to connect response code 500
Landmark=http://141.149.218.208, Client=134.79.117.29, 10 packets transmitted, 0 received, 100% packet loss, rtt min/avg/max = 0/0/0
Landmark=http://128.193.33.7, Client=134.79.117.29, 10 packets transmitted, 10 received, 0% packet loss, rtt min/avg/max = 29.178/29.2495/29.316
Landmark=http://pinger.fnal.gov/cgi-pub/traceroute.pl?function=ping&target=134.79.16.9, Client=134.79.117.57, 5 packets transmitted,\
5 received, 0% packet loss, rtt min/avg/max = 52/52/53
The first three responses are from PlanetLab landmarks and the last is from a SLAC type landmark.
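If the client parses these summary lines itself (as suggested under Requirements), a regular expression along these lines could do it (a sketch, not the client's actual code):

use strict;

while (my $line = <STDIN>) {   # e.g. reflector output piped in
    if ($line =~ m{^Landmark=([^,]+),\s+Client=([^,]+),\s+
                   (\d+)\s+packets\s+transmitted,\s+(\d+)\s+received,\s+
                   (\d+)%\s+packet\s+loss,.*?
                   rtt\s+min/avg/max\s+=\s+([\d.]+)/([\d.]+)/([\d.]+)}x) {
        my ($landmark, $client, $sent, $recv, $loss, $min, $avg, $max)
            = ($1, $2, $3, $4, $5, $6, $7, $8);
        # lines that do not match are error reports and need separate handling
    }
}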
Errors Reported by PlanetLab
Failed to connect to http://129.22.150.90 response code 500
ERROR: you're (134.79.18.134) already running a measurement on socket 14. http://128.83.122.179
10 packets transmitted, 0 received, 100% packet loss, time 0 ms rtt min/avg/max = 0/0/0 http://141.149.218.208
Can't resolve DNS: submitted:6:in `ip_dst=': unable to resolve $target: running in a chroot without dns support (RuntimeError)
submitted:9: warning: didn't see packet 5 leave: pcap overloaded or server bound to incorrect interface?
To 134.79.16.9 timed out
Error connecting: Connection refused
ERROR: you need a valid scriptroute authentication cookie to use this server, or the cookie you used does not match your client IP 134.79.18.163; go to http://www.scriptroute.org/cookies.html to get one.
ERROR: you're (134.79.18.134) already running a measurement on socket 10.
PlanetLab Server Error: Received: IP (tos 0xc0, ttl 253, id 51592, offset 0, flags [none], length: 56)
192.70.187.218 > 198.82.160.220: icmp 36: time exceeded in-transit
Error connecting: No buffer space available
submitted:9:in `send_train': scriptrouted error: unable to send to 137.138.137.177: No buffer space available (ScriptrouteError)
Logging
In addition to the normal web server (Apache) logging, we use Log4perl for logging. The configuration file is very simple. The following types of error messages can be found in the log file.
2007/09/03 20:02:25 ERROR> EventHandler.pm:70 EventHandler::on_failure - Landmark=http://128.6.192.158, Client=134.79.117.29, failed to connect response code 500<BR>
2007/09/03 20:02:34 ERROR> EventHandler.pm:142 EventHandler::parseData - Landmark=http://129.22.150.90, Client=134.79.117.29, 10 packets transmitted, 0 received, 100% packet loss, rtt min/avg/max = 0/0/0:
2007/09/03 20:09:09 ERROR> EventHandler.pm:115 EventHandler::parseData - Landmark=http://128.143.137.250, Client=134.79.117.29, request timed out: To 134.79.16.9 timed out
Plus unusual PlanetLab errors of the form:
2007/09/03 20:02:58 ERROR> EventHandler.pm:125 EventHandler::parseData - Landmark=http://128.4.36.11, Client=134.79.117.29, <planetLab error message, see section "Errors Reported by PlanetLab">
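For reference, a Log4perl configuration file that would produce records in the format above is roughly as follows (the log file path is an assumption; the conversion pattern is inferred from the sample records):

log4perl.rootLogger = ERROR, LOGFILE
log4perl.appender.LOGFILE = Log::Log4perl::Appender::File
log4perl.appender.LOGFILE.filename = /tmp/reflector.log
log4perl.appender.LOGFILE.layout = Log::Log4perl::Layout::PatternLayout
log4perl.appender.LOGFILE.layout.ConversionPattern = %d{yyyy/MM/dd HH:mm:ss} %p> %F{1}:%L %M - %m%n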
There is a script at ~cottrell/bin/tulip-analyze-log.pl to analyse the logs. Typical output appears as:
28cottrell@wanmon:~>bin/tulip-log-analyze.pl
===============Failure types by landmark =======================
Landmark, Success, 100%_loss, connect_fail, not_sent, timeout, refused, in_use, no_name, transit_exc, Totals,
143.225.229.236, 100.0%, 0.0%, 0.0%, 0.0%, 0.0%, 0.0%, 0.0%, 0.0%, 0.0% 1,
149.48.230.20, 40.0%, 37.8%, 2.2%, 0.0%, 11.1%, 0.0%, 0.0%, 8.9%, 0.0% 45,
...
itchy.cs.uga.edu_PL, 0.0%, 15.4%, 84.6%, 0.0%, 0.0%, 0.0%, 0.0%, 0.0%, 0.0% 26,
Landmark, Success, 100%_loss, connect_fail, not_sent, timeout, refused, in_use, no_name, transit_exc, Totals,
Totals, 2258, 422, 457, 11, 401, 0, 52, 287, 0, 3888
Wed Oct 3 14:58:38 2007 tulip-log-analyze.pl: took 38 seconds to analyze 4378 records for 323 requests.
Successful hosts=111, Failing hosts=108, PlanetLabs=128(100% success=26), SLACs=16(100% success=10)
As we review the logs we will determine whether probing from some landmarks is reliable enough to warrant their use.
Landmark Failures
The typical failure mechanisms for the target www.cern.ch with timeouts of 2 and 10 seconds, measured in the evening (PDT) of September 8th 2007, are seen in the table below. The multiple numbers in each cell are for different requests. It is seen that increasing the timeout from 2 to 10 seconds does not provide much, if any, help, so we use a timeout of 2 seconds.
Failure type | 2 sec timeout | 10 sec timeout
---|---|---
100% loss | 7, 10, 9 | 10, 11, 8
Success | 22, 16, 17 | 20, 14, 21
Fail to connect | 10, 8, 9 | 8, 6, 5
Timeout | 45, 50, 49 | 46, 51, 46
Performance
Some spot measures of performance, for 10 pings per target and 86 PlanetLab landmarks with region=northamerica, as we vary the number of landmarks accessed simultaneously, the number of parallel requests per landmark, and the timeout for each request, give the durations below (n.b. there is a timeout of 100 seconds on the complete process, and the default values are in boldface in the table):
Simultaneous landmarks | Parallel requests / landmark | Request timeout (secs) | Duration (secs)
---|---|---|---
**20** | **5** | **2** | 50
**20** | **5** | 10 | 60
10 | **5** | **2** | 88
40 | **5** | **2** | 34
**20** | 10 | **2** | 50
Testing
It can be tested by entering the URL (e.g. from a web browser or with wget), e.g.
http://www-wanmon.slac.stanford.edu/cgi-wrap/reflector.cgi?region=northamerica&target=134.79.16.9
It can also be called from the command line, e.g.
>setenv REMOTE_ADDR 134.79.18.134; perl -d -T bin/reflector.cgi "region=northamerica&target=134.79.16.9&tier=0&type=slac"
However, unless you have a PlanetLab cookie for your host, it will not fully work.
Some hosts mis-identified by Geo IP tools and VisualRoute include: www.cst.edu.ve
Security
There are several issues related to security.
Landmark Server
The SLAC traceroute server that is frequently used as a landmark:
- rejects attempts to traceroute to a broadcast address;
- does not allow a remote host name to be greater than 255 characters, to prevent buffer overflow attempts;
- does not allow a remote host in a different domain to do a traceroute to a host within the same domain as the web server;
- limits the maximum number of traceroute processes running in the server, to reduce the chance of a denial of service;
- starts the traceroute after 3 hops if the client/browser and server are in different domains, in order to hide internal routing information from outsiders;
- has a blacklist of sites that are blocked.
Tulip Client
TULIP only allows one copy of the client to run on a client host. TULIP also hides the URLs used for the landmarks to reduce the possibility of people gleaning the URLs for a denial of service attack. Editing the landmark URLs requires a password known only to the developers.
Log
There is a centralized log with time stamped records of all requests, the requesting host, and the target. This is analyzed for abusers.
Scanning and Denial of Service
A major concern is that the target is pinged simultaneously from multiple landmarks. When the target host responds to the ping requests this can look like a scan of multiple hosts. It can also look like a denial of service attack, especially for hosts with limited available bandwidth, such as are found in developing countries. We thus limit the number of pings from a landmark to a target to 5.
I doubt the early version triggered the alert. It had < 60 landmarks, and of these I am guessing (TULIP is down at the moment) about 10-20 did not work (i.e. respond to the request to ping). However, recently we added 149 PlanetLab hosts. The net result is that with the current version 39 PlanetLab landmarks answer 100% of the time, 39 answer sometimes, and the rest are either not requested or never answer (as far as I can tell this means they are not pinging, i.e. they are not responding to the request to make the pings). The typical number of PlanetLab hosts trying to ping is about 60 (of these about 10 fail with 0 pings responding).
We are working on two things to reduce the number of landmarks pinging at a time.
- Remove landmarks which are not 100% reliable and whose function is replicated by another landmark (e.g. a nearby working one).
- We are also looking at tiering the landmarks (see Tiering above) so as to tier the N. American and European hosts. The top tier enables us to locate the region of the world, and the second tier can then be used to find the location within that region. This reduces the number of landmarks used and divides them in time into two or more sets. Most landmarks are in N. America or Europe (136 out of 149 for PlanetLab and 26 out of 63 for the SLAC type landmarks). So for tier0 landmarks we use 5 sites in North America, 3 in Europe and all 32 sites outside N. America and Europe. The tier0 sites are requested first, to provide the area the host is in and a rough estimate of position. Thus there are currently 5+32+3 tier0 landmark requests (this will be reduced when we remove unreliable landmarks, see above). The client can then request more detailed information on the host if it is in N. America or Europe.
Other Concerns
We have also considered whether the knowledge that a machine, and possibly its usual owner, can be accurately located may raise privacy issues. This may require us to add some fuzz to the results. So far this has not been done.
Sample Scripts
traceroute.pl: This script has been written with special security considerations, so it will help in implementing reflector.cgi.
topology.pm: This is a multi-threaded script written by Yee, so it will help in understanding the threading issues in Perl, which are a bit complex.