...

  • Cannot find chicr5 or chiccr5 for tld=ZA(Africa) in response to traceroute  -m 30 -q 1 -w 1 www.dut.ac.za

    Cannot find chicr5 or chiccr5 for tld=ZA(Africa) in response to traceroute  -m 30 -q 1 -w 1 www.museumsnc.co.za

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.ru.ac.za for tld=ZA(Africa)

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 brunsvigia.tenet.ac.za for tld=ZA(Africa)

    Cannot find chicr5 or chiccr5 for tld=AL(Balkans) in response to traceroute  -m 30 -q 1 -w 1 www.geo.edu.al

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.upt.al for tld=AL(Balkans)

    Cannot find chicr5 or chiccr5 for tld=RO(Balkans) in response to traceroute  -m 30 -q 1 -w 1 speed.alienstation.ro

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 ns1.credis.ro for tld=RO(Balkans)

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.fabboya.az for tld=AZ(Central_Asia)

    Cannot find chicr5 or chiccr5 for tld=AZ(Central Asia) in response to traceroute  -m 30 -q 1 -w 1 speedtest.ivory.azstarnet.az

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.democrats.ge for tld=GE(Central_Asia)

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.tdasu.edu.ge for tld=GE(Central_Asia)

    Cannot find chicr5 or chiccr5 for tld=GE(Central Asia) in response to traceroute  -m 30 -q 1 -w 1 www.gdi.ge

    Cannot find chicr5 or chiccr5 for tld=GE(ntld/Central_Asia)=3 in response to traceroute  -m 30 -q 1 -w 1 www.koda.ge

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 tsu.ge for tld=GE(Central_Asia)

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.gsi.de for tld=DE(Europe)

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.physik.rwth-aachen.de for tld=DE(Europe)

    Cannot find chicr5 or chiccr5 for tld=DE(Europe) in response to traceroute  -m 30 -q 1 -w 1 www.kph.uni-mainz.de

    Cannot find chicr5 or chiccr5 for tld=GI(Europe) in response to traceroute  -m 30 -q 1 -w 1 www.attiaslevy.gi

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.fsc.gi for tld=GI(Europe)=21

    Cannot find chicr5 or chiccr5 for tld=GI(Europe) in response to traceroute  -m 30 -q 1 -w 1 www.gibmuseum.gi

    Cannot find chicr5 or chiccr5 for tld=LU(Europe) in response to traceroute  -m 30 -q 1 -w 1 www.cssf.lu

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 ns1.restena.lu for tld=LU(Europe)

    Cannot find chicr5 or chiccr5 for tld=LU(Europe) in response to traceroute  -m 30 -q 1 -w 1 www.rtl.lu

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.nikhef.nl for tld=NL(Europe)

    Cannot find chicr5 or chiccr5 for tld=NL(Europe) in response to traceroute  -m 30 -q 1 -w 1 www.routenet.nl

    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 edepot.wur.nl for tld=NL(Europe)
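A per-TLD summary makes the mixed results above easier to scan. The following is a minimal sketch, assuming the log keeps exactly the two line shapes shown ("Found ... in traceroute ... HOST for tld=XX" and "Cannot find ... for tld=XX ... in response to traceroute ... HOST"). The function name `tally` and the regular expressions are illustrative and are not part of the original tooling.

```python
import re
from collections import defaultdict

# Assumed line shapes, taken from the log excerpt above.
FOUND = re.compile(r"^Found .*? in traceroute\s+.*?\s(\S+) for tld=([A-Z]{2})")
MISSING = re.compile(
    r"^Cannot find .*? for tld=([A-Z]{2}).*? in response to traceroute\s+.*\s(\S+)$"
)

def tally(lines):
    """Group target hosts by TLD into 'found' and 'missing' lists."""
    result = defaultdict(lambda: {"found": [], "missing": []})
    for raw in lines:
        # Drop indentation and the leading bullet, if any.
        line = raw.strip().lstrip("•").strip()
        m = FOUND.match(line)
        if m:
            host, tld = m.groups()
            result[tld]["found"].append(host)
            continue
        m = MISSING.match(line)
        if m:
            tld, host = m.groups()
            result[tld]["missing"].append(host)
    return dict(result)

sample = [
    "  • Cannot find chicr5 or chiccr5 for tld=ZA(Africa) in response to traceroute  -m 30 -q 1 -w 1 www.dut.ac.za",
    "    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.ru.ac.za for tld=ZA(Africa)",
    "    Cannot find chicr5 or chiccr5 for tld=GE(ntld/Central_Asia)=3 in response to traceroute  -m 30 -q 1 -w 1 www.koda.ge",
    "    Found chicr5 or chiccr5 in traceroute  -m 30 -q 1 -w 1 www.fsc.gi for tld=GI(Europe)=21",
]
summary = tally(sample)
for tld, hosts in sorted(summary.items()):
    print(tld, "found:", len(hosts["found"]), "missing:", len(hosts["missing"]))
```

Within a single TLD, some targets show the Chicago router and others do not, which is the pattern the explanation below attributes in part to ECMP.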

Partial explanation from Dale W. Carder <dwcarder@es.net>

1) You are definitely seeing the effect of ECMP paths through ESnet
particularly from Chicago to Washington.  In ESnet's IGP, the following
paths are equal cost and through traffic is spread over them:
- chic-cr5 -- wash-cr5
- chic-cr5 -- eqx-chi-cr5 -- eqx-ash-cr5 -- wash-cr5

2) I think the timing of this outage also coincided with transit
upgrades underway at Sunnyvale.  Traffic was migrated off a congested
path to AS3356 until a new 10G was put in place.

3) Additionally, I know we recently updated route filters for Geant, and
there was also a long duration outage at CSTNET.  That could have had
some aberrations on international connectivity.

So, Hopefully this helps fill in some blanks in your notes.  I would be
interested in any other international paths where it looks like there
may be sub-optimal route selection.  Often times these are to some
degree manually curated to prefer faster links and at other times the L3
topology doesn't tell the whole story of what is happening at L2.

Signature

The easiest way to detect the effect of such an incident is to look at the number of targets responding each hour and check for a sudden drop. In this case, during UTC hour 14:00-15:00 the number of hosts in Europe responding to pings from SLAC fell to 37, compared with 97 or 98 during the rest of the day, i.e. a drop of roughly 62%.
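The check described above can be sketched in a few lines. This is a rough illustration, not the monitoring code actually used: the function name `find_drops`, the 50% threshold, and the choice of the median of the other hours as the baseline are all assumptions. In the sample data, only the value 37 at hour 14 and the 97 or 98 level come from the text; the hour-by-hour layout is invented for the example.

```python
from statistics import median

def find_drops(hourly_counts, threshold=0.5):
    """Return (hour, count, baseline) for each hour whose responding-host
    count is below `threshold` times the median of the other hours."""
    drops = []
    for hour, count in enumerate(hourly_counts):
        # Baseline excludes the hour under test so an outage cannot mask itself.
        others = hourly_counts[:hour] + hourly_counts[hour + 1:]
        baseline = median(others)
        if baseline > 0 and count < threshold * baseline:
            drops.append((hour, count, baseline))
    return drops

# Illustrative 24-hour series: 97 or 98 hosts respond, except 37 at 14:00 UTC.
counts = [97] * 12 + [98] * 2 + [37] + [98] * 9

for hour, count, baseline in find_drops(counts):
    pct = 100 * (baseline - count) / baseline
    print(f"{hour:02d}:00 UTC: {count} responding vs baseline {baseline} ({pct:.0f}% drop)")
```

Using the median rather than the mean keeps the baseline stable even when the outage hour itself is very low, and the threshold can be tightened for regions where the responding count is normally noisier.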

...