The problem was reported to trouble@es.net and entered as trouble ticket ESNET-20130528-009.

Seen from ANL, 8:03am, 5/29/2013

From Phil Reese:

Poking at the other end, I found another perfSONAR node at ANL collecting throughput data back toward the west coast. See anlborder-ps.it.anl.gov and look at the throughput graphs available there.

From those, it seems there is symmetry to Kansas at least, but throughput degrades as the route moves further west.

From Stanford, we are primarily an I2 site, but it looks like once we get into the CENIC world there is a blending of routes between I2 and ESnet. Not sure if that is a problem or not.


Players

Person            Affiliation   Email address
Michael Sinatra   ESnet         ms@es.net
Phil Reese        Stanford      preese@stanford.edu
Corey Hall        ANL           chall@anl.gov


More information from ESnet (Michael Sinatra), 4:09pm, 5/29/2013

I have been looking at various perfsonar nodes in an effort to track down the issues that SSRL is experiencing with throughput to ANL.

You're correct to note that the routing between the Stanford campus and ANL is asymmetric. CENIC prefers to hand traffic bound for ANL off to ESnet at the 100G peering at Sunnyvale, while ANL prefers the path through MREN directly to Internet2. In other words, the ANL-->Stanford path never touches ESnet. I can also see from the pS toolkit web interface at ANL that there are similar issues between anlborder-ps and CENIC pS machines. This suggests to me that there is a more general issue between anlborder-ps and the rest of the world (lack of queue depth on the immediate upstream switch or router is one possibility). That same issue could be affecting the ANL-->SSRL throughput.

It's a bit harder to see things from the SLAC end. Throughput tests between ESnet's pS boxes at the SLAC ESnet router (slac-pt1) and at the ANL ESnet router (anl-pt1) look really good. You can see the overall picture here:

https://my.es.net/network/performance/bwctl

Things degrade somewhat when we take one step inside the ANL border. I see significantly worse performance (but not horrible on an absolute scale) between slac-pt1.es.net and anlborder-ps.it.anl.gov than between slac-pt1.es.net and anl-pt1.es.net. I also see better performance in the direction toward anlborder-ps than in the opposite direction, but I see really good performance in BOTH directions between slac-pt1 and anl-pt1.

It's harder to see things on the SLAC side because the only perfsonar host is on the Stanford campus and the outbound routing is different between campus and SLAC. Also, I am noticing the same outbound issues between the Stanford pS host and CENIC pS hosts that we are seeing between Stanford and ESnet hosts. It looks like there may be an outbound issue with the Stanford pS host as well.

So I think we need to take two steps here: One is to try to figure out why there seems to be some outbound throughput issue at ANL (at least with their perfsonar box); the other is to get a perfsonar box (even a temporary toolkit box that we can test with) deployed within SLAC, as close to SSRL as possible. That will give us a chance to test different parts of the (almost) end-to-end path.
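
A note on mechanics: bwctl can request a test between two remote bwctld hosts from a third machine, so once a toolkit box is deployed near SSRL it can be exercised without logging into either endpoint. A minimal sketch, assuming both endpoints permit third-party requests (flags as in the tests quoted later in this thread):

$ bwctl -t 30 -i 2 -f m -s slac-pt1.es.net -c anlborder-ps.it.anl.gov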

Yee responded with the SLAC perfSONAR hosts:

Machines at our cores are

psnr-serv01.slac.stanford.edu
psnr-farm04.slac.stanford.edu
psnr-farm10.slac.stanford.edu

We're in the midst of updating some security policies, so I'm not sure if all tests are allowed at the moment.
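
A quick way to check whether the current policies still permit testing is a short one-off run against one of these hosts; if the bwctld control connection (TCP port 4823) is blocked, bwctl will report that it cannot contact the host. A sketch, assuming bwctl is installed on the requesting machine:

$ bwctl -t 10 -f m -c psnr-serv01.slac.stanford.edu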

Response from Stanford, 9:18pm 5/29/2013

First a little clarification. The issue is between a researcher on the Stanford net and a service at ANL. I talked with Les to see about a contact at ESnet, since the traceroutes I can run show a lot of ESnet hops.

This is the traceroute I get from a Stanford PS (outside our firewall infrastructure) to ANL's public PS:

$ traceroute ndt.anl.gov
traceroute to ndt.anl.gov (146.137.222.101), 30 hops max, 40 byte packets
1  frcfa-rtr-vlan817.Stanford.EDU (171.67.92.18)  0.215 ms  0.175 ms 0.157 ms
2  boundarya-rtr-vlan8.SUNet (171.64.255.210)  0.326 ms  0.394 ms 0.457 ms
3  hpra-rtr.Stanford.EDU (171.66.0.33)  0.234 ms  0.236 ms  0.240 ms
4  hpr-svl-hpr2--stanford-10ge.cenic.net (137.164.27.61)  1.105 ms 1.172 ms  1.233 ms
5  hpr-esnet--svl-hpr2-100ge.cenic.net (137.164.26.10)  1.571 ms 1.647 ms  2.232 ms
6  sacrcr5-ip-a-sunncr5.es.net (134.55.40.5)  4.096 ms  3.827 ms 3.833 ms
7  denvcr5-ip-a-sacrcr5.es.net (134.55.50.202)  25.048 ms  25.113 ms 25.369 ms
8  kanscr5-ip-a-denvcr5.es.net (134.55.49.58)  35.620 ms  35.711 ms 35.961 ms
 9  chiccr5-ip-a-kanscr5.es.net (134.55.43.81)  46.647 ms  46.729 ms  46.985 ms
10  starcr5-ip-a-chiccr5.es.net (134.55.42.42)  46.977 ms  46.987 ms  47.242 ms
11  anl-ip-a-anlcr5.es.net (198.124.218.6)  59.069 ms  59.078 ms  59.088 ms
12  * * *
13  * * *

I understand your point about CENIC handing off to ESnet, so transfers from Stanford to ANL are good. If ANL hands traffic straight to I2 back in Illinois, then I can see that the path would be different. Thus the asymmetry. (Do you have a way to get a traceroute from ANL to my Stanford PS server, just to document the different routing?)
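
One hedged option for that reverse traceroute: newer bwctl releases ship a bwtraceroute front end that can request a traceroute from a remote bwctld host, so something like the following would document the ANL-to-Stanford path (assuming the installed bwctl versions include it; otherwise it needs someone with a login at ANL):

$ bwtraceroute -s anlborder-ps.it.anl.gov -c rcf-perfsonar.stanford.edu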

In talking with contacts at ANL, they report that anlborder-ps.it.anl.gov is actually within a firewall. Here are some details they suggest:

ndt.anl.gov – outside the ANL firewall
anlborder-ps.it.anl.gov – behind the ANL firewall
prfsnr.aps.anl.gov – behind the APS firewall within ANL
perfsonar.gmca.aps.anl.gov – on the GMCA (i.e. our group) subnet

Again your comment about traffic to anlborder-ps.it.anl.gov being slower makes sense.

What still has me stumped is this throughput graph from anlborder-ps.it.anl.gov to a SLAC PS:
http://anlborder-ps.it.anl.gov/serviceTest/bandwidthGraph.cgi?url=http://localhost:8085/perfSONAR_PS/services/pSB&key=f900976535c9051c3d9251e0301335c2&sTime=1367294236&eTime=1369886236&keyR=f6122888ee44db6d8691acc7bc37a2dd&src=slac-pt1.es.net&dst=anlborder-ps.it.anl.gov&srcIP=198.129.254.134&dstIP=130.202.222.58

This appears to be all ESnet, yet it is asymmetrical, and more than a firewall would impose, I think.

So my summary is: I can understand the asymmetry between Stanford and ANL given the routing, but is there any hope of getting the two rates (or routes) closer to one another? Somehow have the ANL traffic take the I2 route in both directions (or take ESnet in both directions; it seems like politics to make either of those happen). I ask because the tool the researcher uses is one of those simple NX clients, which doesn't work well when the forward and return speeds are that far apart. Related, but now more informational: why does the ANL-SLAC traffic, all over ESnet, also seem to be asymmetric, for unknown reasons?

Engaging ANL, 1:48pm 5/30/2013

Phil and Les: I am bringing Corey Hall of ANL into this conversation, to assist from the ANL side.

Corey: I am working with Phil Reese of Stanford University and Les Cottrell of SLAC to diagnose a network performance problem between ANL and SLAC. The problem appears most acute in the direction from ANL to Stanford. In doing some perfSONAR tests, we're seeing differences in performance between the ESnet pS node just outside of ANL (normal) and some of the pS nodes within ANL (degraded in the ANL-->Stanford direction). See inline below and I'll send you previous emails from the thread to get you caught up.

Response from ANL with information on ANL perfSONAR hosts, 6:45am 5/31/2013

Hi All:

I will be happy to help out with this. I would like to force all the traffic from Argonne to Stanford to go through ESnet. What are the networks and AS numbers for the Stanford nets?

I think there is some confusion on which perfSONAR boxes are inside and outside our firewall:

anlborder-ps.it.anl.gov – outside all firewalls, connected directly to the Argonne border router (10 Gbps)

ndt.anl.gov – behind the lab-wide firewall (Cisco ASA 5580, 1 Gbps)

perfsonar.gmca.aps.anl.gov – behind the lab-wide firewall (Cisco ASA 5580) and the APS divisional firewall (Sidewinder, 10 Gbps)

I also had trouble running NDT tests to the ndt.anl.gov tester. A reboot seems to have fixed that.

Also, we are seeing some "overrun" errors on the lab-wide firewall on both the inside and outside interfaces. I believe that enabling flow control on the firewall interfaces will correct that issue. I'll make the change request today and implement it early next week.
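
For reference, on an ASA this is a per-interface setting that enables 802.3x pause frames. A minimal sketch; the interface name is hypothetical and the exact syntax varies by platform and software release:

interface TenGigabitEthernet5/0
 flowcontrol send on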

AS numbers and recommendation from ESnet to make routes symmetric, 10:00am 5/31/2013

Here are the prefixes ESnet is getting from Stanford (AS32), via a direct 100G peering with CENIC (AS2153):

128.12.0.0/16      *[] 7w1d 16:06:45, MED 10, localpref 4000,
                   from 134.55.200.146
                   AS path: 2153 32 I
                   > to 134.55.49.1 via xe-1/3/0.5
171.64.0.0/14      *[] 7w1d 16:06:45, MED 10, localpref 4000,
                   from 134.55.200.146
                   AS path: 2153 32 I
                   > to 134.55.49.1 via xe-1/3/0.5

2607:f6d0::/32     *[] 7w1d 16:07:51, MED 10, localpref 4000,
                   from 134.55.200.146
                   AS path: 2153 32 I
                   > to fe80::ea4:2ff:fe5b:3401 via xe-1/3/0.5

If you start preferring these routes from ESnet, we should end up with a
symmetric path from ANL to Stanford and back. That may not solve the
throughput issues, but it will give us a place to start troubleshooting.

Thanks also for the clarification on anlborder-ps. I'll run some more
tests today and send the results.
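
On the ANL side, the change amounts to raising the preference of these ESnet-learned prefixes. A hedged sketch in generic route-map syntax; the names, local-preference value, and bracketed placeholders are illustrative, not ANL's actual configuration:

ip prefix-list STANFORD permit 128.12.0.0/16
ip prefix-list STANFORD permit 171.64.0.0/14
!
route-map PREFER-ESNET permit 10
 match ip address prefix-list STANFORD
 set local-preference 200
!
router bgp <anl-asn>
 neighbor <esnet-peer-address> route-map PREFER-ESNET in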


Stanford requirement, 10:37am 5/31/2013

Michael's info does describe Stanford's AS etc. Do let me know if you need any more.

There is an upcoming GMCA run scheduled for Brian Kobilka's Stanford lab on June 6, which is what prompted this look into the net in the first place. So if possible, it would be great to shift the routing before then.

Thanks much,
Phil

PS- the ANL info came from the GMCA group, so I'll let them know the list is out of order.

Summary for Linda from Michael, 10:45am 5/31/2013

Hi Linda:

Thanks for the offer! I can send you other parts of the email thread to catch you up, but we're trying to troubleshoot an issue between the Stanford campus and ANL. I am not yet sure if the issue is within ANL or not. We seem to have a clean path between the ANL ESnet router and the SLAC and Sunnyvale ESnet routers, but there are routing asymmetries between ANL proper and Stanford that are making it harder to isolate the issue. I brought in Corey because I wanted to understand how the various perfsonar nodes were set up, and I was having some problems using ndt.anl.gov.

What we're seeing is degraded throughput, mainly in the ANL to Stanford direction. Phil has confirmed that this is the Stanford campus and not SLAC, and the campus is connected via CENIC. Currently, ANL reaches Stanford via MREN to Internet2 to CENIC. (Note that this is based on an inference I am making by doing a reverse traceroute from anlborder-ps.it.anl.gov to UC Berkeley, since I still have access to machines there. If anyone at ANL wants to provide some traceroutes to Stanford, e.g. rcf-perfsonar.stanford.edu, that would be great.) Traffic in the Stanford to ANL direction goes via the 100G CENIC-ESnet peering in Sunnyvale and then stays on ESnet until arriving at ANL.

Corey, how is anlborder-ps connected? Does it connect directly to your border router? What kind of router is it?

Here's what I am seeing between the anlborder-ps and ESnet's pS box near SLAC:

$ bwctl -L 1500 -i 2 -t 30 -f m -c anlborder-ps.it.anl.gov
bwctl: Using tool: iperf
bwctl: 47 seconds until test results available

RECEIVER START
bwctl: exec_line: iperf -B 130.202.222.58 -s -f m -m -p 5001 -t 30 -i 2
bwctl: start_tool: 3579008575.131517
------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address 130.202.222.58
TCP window size: 0.08 MByte (default)
------------------------------------------------------------
[ 14] local 130.202.222.58 port 5001 connected with 198.129.254.134 port 5001
[ ID] Interval       Transfer     Bandwidth
[ 14]  0.0- 2.0 sec  11.1 MBytes  46.4 Mbits/sec
[ 14]  2.0- 4.0 sec   190 MBytes   796 Mbits/sec
[ 14]  4.0- 6.0 sec   504 MBytes  2115 Mbits/sec
[ 14]  6.0- 8.0 sec   513 MBytes  2153 Mbits/sec
[ 14]  8.0-10.0 sec   504 MBytes  2115 Mbits/sec
[ 14] 10.0-12.0 sec   512 MBytes  2146 Mbits/sec
[ 14] 12.0-14.0 sec   507 MBytes  2127 Mbits/sec
[ 14] 14.0-16.0 sec   509 MBytes  2135 Mbits/sec
[ 14] 16.0-18.0 sec   510 MBytes  2141 Mbits/sec
[ 14] 18.0-20.0 sec   507 MBytes  2127 Mbits/sec
[ 14] 20.0-22.0 sec   511 MBytes  2143 Mbits/sec
[ 14] 22.0-24.0 sec   507 MBytes  2125 Mbits/sec
[ 14] 24.0-26.0 sec   510 MBytes  2140 Mbits/sec
[ 14] 26.0-28.0 sec   508 MBytes  2132 Mbits/sec
[ 14] 28.0-30.0 sec   509 MBytes  2136 Mbits/sec
[ 14]  0.0-30.1 sec  6833 MBytes  1906 Mbits/sec
[ 14] MSS size 1460 bytes (MTU 1500 bytes, ethernet)
bwctl: stop_exec: 3579008620.204830

RECEIVER END

$ bwctl -L 1500 -i 2 -t 30 -f m -s anlborder-ps.it.anl.gov
bwctl: Using tool: iperf
bwctl: 158 seconds until test results available

RECEIVER START
bwctl: exec_line: /usr/bin/iperf -B 198.129.254.134 -s -f m -m -p 5093 -t 30 -i 2
bwctl: start_tool: 3579008749.906801
------------------------------------------------------------
Server listening on TCP port 5093
Binding to local address 198.129.254.134
TCP window size: 0.08 MByte (default)
------------------------------------------------------------
[ 15] local 198.129.254.134 port 5093 connected with 130.202.222.58 port 5093
[ ID] Interval       Transfer     Bandwidth
[ 15]  0.0- 2.0 sec  2.69 MBytes  11.3 Mbits/sec
[ 15]  2.0- 4.0 sec  4.57 MBytes  19.2 Mbits/sec
[ 15]  4.0- 6.0 sec  25.9 MBytes   109 Mbits/sec
[ 15]  6.0- 8.0 sec  26.7 MBytes   112 Mbits/sec
[ 15]  8.0-10.0 sec  17.8 MBytes  74.5 Mbits/sec
[ 15] 10.0-12.0 sec  60.2 MBytes   253 Mbits/sec
[ 15] 12.0-14.0 sec   197 MBytes   828 Mbits/sec
[ 15] 14.0-16.0 sec  81.4 MBytes   342 Mbits/sec
[ 15] 16.0-18.0 sec   219 MBytes   917 Mbits/sec
[ 15] 18.0-20.0 sec   467 MBytes  1957 Mbits/sec
[ 15] 20.0-22.0 sec   495 MBytes  2078 Mbits/sec
[ 15] 22.0-24.0 sec   495 MBytes  2076 Mbits/sec
[ 15] 24.0-26.0 sec   500 MBytes  2098 Mbits/sec
[ 15] 26.0-28.0 sec   494 MBytes  2072 Mbits/sec
[ 15] 28.0-30.0 sec   494 MBytes  2072 Mbits/sec
[ 15]  0.0-30.0 sec  3583 MBytes  1002 Mbits/sec
[ 15] MSS size 1460 bytes (MTU 1500 bytes, ethernet)
bwctl: stop_exec: 3579008795.945836

RECEIVER END

Traffic in the slac-pt1-->anl direction looks reasonable, but traffic in the opposite direction shows some evidence of a really slow slow-start and maybe even some packet loss. I'd like to see how the pS box is connected, to try to get to the root of the throughput asymmetry.

thanks,
michael
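
One way to separate the suspected loss from TCP slow-start effects is a one-way loss and latency test with owping, which ships with the perfSONAR toolkit (assuming owampd is running on the far end):

$ owping -c 1000 anlborder-ps.it.anl.gov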

ANL reports the routes are symmetric, 11:15am, 5/31/2013

We are now preferring ESnet for this traffic. Corby Schmitz made the changes this morning at 9:28AM CDT. The anlborder-ps host is directly connected to our Brocade MLXe-16, the same router that has the 100G connection to ESnet5. We will be making the flow-control changes to the firewall interfaces on June 5th at 7:15AM CDT.

4/3  Up  Forward  Full  10G  None  No  level0  0024.388d.7300  PerfSonar10G  default-port

SSH@Ugli#sh int e 4/3
10GigabitEthernet4/3 is up, line protocol is up
STP Root Guard is disabled, STP BPDU Guard is disabled
Hardware is 10GigabitEthernet, address is 0024.388d.7300 (bia 0024.388d.7392)
Configured speed 10Gbit, actual 10Gbit, configured duplex fdx, actual fdx
Member of Control VLAN 4095, VLAN 661 (untagged), port is in untagged mode, port state is Forwarding
STP configured to ON, Priority is level0, flow control enabled
Priority force disabled, Drop precedence level 0, Drop precedence force disabled
dhcp-snooping-trust configured to OFF
mirror disabled, monitor disabled
LACP BPDU Forwarding:Disabled
Not member of any active trunks
Not member of any configured trunks
Port name is PerfSonar10G
MTU 9216 bytes, encapsulation ethernet
Cluster L2 protocol forwarding enabled
300 second input rate: 156404471 bits/sec, 12979 packets/sec, 1.58% utilization
300 second output rate: 264419 bits/sec, 512 packets/sec, 0.00% utilization
23972298690 packets input, 30428277232128 bytes, 0 no buffer
Received 8671 broadcasts, 32 multicasts, 23972289987 unicasts
1 input errors, 1 CRC, 0 frame, 0 ignored
0 runts, 0 giants
NP received 23972298697 packets, Sent to TM 23972298535 packets
NP Ingress dropped 162 packets
25242121354 packets output, 35160675952950 bytes, 0 underruns
Transmitted 998151 broadcasts, 10725982 multicasts, 25230397221 unicasts
0 output errors, 0 collisions
NP transmitted 25242121367 packets, Received from TM 25245125413 packets

Thanks,

Corey


Corey Hall
Design & Implementation Manager, Network Communications
Computing and Information Systems
Argonne National Laboratory
9700 South Cass Avenue, Bldg. 240, Rm. 4107
Argonne, IL 60439