Purpose
The strategy being adopted to analyze and store the unprecedented volumes of data gathered by current and future High Energy and Nuclear Physics (HENP) experiments is the coordinated deployment of Grid technologies such as those being developed for the Particle Physics Data Grid and the Grid Physics Network. These technologies are expected to be deployed at hundreds of institutes, which will be able to search out and analyze information from an interconnected worldwide grid of tens of thousands of computers and storage devices. This in turn will require the ability to sustain, over long periods, the transfer of large amounts of data between collaborating sites with relatively low latency.
The purpose of the IEPM-BW project is to develop and deploy an infrastructure for making active end-to-end network data transfers and network performance measurements over high performance network links, such as those used worldwide by Grid and other academic and research (A&R) applications on ESnet, Internet2 and other A&R networks in the developed world.
Novel Ideas
- Provides complementary low impact and more intense (although short) detailed performance measurements:
- The low impact measurements cover most of the Internet connected world, providing delay, loss and connectivity information over long (many years) time periods.
- The higher impact measurements are oriented to high performance links (e.g. grid sites, ESnet and Internet2 connected sites) and provide network throughput measurements, allowing comparisons, identification of bottlenecks etc.
- Uses both passive and active measurements
- Continuous, robust measurement, analysis and web based reporting, with results and data available worldwide.
- Simple infrastructure enabling rapid deployment, locating within an application host, and local site management to avoid security issues
- Provides simple forecasting for applications and for optimizing the frequency of measurements.
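The forecasting item above does not say which method is used. As a minimal sketch, assuming an exponentially weighted moving average (EWMA) is an acceptable stand-in for "simple forecasting", a one-step-ahead throughput forecast could look like the following. The function name `ewma_forecast`, the smoothing factor `alpha` and the sample values are all hypothetical and not taken from IEPM-BW.

```python
from typing import Iterable, List


def ewma_forecast(samples: Iterable[float], alpha: float = 0.3) -> List[float]:
    """Return one forecast per sample using an EWMA.

    forecasts[i] is the prediction for the next measurement, made after
    seeing samples[0..i]. A larger alpha reacts faster to recent values.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    forecasts: List[float] = []
    level = None
    for x in samples:
        # The first sample seeds the level; later samples are blended in.
        level = x if level is None else alpha * x + (1.0 - alpha) * level
        forecasts.append(level)
    return forecasts


if __name__ == "__main__":
    # Hypothetical throughput measurements in Mbit/s.
    throughput_mbps = [100.0, 200.0, 200.0]
    print(ewma_forecast(throughput_mbps, alpha=0.5))  # [100.0, 150.0, 175.0]
```

One possible use of such a forecast, again as an assumption rather than the project's method, is to measure less often on paths where the forecast error stays small.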
Tasks
The following are the major tasks:
- Develop/deploy a simple, robust network measurement and management infrastructure that is administered over ssh.
- Install/integrate a base set of measurement tools into the infrastructure, make regular measurements, and record the results. These tools currently include: ping, traceroute, iperf, thrulay, OWAMP, pathchirp, and pathload.
- Compare and validate the various tools and determine the regions of applicability.
- Develop data reduction, analysis, reporting, forecasting and archiving tools.
- Facilitate the evaluation of new TCP stacks such as HTCP, High Speed TCP, and FAST, and their comparison with the default stacks.
- Provide access to the data for research, forecasting, validation, and further analysis.
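The "make regular measurements and record the results" task above can be illustrated with a small parser that turns the summary printed by ping into a record suitable for archiving. This is only a sketch: the sample text follows the style of the Linux iputils ping summary, the host name and numbers are made up, and the function and field names are hypothetical rather than the IEPM-BW data format.

```python
import re
from typing import Dict

# Illustrative text in the style of the Linux iputils ping summary.
# The host name and all numbers are invented for this example.
SAMPLE = """\
--- remote.example.org ping statistics ---
10 packets transmitted, 9 received, 10% packet loss, time 9012ms
rtt min/avg/max/mdev = 150.1/152.4/160.9/3.2 ms
"""

COUNT_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")
RTT_RE = re.compile(r"min/avg/max/\w+ = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")


def parse_ping_summary(text: str) -> Dict[str, float]:
    """Extract packet counts, loss and RTT statistics from a ping summary."""
    counts = COUNT_RE.search(text)
    if counts is None:
        raise ValueError("no packet count line found")
    sent, received = int(counts.group(1)), int(counts.group(2))
    record: Dict[str, float] = {
        "sent": sent,
        "received": received,
        "loss_pct": 100.0 * (sent - received) / sent if sent else 0.0,
    }
    rtt = RTT_RE.search(text)
    if rtt is not None:  # the RTT line is absent when every probe is lost
        record["rtt_min_ms"] = float(rtt.group(1))
        record["rtt_avg_ms"] = float(rtt.group(2))
        record["rtt_max_ms"] = float(rtt.group(3))
    return record


if __name__ == "__main__":
    print(parse_ping_summary(SAMPLE))
```

The parser works on captured text rather than invoking ping itself, so the same function could be applied to output collected on a remote host.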
Impact
These measurements will supplement the PingER measurements and extend their uses. In particular they will be critical for:
- Providing planning information to applications, grid and network planners.
- Providing troubleshooting information to networks and users by:
- Indicating when there are incremental or sudden changes and the magnitude of the changes, and providing alerts.
- Helping pin-point whether a performance issue is at the network layer or application layer, or at some sub-component such as a disk.
- Providing networkers and application developers with a better understanding of how networks and applications work together by providing:
- Validation/correlation of how network performance relates to delay and loss performance (e.g. bandwidth estimators).
- Assisting users in selecting the optimum network (e.g. windows, streams, QoS) and application (e.g. compression) configuration options.
- Identifying the critical bottlenecks, such as CPU speed, operating system, network bandwidth etc., for high throughput application performance.
- Providing a public domain network performance database, together with analyses and navigable reports from active monitoring. This information can be used for further research, for predictions and for application steering.
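The troubleshooting items in the list above mention indicating sudden changes, reporting their magnitude and raising alerts. The source does not describe the detection method, so the following is only a sketch under a simple assumption: compare the mean of a window of samples before each point with the mean of the window after it, and report the point with the largest relative change if it exceeds a threshold. The names `detect_step`, `window` and `threshold` and the sample data are hypothetical.

```python
from statistics import mean
from typing import List, Optional, Tuple


def detect_step(samples: List[float], window: int = 5,
                threshold: float = 0.3) -> Optional[Tuple[int, float]]:
    """Return (index, relative_change) of the largest step, or None.

    For each index i, the mean of the `window` samples before i is compared
    with the mean of the `window` samples starting at i. Only changes whose
    absolute relative size is at least `threshold` are considered.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    best: Optional[Tuple[int, float]] = None
    for i in range(window, len(samples) - window + 1):
        before = mean(samples[i - window:i])
        after = mean(samples[i:i + window])
        if before == 0:
            continue  # relative change is undefined for a zero baseline
        change = (after - before) / before
        if abs(change) >= threshold and (best is None or abs(change) > abs(best[1])):
            best = (i, change)
    return best


if __name__ == "__main__":
    # Hypothetical throughput series (Mbit/s) that halves at index 6.
    series = [100.0] * 6 + [50.0] * 6
    print(detect_step(series, window=3))  # (6, -0.5)
```

A returned change of -0.5 corresponds to a 50% drop. An alerting layer could act on the returned tuple, although how IEPM-BW itself generates alerts is not stated here.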
Results
The data and analyzed results are made available publicly via the web in graphical, tabular and downloadable form. Results include:
- Time series data with various aggregation time scales, and scatter plots between metrics.
- Traceroutes for the paths.
- Congestion windows and throughput from SLAC to different nodes.
- Utilization of SLAC Internet connections.
- Forecasting.
- Time Series Behavior Patterns (Diurnal & Step)
- Comparisons between Monitors (also contains windows & streams history)
- IEPM-BW Infrastructure information
- IEPM Deployment
- Methodology, Experiences in developing IEPM-BW, Program Logic Manual.
- Tabular information on remote host configurations and BW-Tests' probe parameters.
- Requirements for monitoring and remote hosts, setting up accounts on remote hosts, porting the monitoring code.
- Data format.
- Using Kerberized SSH to access FNAL.
- Iperf QUICK mode, ongoing work to reduce the time needed by Iperf for bandwidth measurements.
- Internet Measurement Tool Evaluations/Comparisons
- Correlation of Web100, active and passive throughputs.
- Passive vs Active measurements
- Comparing the Available Bandwidth Estimator (ABwE) packet train estimator results with Iperf (2002)
- Internet Active End-to-end Measurement Infrastructure Comparisons
- Comparison of Some Internet Active E2E Measurement Infrastructures, Feb. 2004
- Comparison of some Internet Active End-to-end Performance Measurement Projects, Jul. 1999
- Comparison of PingER and Surveyor
- Comparison of Surveyor and RIPE