Purpose
The strategy being adopted to analyze and store the unprecedented volumes of data gathered by current and future High Energy and Nuclear Physics (HENP) experiments is the coordinated deployment of Grid technologies such as those being developed for the Particle Physics Data Grid and the Grid Physics Network. It is anticipated that these technologies will be deployed at hundreds of institutes, which will be able to search out and analyze information from an interconnected worldwide grid of tens of thousands of computers and storage devices. This in turn will require the ability to sustain, over long periods, the transfer of large amounts of data between collaborating sites with relatively low latency.
The purpose of the IEPM-BW project is to develop and use an infrastructure for making active end-to-end application and network performance measurements on high performance network links, such as those used worldwide by Grid applications and by other academic and research (A&R) applications deployed over high performance networks such as ESnet, Internet2 and other A&R networks in the developed world.
Novel Ideas
- Provides complementary low impact overview measurements and more intense, detailed performance measurements:
  - The low impact measurements provide network performance information (delay, loss and connectivity) for most of the Internet connected world over long time periods (many years).
  - The higher impact measurements are oriented to high performance links (e.g. grid sites, ESnet and Internet2 connected sites) and provide network and application high throughput performance measurements, allowing comparisons, identification of bottlenecks etc.
- Uses both passive and active measurements
- Continuous, robust measurement, analysis and web based reporting, with results and data available worldwide.
- Simple infrastructure enabling rapid deployment, location within an application host, and local site management to avoid security issues.
- Provides simple forecasting for applications and for optimizing the frequency of measurements.
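
The summary does not say which forecasting method is used, so the following is only a minimal sketch of what "simple forecasting" and measurement frequency tuning could look like. It assumes an exponentially weighted moving average (EWMA) as the forecaster and a halve-or-double rule for the measurement interval. The function names, the smoothing factor `alpha`, the `tolerance` threshold and the interval bounds are all hypothetical and are not taken from the IEPM-BW project.

```python
from typing import Iterable, List


def ewma_forecasts(samples: Iterable[float], alpha: float = 0.3) -> List[float]:
    """Return one-step-ahead forecasts for a series of throughput samples.

    forecasts[i] is the prediction for the sample that follows samples[i].
    alpha is an assumed smoothing factor: larger values track recent
    measurements more closely.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    forecasts: List[float] = []
    estimate = None
    for value in samples:
        if estimate is None:
            estimate = value  # seed the forecast with the first sample
        else:
            estimate = alpha * value + (1.0 - alpha) * estimate
        forecasts.append(estimate)
    return forecasts


def next_interval(current_s: float, observed: float, forecast: float,
                  tolerance: float = 0.2,
                  min_s: float = 600.0, max_s: float = 14400.0) -> float:
    """Pick the next measurement interval in seconds (hypothetical policy).

    If the last measurement was close to its forecast, the path looks
    stable and the interval is doubled; otherwise it is halved. The
    result is clamped to the range [min_s, max_s].
    """
    if forecast <= 0:
        return min_s
    relative_error = abs(observed - forecast) / forecast
    if relative_error > tolerance:
        return max(min_s, current_s / 2)
    return min(max_s, current_s * 2)
```

With a policy like this, a stable path is probed less often, which lowers the load that the higher impact measurements place on the network, while a path whose behaviour departs from the forecast is probed more often.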
Tasks
The following are the major tasks:
- Develop/deploy a simple, robust, ssh based active end-to-end application and network measurement and management infrastructure.
- Install/integrate a base set of measurement tools into the infrastructure, make regular measurements and record the results. These tools include: ping, traceroute, iperf, pipechar, bbcp and bbftp.
- Develop data reduction, analysis, reporting, forecasting and archiving tools.
- Compare and validate the various tools and determine the regions of applicability.
- Install new network (e.g. the INCITE tools, pathrate and pathload) and application (e.g. GridFTP) tools into the infrastructure, and use it to evaluate the performance of the tools and their relevancy.
- Evaluate new TCP stacks such as HTCP, High Speed TCP, and FAST and compare with the default stacks.
- Provide access to the data for research, forecasting, validation etc.
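
As an illustration of the "make regular measurements and record the results" task, the sketch below reduces the summary printed by ping to a small record that could be archived. It is not the project's code. It assumes the summary format printed by common Linux and BSD ping implementations (a "packets transmitted" line followed by a "min/avg/max" line), and the `PingResult` type and `parse_ping_summary` function are hypothetical names introduced here.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class PingResult:
    sent: int
    received: int
    loss_pct: float
    rtt_min_ms: Optional[float]
    rtt_avg_ms: Optional[float]
    rtt_max_ms: Optional[float]


# Assumed summary formats, e.g.
#   "5 packets transmitted, 4 received, 20% packet loss, time 4005ms"
#   "rtt min/avg/max/mdev = 10.1/12.5/15.0/1.9 ms"
_COUNTS = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received,"
    r".*?([\d.]+)% packet loss"
)
_RTT = re.compile(r"min/avg/max/\w+ = ([\d.]+)/([\d.]+)/([\d.]+)")


def parse_ping_summary(text: str) -> PingResult:
    """Extract packet counts, loss and RTT statistics from ping output."""
    counts = _COUNTS.search(text)
    if counts is None:
        raise ValueError("no packet summary line found")
    rtt = _RTT.search(text)
    if rtt is not None:
        rtt_min, rtt_avg, rtt_max = (float(x) for x in rtt.groups())
    else:
        # With 100% loss ping prints no RTT line, so leave the fields empty.
        rtt_min = rtt_avg = rtt_max = None
    return PingResult(
        sent=int(counts.group(1)),
        received=int(counts.group(2)),
        loss_pct=float(counts.group(3)),
        rtt_min_ms=rtt_min,
        rtt_avg_ms=rtt_avg,
        rtt_max_ms=rtt_max,
    )


SAMPLE = """--- example.org ping statistics ---
5 packets transmitted, 4 received, 20% packet loss, time 4005ms
rtt min/avg/max/mdev = 10.1/12.5/15.0/1.9 ms
"""

result = parse_ping_summary(SAMPLE)
```

Keeping the parsed fields in a typed record makes the later data reduction, analysis and archiving steps independent of each tool's text output.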
Impact
These measurements will extend the uses to which the PingER measurements are put, and in particular will be critical for:
- Providing planning information to applications, grid and network planners by:
  - Providing, and improving the understanding of, the performance achievable today in network throughput and application (file copy and ftp) throughput.
  - Providing historical information on growth, incremental and sudden changes, and patterns (e.g. diurnal) of change in performance.
  - Providing input on how to improve measurement tools such as iperf.
- Providing troubleshooting information to networks and users by:
  - Indicating when there are incremental or sudden changes, reporting the magnitude of the changes, and providing alerts.
  - Helping pinpoint whether a performance issue is at the network layer or application layer, or in some sub-component such as a disk.
- Giving networkers and application developers a better understanding of how networks and applications work together by:
  - Validating/correlating how network performance relates to delay and loss performance (e.g. bandwidth estimators).
  - Assisting users in selecting the optimum network (e.g. windows, streams, QoS) and application (e.g. compression) configuration options.
  - Identifying the critical bottlenecks, such as disk, CPU speed, operating system, network bandwidth etc., for high throughput application performance.
- Providing a public domain network performance database, together with analyses and navigable reports from active monitoring. This information can be used for further research, for predictions and for application steering.
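
The validation of bandwidth estimators against delay and loss, listed above, can be made concrete with a small example. The summary does not name a specific estimator. The sketch below assumes the widely cited Mathis et al. approximation for steady-state TCP throughput, rate ≈ C × MSS / (RTT × sqrt(p)), with C ≈ sqrt(3/2) for periodic loss. The function name and the example values are illustrative only.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float,
                          c: float = math.sqrt(1.5)) -> float:
    """Estimate steady-state TCP throughput in bits per second.

    Uses the Mathis et al. approximation (an assumption of this sketch,
    not a method named in the project summary):
        rate = C * MSS / (RTT * sqrt(p))
    mss_bytes: TCP maximum segment size in bytes
    rtt_s:     round-trip time in seconds
    loss_rate: packet loss probability, 0 < p <= 1
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))


# Illustrative inputs: 1460-byte segments, 100 ms RTT, 0.01% loss.
estimate_bps = mathis_throughput_bps(1460, 0.100, 0.0001)
```

Comparing an estimate of this kind, computed from ping-style delay and loss data, with the throughput actually achieved by iperf or a file copy on the same path is one way to judge where such an estimator applies.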