Design

...


DB

[Gliffy diagram: DB Schema 2 (page IEPM:PerfSONAR_PS File Transfer MA)]
[Gliffy diagram: FilesetFilesAndChunks2 (page IEPM:PerfSONAR_PS File Transfer MA)]

Xml Schema

file-transfer.rnc

http://confluence.slac.stanford.edu/display/IEPM/FTMA+Design

Source:

Pre-Req:

You might need to install the following Perl modules if they are not already installed. The easiest way to do this is with cpan, e.g. "cpan module_name".

Perl:

  • Module::Load
  • HTTP::Daemon
  • XML::SAX
  • Config::General
  • aliased
  • Readonly
  • Term::ReadKey
  • DBI::DBD
  • DBD::mysql
  • DBD::SQLite
  • Class::Accessor
  • Class::Fields
  • Params::Validate
  • Statistics::Descriptive
  • Data::UUID
  • IO::Interface
  • DateTime
  • Error
  • Date::Manip (this may require upgrading Perl to 5.10 or above)
  • Log::Dispatch::FileRotate
  • Log::Log4perl

Additional modules may be required. Any that are missing will show up when you run the startup script of the service (see the section "How to run the Service"):

  • XML::LibXML

Installation:

In principle, the service can work with any file transfer protocol. For testing, and to limit the number of possibilities, we use GridFTP as the primary file transfer tool.

...

  • A number of online resources can help an administrator install GridFTP. Some of them are:
  • A bug in GridFTP causes intervals to be calculated incorrectly unless you use a very recent version of GridFTP or the Subversion trunk version of NetLogger.
  • There is generally no restriction on the version of GridFTP to be used. However, using the latest version is advised, to minimize security risks.

NetLogger Installation

NetLogger forms the basis of data collection in this service. A detailed description of installing NetLogger can be found at: http://acs.lbl.gov/NetLogger-releases/doc/trunk/manual.html#python_install
The Python install section, current as of Jan 26 2010, is included here for quick reference.

Install Python Version

Prerequisites

The following Python modules may be needed by the NetLogger pipeline to interact with the database. To install these modules, either use a package manager such as Debian's APT, the RedHat/etc. yum, FreeBSD ports, etc., use Python's easy_install command from setuptools or download and install from source. The easy_install command and download URL are given below.

...

MySQLdb for MySQL

...

easy_install MySQL-python

web site: http://sourceforge.net/projects/mysql-python

...

psycopg2 or pgdb for PostgreSQL

...

easy_install psycopg2

web site: http://www.initd.org/pub/software/psycopg/

Install

Below are instructions for installing the Python instrumentation API and tools.

...

Install from source

...

  • Unpack sources
    • tar xzvf netlogger-python-VERSION.tar.gz
    • cd netlogger-python-VERSION
  • Run Python's standard install sequence
    • python setup.py build
    • python setup.py install
  • Alternately, to install under "$NETLOGGER_HOME"
    • export NETLOGGER_HOME=/my/path # or use setenv on csh
    • python setup.py install --home=$NETLOGGER_HOME
    • export PYTHONPATH=$NETLOGGER_HOME/lib/python
    • export PATH=$PATH:$NETLOGGER_HOME/bin
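
After installing under "$NETLOGGER_HOME", a quick check that both variables were exported correctly can save debugging later. This is a minimal sketch under the layout shown above; the function names path_contains and check_netlogger_env are hypothetical and not part of NetLogger.

```shell
#!/bin/sh
# Succeed when directory $2 is one of the colon-separated entries of list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report which of the two exports from the steps above is missing, if any.
check_netlogger_env() {
    home=$1
    path_contains "${PYTHONPATH:-}" "$home/lib/python" || echo "PYTHONPATH lacks $home/lib/python"
    path_contains "${PATH:-}" "$home/bin" || echo "PATH lacks $home/bin"
}

# Example: prints nothing when both variables are set as described above.
check_netlogger_env "${NETLOGGER_HOME:-/my/path}"
```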

Patching NetLogger to handle GridFTP logs (buggy/non-buggy):

To patch your NetLogger distribution, drop this file into python/netlogger/parsers/modules/ under the download directory, then reinstall (e.g. python setup.py install).

IEPM/Installing+NetLogger

PerfSONAR_PS FTMA

Download the source code as a tarball

  • Code Block
    
    tar -xvf FileTransfer_MA.1b.tar
    
  • mkdir /var/log/perfsonar
  • Run the SQL procedure (MySQL) on the NetLogger database to produce a secondary database. The FTMA service interacts with this secondary database.

...

  • Check out the latest version of FTMA:
    • With username and password:
      Code Block
      
      svn checkout https://svn.internet2.edu/svn/perfSONAR-PS/branches/FileTransfer/
      
    • Or anonymously:
      Code Block
      
      svn checkout https://anonsvn.internet2.edu/svn/perfSONAR-PS/branches/FileTransfer/

Configuration

  • Create the log directory (the location given in daemon_logger.conf):
    Code Block
    
    mkdir /var/log/perfsonar
  • Create a MySQL database for the FTMA service to interact with. Note that -p takes the password with no space in between; with a space, mysql prompts for the password and treats the next word as a database name:
    Code Block
    
    mysql -u "$USER" -p"$PASS" -e "create database ft_ma"
  • Run the SQL procedure (MySQL) on the NetLogger database to produce a secondary database.

This secondary database is the one the FTMA service interacts with. The procedure keeps checkpoints, so it is meant to be run periodically. Ideally it is run every hour or every day, so that new NetLogger data becomes available to users of the FTMA service.
The basic idea behind this procedure is to convert GridFTP logs from the schema in which NetLogger stores them to the schema defined at the start of this page. Two procedures do this. The master procedure, make_FTDB, handles the checkpoints, so that each run starts from the point where the last execution stopped. It also calls the conversion procedure, build_FTDB, with the eventID of the tuple to be converted. build_FTDB picks up all the relevant attributes from the attr table and then inserts the values into the EndPoints, Metadata, MetaEvent and Data tables. The procedure also creates the tables if they do not exist.
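
The checkpoint idea can be illustrated outside the database. The sketch below is not the actual make_FTDB or build_FTDB code: it is a shell stand-in that assumes event IDs are integers passed in ascending order, and it keeps the checkpoint in a hypothetical file (CHECKPOINT_FILE) rather than in MySQL.

```shell
#!/bin/sh
# Hypothetical checkpoint location; the real make_FTDB keeps its checkpoint
# inside the database, not in a file.
CHECKPOINT_FILE=${CHECKPOINT_FILE:-./ftdb.checkpoint}

# Stand-in for build_FTDB: it only reports the event ID it was given.
convert_event() {
    printf 'converted event %s\n' "$1"
}

# Stand-in for make_FTDB: convert every event ID above the checkpoint, then
# record the highest ID handled so that the next run resumes after it.
# Assumes the IDs are passed in ascending order.
run_conversion() {
    last=0
    if [ -f "$CHECKPOINT_FILE" ]; then
        last=$(cat "$CHECKPOINT_FILE")
    fi
    for id in "$@"; do
        if [ "$id" -gt "$last" ]; then
            convert_event "$id"
            last=$id
        fi
    done
    printf '%s\n' "$last" > "$CHECKPOINT_FILE"
}
```

Calling run_conversion 1 2 3 and later run_conversion 1 2 3 4 5 converts events 1 to 3 on the first run and only events 4 and 5 on the second, which is the behaviour the checkpoints are meant to provide.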

Settings for bin/daemon.conf file

Change "db_name", "db_username" and "db_password" in the following file:

Code Block (bin/daemon.conf)

max_worker_lifetime      360
max_worker_processes     30
disable_echo     0
ls_registration_interval     60
ls_instance     http://localhost:9995/perfSONAR_PS/services/hLS

root_hints_url    http://www.perfsonar.net/gls.root.hints

<port 9000>
	<endpoint /perfSONAR_PS/services/FT/MA>
		service_type     MA
		module    perfSONAR_PS::Services::MA::FT
		<ftma>
			service_description    FT MA
			service_accesspoint     http://localhost:9000/perfSONAR_PS/services/FT/MA
			enable_registration     0
			service_name    perfSONAR_PS FT MA
			ls_registration_interval     60
			service_timeout     360
			query_size_limit     100
			db_host     localhost
			db_username     root
			db_name     ft_ma
			db_password   ****
			db_type     mysql
		</ftma>
	</endpoint>
</port>
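
The three database settings can be edited by hand, or scripted. The following is a minimal sketch, assuming the "key, whitespace, value" line layout shown above; set_conf_value is a hypothetical helper, not part of the FTMA package, and the values in the usage comments are placeholders.

```shell
#!/bin/sh
# set_conf_value FILE KEY VALUE
# Rewrite the value on every line of the form "<indent>KEY<whitespace>VALUE".
# Assumption: KEY and VALUE contain no "|", "&" or "\" characters, because
# they are substituted directly into the sed expression.
set_conf_value() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if sed "s|^\([[:space:]]*${key}[[:space:]]\{1,\}\).*|\1${value}|" "$file" > "$tmp"; then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Usage (placeholder values):
#   set_conf_value bin/daemon.conf db_name ft_ma
#   set_conf_value bin/daemon.conf db_username ftma_user
#   set_conf_value bin/daemon.conf db_password secret
```

Because the key must be followed by whitespace, setting db_name does not touch any other line whose key merely starts with db_name.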

How to run the Service:

  • cd bin
  • sh FtpMaExecute.sh --skip-input (or sh FtpMaExecute.sh --help for usage). The script:
    • Attempts to install any missing Perl modules.
    • Stops any previously running instance of the service.
    • Creates a backup of any existing log file.
    • Copies the log file, every time it executes, from the main log directory to an hourly folder in the same directory. The folder is named Ftp_MM-DD-YY:HH.
    • Starts a new instance of the service by running daemon.pl.
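
The hourly backup step can be pictured as follows. This is only a sketch of the naming scheme described above, not the contents of FtpMaExecute.sh; the function names and the log file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Name of the hourly folder for the current time, following Ftp_MM-DD-YY:HH.
hourly_folder_name() {
    date '+Ftp_%m-%d-%y:%H'
}

# backup_log LOG_DIR LOG_FILE
# Copy LOG_DIR/LOG_FILE into LOG_DIR/<hourly folder>/, creating the folder
# if it does not exist yet.
backup_log() {
    log_dir=$1
    log_file=$2
    dest="$log_dir/$(hourly_folder_name)"
    mkdir -p "$dest" && cp "$log_dir/$log_file" "$dest/"
}

# Usage (hypothetical log file name):
#   backup_log /var/log/perfsonar ft_ma.log
```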


How to run the Service:

Before running the service, the database must be populated. So far we have only created the database and have not copied any tables or contents into it. To do this we need a file containing the SQL that creates and fills the tables. The file should be part of the FTMA package retrieved via svn checkout and should be located in the "FileTransfer/perfSONAR_PS-FileTransfer/contrib" directory. The file can be used in the following manner (you might need sudo):

If MySQL is not running, start it:

Code Block

/etc/init.d/mysqld start

Load the mysql dump into a database:

Code Block

mysql -D ft_ma < mysql_backup

Once done, start the service as follows:

Code Block

cd contrib
sh FT_MA_Startup.sh --skip-input    # or pass --help instead for usage

Client Application:

The service includes a basic client tool for preliminary testing. This tool is located in the bin directory.

  • Code Block
    
    perl FTP_client.pl

This simple execution fetches all the metadata from the service and gives the user a final output showing all the metadata keys mapped to source and destination IP addresses.

  • Code Block
    
    perl FTP_client.pl --help
    
    -d                Switch to debug level, one of 0, 1 or 2
    --debug           Same as -d
    --url             URL of the MA service (FT)
                      default is localhost
    --data            Output data as well
    --src             source IP (string)
    --dst             destination IP (string)
    --SrcPath         metadata param: source file path (string)
    --DestPath        metadata param: destination file path (string)
    --stripes         metadata param: number of stripes
    --buffer          metadata param: buffer size
    --block           metadata param: block size
    --streams         metadata param: number of streams
    --program         metadata param: program used for file transfer
    --user            metadata param: user who requested the file transfer
    --initEpochTime   initial time limit in epoch (integer)
    --finalEpochTime  final time limit in epoch (integer)
    --initUtcTime     initial time limit in UTC (string)
    --finalUtcTime    final time limit in UTC (string)
    --startTuple      start point of results to return
    --tupleLimit      number of tuples to return
    -h                Print this help
    --help            Same as -h
    
  • Code Block
    
    perl FTP_client.pl -d --data --DestPath=/ --stripes=1 --src=192.168.117.128 --dst=131.225.107.12 --initEpochTime=1220000000 --finalEpochTime=1225408002

...

    • This execution fetches the metadata whose destination path, stripes, source IP address and destination IP address match the parameters passed.
    • The --data argument causes the returned metadata to then be used to fetch data whose start time and end time fall within the given limits.
    • The -d argument outputs the debug data as well.
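
The epoch values passed to --initEpochTime and --finalEpochTime are easier to pick with a small conversion helper. The sketch below assumes GNU date; the function names are hypothetical, and the string format accepted by --initUtcTime and --finalUtcTime is not specified on this page, so the "YYYY-MM-DD HH:MM:SS" layout here is only for illustration.

```shell
#!/bin/sh
# Assumes GNU date (the -d option is not portable to BSD/macOS date).

# Convert an epoch timestamp to a "YYYY-MM-DD HH:MM:SS" string in UTC.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# Convert a "YYYY-MM-DD HH:MM:SS" UTC string to an epoch timestamp.
utc_to_epoch() {
    date -u -d "$1" '+%s'
}

# The limits used in the example above:
epoch_to_utc 1220000000    # 2008-08-29 08:53:20
epoch_to_utc 1225408002    # 2008-10-30 23:06:42
```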

  • Code Block
    
    FT_client.pl -d 2 --data --stripes=1 --user=dang --startTuple=400 --tupleLimit=20
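
Large result sets can be fetched in pages by repeating the call with an increasing --startTuple. The sketch below only prints the command lines rather than running them. It assumes that --startTuple is an offset into the result set, so that consecutive pages start --tupleLimit tuples apart; page_commands is a hypothetical helper.

```shell
#!/bin/sh
# page_commands START LIMIT PAGES
# Print one client command line per page, advancing --startTuple by LIMIT.
# Assumption: --startTuple is an offset into the result set, so consecutive
# pages start LIMIT tuples apart.
page_commands() {
    start=$1
    limit=$2
    pages=$3
    i=0
    while [ "$i" -lt "$pages" ]; do
        printf 'perl FT_client.pl --data --startTuple=%d --tupleLimit=%d\n' \
            "$((start + i * limit))" "$limit"
        i=$((i + 1))
    done
}

# Three pages of 20 tuples, starting at tuple 400 (offsets 400, 420, 440).
page_commands 400 20 3
```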