Comment: Migrated to Confluence 4.0

This page describes various monitoring tools used for the Fermi Xrootd cluster.

Ganglia

  • Ganglia Xrootd shows various metrics related to memory, I/O and load of the xrootd cluster.
  • The redirectors are not shown in the xrootd Ganglia page but are part of the glastlnx cluster: glastlnx04 and glastlnx05

Disk Usage and Files

HPSS


Server Disk Usage

The disk space of the xrootd data servers is collected twice a day and written to an NFS log directory. From these values a simple table with the latest usage values is created:

disk usage

The data are also loaded into an Oracle database and are accessible through a web application (currently in the test stage):

diskusage web app

Cron job that collects the disk space info

The cron job, which runs from the crontab on each xrootd data server, is:

/opt/xrootd/admin/mon_diskspace.sh

It collects the total and free disk space as well as the inode usage for ufs file systems (zfs has no inode restrictions).
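The script itself is not reproduced on this page. As a rough illustration of the kind of values it gathers, the following Python sketch reads the total space, free space and inode counts for one file system. It is not the content of mon_diskspace.sh; the function name `disk_usage` and the choice of 2**30 bytes per GB are assumptions made for this example.

```python
import os

GB = 1024 ** 3  # assumption: "GB" means 2**30 bytes


def disk_usage(path):
    """Return space (in GB) and inode counts for the file system holding path."""
    st = os.statvfs(path)
    total_gb = st.f_blocks * st.f_frsize / GB
    free_gb = st.f_bavail * st.f_frsize / GB  # space available to non-root users
    pct_used = 100.0 * (1 - free_gb / total_gb) if total_gb else 0.0
    return {
        "total_gb": total_gb,
        "free_gb": free_gb,
        "pct_used": pct_used,
        "inodes_total": st.f_files,  # may be 0 on file systems without fixed inodes
        "inodes_free": st.f_ffree,
    }


if __name__ == "__main__":
    print(disk_usage("/"))
```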

The disk usage is stored in files:

/nfs/farm/g/glast/u15/xrootd/diskspace/df_server_YYYYMM

...

  • /nfs/farm/g/glast/

...

  • u18/xrootd/

...

  • hpss/

...

  • file_listing/

    Contains the listing of files in HPSS. The current format is hl_<topdir><date>, where <topdir> is the directory below which all files are listed. The common /glast prefix is omitted. Each listing entry contains the file name, the size of the file in bytes and the date the file was put into HPSS. The date format is YYYYMMDDHHMMSS.
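A listing entry as described above could be read as sketched below. The page does not state how the three fields are separated, so this example assumes one entry per line with whitespace-separated fields in the order file name, size, date. The function name and the sample path are hypothetical.

```python
from datetime import datetime


def parse_hpss_listing_line(line):
    """Split one listing entry into (file name, size in bytes, HPSS date).

    Assumption: fields are whitespace-separated and the last two fields
    are the size and the YYYYMMDDHHMMSS date.
    """
    name, size, stamp = line.rsplit(None, 2)
    return name, int(size), datetime.strptime(stamp, "%Y%m%d%H%M%S")


if __name__ == "__main__":
    # Made-up entry, for illustration only.
    print(parse_hpss_listing_line("Data/run1/file.root 123456 20080712123258"))
```

Splitting from the right keeps a file name intact even if it contains spaces.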

Misc

Each line in these files shows the disk usage for a particular date. The format is:

DF  <date>  <server>  <totalSpace>  <freeSpace>  <%Used>  [<inodesFree>  <%inodesFree>]

The format of the date is YYYYMMDDTHHMMSS, e.g. 20080712T123258. The totalSpace and freeSpace values are in GB. The inode info is only shown for servers that use the ufs file system (all sulkies), not for servers that employ zfs.
To calculate the free and total disk space of the whole xrootd cluster, the values taken around the same time have to be summed over all data servers.
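The summation can be sketched in Python as follows. The sketch assumes that the fields are whitespace-separated, that the two inode fields are simply absent for zfs servers, and that the percentage fields may carry a trailing % sign. The names `DfRecord`, `parse_df_line` and `cluster_totals`, the server name wain022 and all numbers in the sample lines are made up for illustration.

```python
from collections import namedtuple
from datetime import datetime

DfRecord = namedtuple(
    "DfRecord",
    "date server total_gb free_gb pct_used inodes_free pct_inodes_free",
)


def parse_df_line(line):
    """Parse one 'DF ...' line; the inode fields are None when not present."""
    fields = line.split()
    if len(fields) not in (6, 8) or fields[0] != "DF":
        raise ValueError("not a DF line: %r" % line)
    date = datetime.strptime(fields[1], "%Y%m%dT%H%M%S")
    inodes_free = pct_inodes_free = None
    if len(fields) == 8:  # ufs servers also report inode usage
        inodes_free = int(fields[6])
        pct_inodes_free = float(fields[7].rstrip("%"))
    return DfRecord(date, fields[2], float(fields[3]), float(fields[4]),
                    float(fields[5].rstrip("%")), inodes_free, pct_inodes_free)


def cluster_totals(records):
    """Sum total and free space (GB) over records from one collection run."""
    total = sum(r.total_gb for r in records)
    free = sum(r.free_gb for r in records)
    return total, free


if __name__ == "__main__":
    sample = [
        "DF 20080712T123258 wain021 1000 250 75 5000 90",  # made-up values
        "DF 20080712T123301 wain022 2000 500 75",          # made-up values
    ]
    print(cluster_totals([parse_df_line(s) for s in sample]))
```

The caller is expected to pass only records from the same collection run, one per server; the sketch does not group lines by timestamp itself.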

Data in Oracle

The disk space usage values are stored in Oracle. Each record consists of a key-value pair together with the date. The following key names are used:

Key                       Description
xrootd_freeGB_<server>    free disk space in GB for <server>, e.g. xrootd_freeGB_wain021
xrootd_diskGB_<server>    disk size in GB for <server>
xrootd_usedGB_<server>    used disk space in GB for <server> (xrootd_diskGB_<server> - xrootd_freeGB_<server>)
xrootd_all_freeGB         free disk space in GB summed over all xrootd data servers
xrootd_all_diskGB         disk size in GB summed over all xrootd data servers
xrootd_all_usedGB         used disk space in GB summed over all xrootd data servers (xrootd_all_diskGB - xrootd_all_freeGB)
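How the key names relate to the per-server values can be illustrated with a short Python sketch. It only builds the key-value pairs in memory; how they are actually loaded into Oracle is not described on this page and is not shown. The function name `build_key_values`, the server name wain022 and the numbers are assumptions made for the example.

```python
def build_key_values(per_server):
    """Map {server: (disk_gb, free_gb)} to the key names listed above."""
    values = {}
    all_disk = all_free = 0.0
    for server, (disk_gb, free_gb) in sorted(per_server.items()):
        values["xrootd_diskGB_" + server] = disk_gb
        values["xrootd_freeGB_" + server] = free_gb
        values["xrootd_usedGB_" + server] = disk_gb - free_gb
        all_disk += disk_gb
        all_free += free_gb
    values["xrootd_all_diskGB"] = all_disk
    values["xrootd_all_freeGB"] = all_free
    values["xrootd_all_usedGB"] = all_disk - all_free
    return values


if __name__ == "__main__":
    # Made-up sizes in GB, for illustration only.
    kv = build_key_values({"wain021": (1000.0, 250.0), "wain022": (2000.0, 500.0)})
    for key in sorted(kv):
        print(key, kv[key])
```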

...