Log files

Many of the scripts produce logfiles. These hold the scripts' standard output and standard error, as redirected by the crontab entries. At SLAC they are saved on the host iepm-bw.slac.stanford.edu in the directory /u1/mysql/logs/. As of May 29th 2007 the crontab reads:

#kill all servers to clean up any hung ones and to rotate the logs
5 0 * * * /afs/slac/package/netmon/bandwidth-tests/v3src/kill-all-servers \
  >> /u1/mysql/logs/kill-all-servers 2>&1
# restart the servers
10 0 * * * /afs/slac/package/netmon/bandwidth-tests/v3src/restart-all-servers \
  >> /u1/mysql/logs/restart-all-servers.today 2>&1
# copy and date the logs for the day
7 0 * * * /afs/slac/package/netmon/bandwidth-tests/v3src/copylogs /u1/mysql/logs/ \
  > /tmp/iepmbw-logcopy 2>&1
#
# back up the data base
15 0 * * * \
  /afs/slac/package/netmon/bandwidth-tests/v3src/backup-iepm-mysql-database \
  /nfs/slac/g/net/iepm-bw/iepm-bw.slac.stanford.edu/mysql-backup \
  >> /u1/mysql/logs/backup-iepm-mysql-database.today 2>&1
#
# rotate and date the backups
0 3 * * * \
  /afs/slac/package/netmon/bandwidth-tests/v3src/copylogs /nfs/slac/g/net/iepm-bw/iepm-bw.slac.stanford.edu/mysql-backup \
  >> /u1/mysql/logs/mysql-backup.today 2>&1
#
# run keepalive check
5,15,25,35,45,55 * * * * /afs/slac/package/netmon/bandwidth-tests/v3src/keep-em-alive \
  >> /u1/mysql/logs/keep-em-alive.today 2>&1
#
# run keep server alive check
1,11,21,31,41,51 * * * * /afs/slac/package/netmon/bandwidth-tests/v3src/keep-servers-alive \
  >> /u1/mysql/logs/keep-servers-alive.today 2>&1
#
# cleanup hung clients
3,13,23,33,43,53 * * * * /afs/slac/package/netmon/bandwidth-tests/v3src/bw-cleanup \
  >> /u1/mysql/logs/bw-cleanup.today 2>&1
#
# run the analyses
#23 1,3,5,7,9,11,13,15,17,19,21,23 * * * \
#  /afs/slac/package/netmon/bandwidth-tests/v3src/post-test-processing-script \
#  -g 1 \
#  >> /u1/mysql/logs/post-test-processing-script.today 2>&1
#23 1,5,9,13,17,21 * * * /afs/slac/package/netmon/bandwidth-tests/v3src/post-test-processing-script \
#  -g 1 \
#  >> /u1/mysql/logs/post-test-processing-script.today 2>&1
23 1,6,11,16,21 * * * /afs/slac/package/netmon/bandwidth-tests/v3src/post-test-processing-script \
  -g 1 \
  >> /u1/mysql/logs/post-test-processing-script.today 2>&1
#
# run the overnight analysis
15 3 * * * /afs/slac/package/netmon/bandwidth-tests/v3src/overnight-processing-script \
  > /u1/mysql/logs/overnight-processing-script.today 2>&1
#
# run the trace analysis
10 * * * * /afs/slac/package/netmon/bandwidth-tests/v3src/traceanal/traceanal -d today -i 0 \
  >> /u1/mysql/logs/traceanal.today 2>&1
#
#Run the new bandwidth change analysis code
5 2,6,10,14,18,22 * * * /afs/slac/package/netmon/bandwidth-tests/v3src/alerts/analyze-for-alerts  \
  -t "iperf,pathchirp,pathload,miperf,tlaytcp" -p "iperf,pathchirp,thrumin,miperf,tlaytcp" \
  >> /u1/mysql/logs/analyze-for-alerts.today 2>&1
#
# run historical alerts web page
15 2,6,10,14,18,22 * * * /afs/slac/package/netmon/bandwidth-tests/v3src/report-alerts \
  >> /u1/mysql/logs/report-alerts.today 2>&1

Two logfiles are of particular interest and should be reviewed if any problems are found in the HTML reports:

/u1/mysql/logs/post-test-processing-script.today
/u1/mysql/logs/bw-cleanup.today #Tells what processes needed killing

Entries in these logfiles are timestamped and identify the script that wrote them. A message is preceded by a question mark (?) for a warning and by an exclamation mark (!) for a serious error, so that the more important events are easy to find with grep.
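
As an illustration of the grep search mentioned above, here is a minimal sketch. The function name scan_log is hypothetical. It assumes only that the marker character appears literally somewhere in the line, since the exact layout of a log entry is not specified here. Any other line that happens to contain the character (for example a URL with a query string) will also match.

```shell
#!/bin/sh
# Hypothetical helper, not part of the IEPM-BW scripts.
# Prints the lines of a logfile that contain a given marker character,
# each prefixed with its line number.
#   $1 = logfile to search
#   $2 = marker to look for: "!" (serious error, the default) or "?" (warning)
scan_log() {
    logfile=$1
    marker=${2:-!}
    # -F treats the marker as a fixed string, so "?" needs no escaping.
    # -n prints the line number in front of each matching line.
    grep -F -n -- "$marker" "$logfile"
}
```

It might be used as `scan_log /u1/mysql/logs/post-test-processing-script.today` to list serious errors, or as `scan_log /u1/mysql/logs/bw-cleanup.today '?'` to list warnings.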

Each night the copylogs script (see the crontab above) renames each .today file to .yesterday, renames the previous .yesterday file with yesterday's date, and removes files older than 7 days. There are typically about 40 .today logfiles.
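
The nightly rotation described above can be sketched roughly as follows. This is not the actual script run from the crontab. The function name rotate_logs is hypothetical, and the format of the date suffix is an assumption, so the suffix is passed in as an argument rather than computed.

```shell
#!/bin/sh
# Sketch only: hypothetical re-creation of the rotation steps, not the real script.
#   $1 = log directory
#   $2 = date label to append to the old .yesterday files
#        (e.g. 2007-05-28; the real naming format is an assumption)
rotate_logs() {
    dir=$1
    stamp=$2

    # Step 1: give each existing .yesterday file a dated name.
    for f in "$dir"/*.yesterday; do
        [ -e "$f" ] || continue          # glob matched nothing
        mv "$f" "${f%.yesterday}.$stamp"
    done

    # Step 2: rename each .today file to .yesterday.
    for f in "$dir"/*.today; do
        [ -e "$f" ] || continue
        mv "$f" "${f%.today}.yesterday"
    done

    # Step 3: remove regular files last modified more than 7 days ago.
    # Note that find also descends into any subdirectories of $dir.
    find "$dir" -type f -mtime +7 -exec rm -f {} +
}
```

The dated rename runs first so that the new .yesterday files do not overwrite the previous ones. A call such as `rotate_logs /u1/mysql/logs 2007-05-28` would apply the three steps to the log directory.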
