As reported by email from the PingER cron job, we ran out of space on /nfs/slac/g/net/pinger/ (example). The df output below confirms this; however, there was space available in pingerdata.unite, in the parent directory /nfs/slac/g/net/.
207cottrell@pinger:~$df /nfs/slac/g/net/pinger
Filesystem               1K-blocks      Used Available Use% Mounted on
netfs03:/u2/g.net.pinger 845881344 837422080         0 100% /nfs/slac/g/net/pinger
206cottrell@pinger:~$df /nfs/slac/g/net/
Filesystem                      1K-blocks    Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00  17001308 5586492  10551380  35% /
~$ls -l /nfs/slac/g/net/
total 1
drwxr-xr-x  2 root root    0 Dec  2 10:40 iepm-bw/
drwxrwsr-x 28 1049 iepm 1024 Jan 23  2014 pinger/
drwxr-xr-x  2 root root    0 Dec  2 10:40 pingerdata.unite/
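A check along the following lines could warn before the partition fills completely. It is only a sketch: the function names and the 95% default threshold are assumptions made for illustration and are not part of the PingER scripts. It relies on the POSIX `df -P` output format, in which the fifth column of the second line is the percentage used.

```shell
# Hypothetical helper: read "df -P" output on stdin and print the
# percentage used by the first filesystem listed, as a bare integer.
use_percent_from_df() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: return 0 and print a warning when the filesystem
# holding PATH is at or above LIMIT percent full (default 95, an assumption).
# Usage: space_is_low PATH [LIMIT]
space_is_low() {
    limit=${2:-95}
    used=$(df -P "$1" | use_percent_from_df)
    [ -n "$used" ] || return 2      # df produced nothing usable
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $1 is ${used}% full (limit ${limit}%)"
        return 0
    fi
    return 1
}
```

For example, `space_is_low /nfs/slac/g/net/pinger 95` could be run from a cron job so that a warning arrives before jobs start failing with "No space left on device".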
To look in more detail, we saved a listing to the file dir:
$ls -l /nfs/slac/g/net/pinger/* > dir
ls: cannot open directory /nfs/slac/g/net/pinger/lost+found: Permission denied
Exit 2
Looking in the file dir, it is apparent that, apart from the directories:
pingerlod/, pingerreports/, pingerdata/, pinger_mon_data/, pinger2/, tulip/
most of this data is old and comes from earlier projects. Rather than lose it, we decided to move this other data from
/nfs/slac/g/net/pinger to /nfs/slac/g/net/pingerdata.unite/
thereby preserving a copy in case we run into problems.
We copied and deleted:
/u2/g.net.pinger/hep#
/u2/g.net.pinger/hep--size
/u2/g.net.pinger/iepm-bw
/u2/g.net.pinger/shahryar
/u2/g.net.pinger/slac-gateways
/u2/g.net.pinger/traceroute
This saved about 15 GBytes. Then we deleted:
bandwidth-tests data iepm-bw.slac.stanford.edu monalisa nan-backup nettest2scratch node2.slac.stanford.edu pinger_mysql sc2002 sc2003 sc2004
223cottrell@pinger:~$df /nfs/slac/g/net/pinger
Filesystem               1K-blocks      Used Available Use% Mounted on
netfs03:/u2/g.net.pinger 845881344 816465920  20956160  98% /nfs/slac/g/net/pinger
i.e. we saved ~44 GBytes total.
More saving 3/11/2016
Once again we are receiving:
/bin/mv: closing `/nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-1000-by-site-2016-03.txt.gz': No space left on device
[cottrell@pinger ~]$ df /nfs/slac/g/net/pinger/pingerreports/hep/
Filesystem               1K-blocks      Used Available Use% Mounted on
netfs03:/u2/g.net.pinger 845881344 837422080      1024 100% /nfs/slac/g/net/pinger
[cottrell@pinger ~]$ df /nfs/slac/g/net/pinger
Filesystem               1K-blocks      Used Available Use% Mounted on
netfs03:/u2/g.net.pinger 845881344 837422080      1024 100% /nfs/slac/g/net/pinger
It appears there are old files in
[cottrell@pinger ~]$ ls -l /nfs/slac/g/net/pinger/pingerreports/
total 19
drwxrwsr-x 15 cottrell iepm   512 Oct 24  2009 --by/
drwxrwsr-x  2 cottrell iepm   512 Dec  5  2009 --date/
drwxrwsr-x 20 iepm     iepm 11264 Dec  4 17:21 hep/
drwxrwsr-x  2 pinger   iepm   512 Oct 25  2009 hep#/
drwxrwsr-x  2 pinger   iepm   512 Jul  7  2012 hep--size/
drwxr-sr-x 15 iepm     iepm   512 Jun 14  2005 hep-rest/
drwxr-sr-x  3 cottrell iepm   512 May 18  2006 hepc/
drwxr-sr-x 15 cottrell iepm   512 May 17  2006 heps/
drwxr-sr-x 16 pinger   iepm   512 Mar  8  2012 new/
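Candidate directories like these can also be picked out by modification time instead of by eye. The sketch below uses a hypothetical function name and an arbitrary age cutoff. Note that a directory's own modification time only changes when entries are added to or removed from it directly, so the result is a hint about staleness, not proof.

```shell
# Hypothetical helper: list the immediate subdirectories of DIR whose own
# modification time is more than DAYS days old.
# Usage: stale_dirs DIR DAYS
stale_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +"$2" | sort
}
```

For example, `stale_dirs /nfs/slac/g/net/pinger/pingerreports 365` would list the subdirectories that have not had an entry added or removed in over a year.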
We believe that everything apart from hep/ is no longer needed. However, before deleting anything we want to make a copy somewhere else, just in case. So we need to mkdir pingerreports/ under /nfs/slac/g/net/pingerdata.unite/ and cp --by/, --date/, hep#/, hep--size/, hep-rest/, hepc/, heps/ and new/ from /nfs/slac/g/net/pinger/pingerreports/ to /nfs/slac/g/net/pingerdata.unite/pingerreports/. Then we need to:
Use cp -r -p -v: -r copies directories recursively, -p preserves the mode, ownership and timestamps, and -v reports each file as it is copied.
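Since this will take a long time (a day or so), you may want to run cp -r -p -v <from> <to> >! log& in the background and use top and tail log to watch progress. (The >! redirection is csh/tcsh syntax; it overwrites log even if noclobber is set.)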
Use rm -r -v to remove files that have been copied from /nfs/slac/g/net/pinger/pingerreports/new/
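The copy-then-remove steps above can be sketched as a single function that deletes the original only after the copy has been verified. This is an illustration under stated assumptions, not the procedure that was actually run: the function name is hypothetical, and the `diff -r` verification step is an addition that does not appear in the notes. Several of the directory names begin with `--`, which cp and rm would parse as options if the bare name were passed from inside the directory; full paths plus the `--` end-of-options marker avoid that.

```shell
# Hypothetical helper: copy subdirectory NAME from SRC to DST, verify the
# copy, and only then delete the original.
# Usage: copy_verify_remove SRC DST NAME
copy_verify_remove() {
    src=$1; dst=$2; name=$3
    mkdir -p -- "$dst" || return 1
    # Full paths plus "--" (end of options) keep names such as "--by"
    # from being parsed as options by cp, diff and rm.
    cp -r -p -- "$src/$name" "$dst/" || return 1
    # diff -r exits non-zero if any file differs or is missing.
    diff -r -- "$src/$name" "$dst/$name" >/dev/null || {
        echo "verification failed for $name; original kept" >&2
        return 1
    }
    rm -r -- "$src/$name"
}
```

A call such as `copy_verify_remove /nfs/slac/g/net/pinger/pingerreports /nfs/slac/g/net/pingerdata.unite/pingerreports --by` would handle one directory. Note that `diff -r` reads every file on both sides, which adds to an already long run time, and that it compares contents only, not ownership or timestamps.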
More saving 8/31/2016
We are receiving:
Your "cron" job /afs/slac/package/pinger/tulip/vtrace0chk.pl produced the following output:
can't close tmp file=/nfs/slac/g/net/pinger/tulip/cachetr/cache_tmp.txt: No space left on device at /afs/slac/package/pinger/tulip/vtrace0chk.pl line 168.
TRSrun@pinger: Command exited with value 28
Looking at the space used, we see:
311cottrell@pinger:~$df -h /nfs/slac/g/net/pinger
Filesystem               Size Used Avail Use% Mounted on
netfs03:/u2/g.net.pinger 807G 799G     0 100% /nfs/slac/g/net/pinger
To find the space used by each subdirectory of /nfs/slac/g/net/pinger, we used:
307cottrell@pinger:/nfs/slac/g/net/pinger$du -sh *
du: cannot read directory `lost+found': Permission denied
1.0K lost+found
2.7G pinger2
3.0K pinger_mon_data
607G pingerdata
29G  pingerlod
53G  pingerreports
63M  tulip
Exit 1
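With many subdirectories it helps to have the du output sorted by size. The sketch below uses a hypothetical function name. It asks du for sizes in kilobytes (`-k`) instead of human-readable units so that a plain numeric sort orders them correctly, and it discards error messages such as the one for the unreadable lost+found.

```shell
# Hypothetical helper: list the N largest entries directly under DIR,
# sizes in kilobytes, largest first (N defaults to 10).
# Usage: largest_entries DIR [N]
largest_entries() {
    dir=$1
    n=${2:-10}
    # -s: one total per argument; -k: 1K units so "sort -rn" compares sizes.
    du -sk -- "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}
```

Running `largest_entries /nfs/slac/g/net/pinger 5` would show the five largest consumers of space on the partition.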
pingerlod is no longer required; if and when it is restored, we will move it to a new location.
We used the following to copy the files and watch progress:
270cottrell@pinger:~$mkdir /nfs/slac/g/net/pingerdata.unite/pingerlod
274cottrell@pinger:~$cp -r -p -v /nfs/slac/g/net/pinger/pingerlod /nfs/slac/g/net/pingerdata.unite/pingerlod >! log&
[1] 32068
278cottrell@pinger:~$tail log
`/nfs/slac/g/net/pinger/pingerlod/Aduna_Data/openrdf-sesame/logs/main-2013-08-26.log' -> `/nfs/slac/g/net/pingerdata.unite/pingerlod/pingerlod/Aduna_Data/openrdf-sesame/logs/main-2013-08-26.log'
279cottrell@pinger:~$wc log
105 316 20480
300cottrell@pinger:~$ps -efl | grep 32068
0 S cottrell 4837 17630 0 80 0 - 1107 - 15:48 pts/3 00:00:00 grep 32068
0 D cottrell 32068 17630 1 80 0 - 1389 - 15:33 pts/3 00:00:14 cp -r -p -v /nfs/slac/g/net/pinger/pingerlod /nfs/slac/g/net/pingerdata.unite/pingerlod
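Besides tailing the log, progress can be estimated by comparing the number of regular files on each side. The function below is a hypothetical sketch and not something that was used at the time. It counts files only, so it says nothing about whether a file currently being written is complete.

```shell
# Hypothetical helper: report how many regular files under SRC already
# exist under DST, as "copied/total files copied".
# Usage: copy_progress SRC DST
copy_progress() {
    total=$(find "$1" -type f | wc -l)
    copied=$(find "$2" -type f 2>/dev/null | wc -l)
    echo "$((copied))/$((total)) files copied"
}
```

Because the destination directory already existed when cp was started, the copy in the session above landed in a nested pingerlod/pingerlod directory (as the tail output shows). The matching call would therefore be `copy_progress /nfs/slac/g/net/pinger/pingerlod /nfs/slac/g/net/pingerdata.unite/pingerlod/pingerlod`.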
We then used rm -r -v to remove the files that had been copied from /nfs/slac/g/net/pinger/pingerlod. However, we first had to get the ownership changed from renan to pinger by submitting a ticket to unixadmin.
We also commented out all the pingerlod cronjobs for pinger@pinger.slac.stanford.