Agenda
- Requirements for controlling how files are archived
- Archiving small files
- htar vs tar
- db vs file
- mpsstage script
- how to modify for accessing files from archives
- how to handle simultaneous requests for different files from the same archive
- File access control (nfs group vs file)
- Plan of attack and action items
Requirements for controlling how files are archived
- Need to be able to control how files are handled based on folder and file type
- Perhaps via regular expression – although Wilko is worried about performance
- Need to be able to specify:
- File should never be archived
- File should be archived immediately
- File should be archived once it is a certain age
- Need to be able to specify minimum size for non-tar archiving
- Need to be able to specify disposition of disk file after archived
- Delete immediately
- Never delete
- Delete once it is a certain age
We need a mechanism for modifying these rules fairly easily. For example, while an MC task is running the default may be not to archive its files, but once the files have been verified we may want to archive them immediately, and perhaps delete some from disk.
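As a rough illustration of how the requirements above could fit together, here is a minimal Python sketch of a per-file policy and the decision it implies. Everything in it is an assumption made for illustration: the names `Policy`, `Action`, and `next_action`, the 500 MB threshold, and the choice to measure "age" in days from when the file was written. Nothing here is an agreed design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    NONE = "nothing to do yet"
    ARCHIVE_SINGLE = "archive as an individual file"
    ARCHIVE_IN_TAR = "archive inside a tar bundle"
    DELETE_FROM_DISK = "delete the disk copy"


@dataclass
class Policy:
    # None means "never", 0 means "immediately", otherwise a minimum age in days.
    archive_after_days: Optional[float] = 1.0
    delete_after_days: Optional[float] = None
    # Files smaller than this go into a tar bundle (hypothetical threshold).
    min_single_file_bytes: int = 500 * 1024 * 1024


def next_action(policy, age_days, size_bytes, archived):
    """Return the next step for one file under the given policy."""
    if not archived:
        if policy.archive_after_days is None:
            return Action.NONE  # file should never be archived
        if age_days < policy.archive_after_days:
            return Action.NONE  # not old enough yet
        if size_bytes >= policy.min_single_file_bytes:
            return Action.ARCHIVE_SINGLE
        return Action.ARCHIVE_IN_TAR
    # Only files that are already archived are candidates for deletion.
    if policy.delete_after_days is not None and age_days >= policy.delete_after_days:
        return Action.DELETE_FROM_DISK
    return Action.NONE
```

In this sketch, switching an MC task from "do not archive" to "archive immediately" only means replacing its `Policy`; the decision logic stays the same.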
Archive configuration file
Strawman proposal
/glast maxTarSize=500m archive=1d noDelete
/glast/MC noArchive
/glast/MC/task-55 archive=0d
/glast/MC/task-55/.*-recon.root deleteAfterArchive=0d
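One possible reading of the strawman, sketched in Python: each line is a regular expression for a path followed by options, rules are applied top to bottom, and later matching rules override earlier ones. That override order, the anchoring of each pattern at a directory boundary, the `d` and `m` suffixes meaning days and mebibytes, and the function names are all assumptions for illustration, not something the proposal specifies. The patterns are compiled once when the file is read, which should limit the regular-expression cost Wilko is worried about.

```python
import re

SAMPLE = """\
/glast maxTarSize=500m archive=1d noDelete
/glast/MC noArchive
/glast/MC/task-55 archive=0d
/glast/MC/task-55/.*-recon.root deleteAfterArchive=0d
"""

SIZE_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_days(text):
    """Convert an age such as '1d' to a number of days."""
    if not text.endswith("d"):
        raise ValueError(f"expected an age such as '1d', got {text!r}")
    return float(text[:-1])


def parse_size(text):
    """Convert a size such as '500m' to bytes (binary units assumed)."""
    unit = text[-1].lower()
    if unit in SIZE_UNITS:
        return int(text[:-1]) * SIZE_UNITS[unit]
    return int(text)


def parse_options(tokens):
    """Normalize option tokens; None stands for 'never'."""
    opts = {}
    for token in tokens:
        key, _, value = token.partition("=")
        if key == "noArchive":
            opts["archive_days"] = None
        elif key == "archive":
            opts["archive_days"] = parse_days(value)
        elif key == "noDelete":
            opts["delete_days"] = None
        elif key == "deleteAfterArchive":
            opts["delete_days"] = parse_days(value)
        elif key == "maxTarSize":
            opts["max_tar_bytes"] = parse_size(value)
        else:
            raise ValueError(f"unknown option {token!r}")
    return opts


def parse_rules(text):
    """Return a list of (compiled pattern, options) in file order."""
    rules = []
    for line in text.splitlines():
        if not line.strip():
            continue
        pattern, *tokens = line.split()
        # Match the path itself or anything beneath it.
        regex = re.compile(pattern + r"(/.*)?$")
        rules.append((regex, parse_options(tokens)))
    return rules


def effective_options(rules, path):
    """Merge the options of every matching rule; later rules win."""
    merged = {}
    for regex, opts in rules:
        if regex.match(path):
            merged.update(opts)
    return merged
```

With these rules, `effective_options(parse_rules(SAMPLE), "/glast/MC/task-55/a-recon.root")` yields an archive age and a delete age of zero days, while other files under `/glast/MC/task-55` are archived immediately but kept on disk.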
5 Comments
Tony Johnson
Follow up meeting on February 5th
John Bartelt
I've reserved the SLUO conference room for Feb 12, 10:30-12:00, for another meeting.
Tony Johnson
February 12
Tony Johnson
February 19
Here are some very rough notes from today's meeting.
Present: AndyH (on phone), Lance, Wilko, Tom, John
(1) Wilko reported that xrootd on wain06 crashed. Andy will look at the core file. Wilko also reported some strange messages in dmesg; but said that Lance had previously looked at them and did not think they were a problem.
(2) More zfs conversions: Tom and Lance discussed time and work involved. We should expect more requests will be coming.
(3) htar & pftp: Andrew May has built htar for solaris/sparc.
Wilko has not had a chance to test it yet. We don't really need it.
Andrew also compiled the htar code for solaris/x86, but we don't have the x86 hpss libraries to link with. It should be possible to build them from source, but that might be a lot of work.
Andy reported that he will redo "stuff" so we can use the standard version of pftp instead of the SLAC-modified version.
(3a) Wilko: the plan is to use tar and an Oracle DB instead. For that we need the solaris/x86 Oracle libraries, so that we can build the Python Oracle interface (cx_Oracle). John will work with Ian to get those installed. Hope to have something working by 2/26 (for solaris/sparc?).
(4) More discussion of deleting xrootd files; need more informative return codes? problem with rewriting files (hence unique names). Need more info on use cases. Invite Warren next week?
(5) What else for hpss? Need tapes (there is a pool). Was some hpss feature disabled to speed up something? (Wilko?)
(6) discussion of storage classes, file families, tracking tape usage and reporting.
(7) Tom will check whether there are top-level file systems that never need to be migrated to hpss. Interest in selective migration/deletion/retention/purging. Wilko will document the current situation (in confluence) and maybe suggest what enhancements are feasible.
(8) There will be a copy of the science data on Goddard. Maybe housekeeping too? Is there anything which needs two tape copies at SLAC? (pftp can handle this?)
(9) Lance estimated that there were currently 600 free tapes in the silos. Another 1200 are here and waiting to be added and initialized.
Meet again next week.
John
Tony Johnson
February 26