...
Infrastructure Maintenance
RHEL5 issues:
Virtualization:
- What needs virtualization?
- Halfpipe sounds like a candidate.
- No, it runs on RHEL6. But unlikely to move beyond. So yes, virtualize at RHEL6.
- GlastRelease is also stuck on RHEL6
- A couple of APIs need QT and use the commercial version
- Release Manager uses free version of QT
- Unsure why using commercial version.
- Might be worth exploring move to free version
- Need to have a discussion about FastCopy, as it requires RHEL5.
- ISOC ops boxes are mostly under RHEL5. Demonstrated that the tools can be run under RHEL6.
- Backup ISOC is no longer supported.
What kind of virtualization? VM or container?
GlastRelease:
- GlastRelease needs virtualization
- RHEL 6 is last release that we have the personnel to support
- A few people running GlastRelease (Developers) - nice use case for Docker. Getting GlastRelease to run on your laptop is painful.
- GlastRelease carries around geant4
- Is there a distinction between Users and Developers for GlastRelease?
- No
Science Tools:
- Focus with ScienceTools is just ease of distribution
- Would it be useful to distribute the tools in VMs? Containers? Both?
- Are there external dependencies (like xrootd) that would cause problems with virtualization if the backend changes?
We need automated build system for ST: Release manager vs. manual builds
- GR uses xrootd; ST does not (Eric)
- Use of virtualization is for convenience - which is most useful thing to do? (Richard)
- Don't depend on NFS/AFS if build container right. Stable for data xrootd
- getting files/libraries and also output data.
- Container helps with diffuse model
- on nodes not on NFS
- on nodes there's low overhead.
- Caching image on all of the nodes.
- Fermi ST image will have the diffuse model in it.
Release Manager: Release manager doesn't talk to Oracle - but it does talk to a database. Not user friendly.
- For slac farm - docker containers for GlastRelease. Need docker registry
- Docker containers is the right solution for batch farm (Brian)
Use their system to run a RHEL6 container, but the batch host is RHEL7.
- Carefully build container (nice with xrootd)
- need to find out from Warren if FT1, FT2 files included (Richard)
What systems need what kinds of containers?
- Samuel needed to discuss w/simulations at Lyon. (He is sick today)
- What is different for developers/users?
- Same image for all the GR uses.
- Don't want to pull a 3GB image to pull FT1, GR is 3x bigger. Just have 1 image at the moment.
- One giant image - good command line interface installed in that image.
- Images built such that the top looks the same between GR and ST. Keep same image.
- Separate builds for debugging purposes?
- GlastRelease is frozen, ST is constantly evolving. Debugging GR is not a problem, debugging ST is important
- Giacomo
- Mount code at runtime, container doesn't have debugging tools.
- Container provides environment.
- Compile inside the container.
- run debugger inside container.
- User image has everything - compiled.
- Lightweight container for developers then they can compile. Users have full compiled.
- Debugging in GR and ST is very different
- The computing center will have a cache of docker.
- Every project will say what docker images do you want on the batch nodes?
- Plan for managing cached images. Work out allocations for collaborations.
- Cost of using docker?
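The developer workflow discussed above (mount code at runtime, compile and run the debugger inside the container) might look roughly like the sketch below. The image name `fermi/dev-env:rhel6`, the mount point `/src`, and the helper name are hypothetical and were not agreed at the meeting. Only the standard `docker run` options `--rm`, `-it`, `-v`, and `-w` are used.

```python
import shlex


def dev_container_command(source_dir,
                          image="fermi/dev-env:rhel6",  # hypothetical image name
                          mount_point="/src",           # hypothetical mount point
                          shell="/bin/bash"):
    """Build a `docker run` argument list for an interactive developer session.

    The host source tree is bind-mounted into the container, so the image
    only has to provide the build environment, not the code itself.
    """
    return [
        "docker", "run",
        "--rm",                               # remove the container on exit
        "-it",                                # interactive terminal
        "-v", f"{source_dir}:{mount_point}",  # mount code at runtime
        "-w", mount_point,                    # start in the mounted tree
        image,
        shell,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(dev_container_command("/home/user/ScienceTools")))
```

A user image with everything precompiled would not need the bind mount. The sketch covers only the lightweight developer case.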
Pipeline:
- Needs someone that he could show the pipeline code and train to do heavy lifting when it comes to kicking the pipeline
- Docker containers for something like the batch system may cause some problems, since:
- For something like the L1 pipeline, a number of images would need to be launched simultaneously
- Would size of the software cause problems with deployment?
- We would need a system where you restrict loading images to the batch farm to prevent collisions/problems
- There is probably a precedent for this, however, Matt has no experience deploying on this scale
- File size of ~1 GB is best, a few is manageable for production.
- IT dept supportive of docker@SLAC. There is 1 machine with RHEL7
- Lyon is a much larger computing center - likely they will upgrade to Docker first
- Now full support for Docker at Lyon (Fred)
Infrastructure:
- Last purchase went into dev cluster
- many nodes @RHEL6, upgrade to RHEL7 and doing docker with this
- Still getting NFS/AFS sorted out with RHEL7. GPFS?
- It's good to come up with a plan because of security implications if NFS underneath.
- Use right docker (UID issues w/security)
- SLAC will give us a few nodes for testing docker. Fall back way to install on user machines. (Brian)
- AFS on RHEL6 docker
- read files if world readable.
- NFS is hardest.
- Timeline for RHEL7, 12mo? 2018? (Matt)
- RHEL7 support is dodgy.
- Configuration stuff is hard part
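One common way to address the UID concern raised above is to run the container as the invoking user rather than root, so that files written to a mounted shared filesystem keep the user's ownership. The notes do not record which approach was meant, so this is only an illustrative sketch. The helper name and the image name in the usage line are hypothetical, and `--user` is the standard `docker run` option.

```python
import os


def run_as_user_args(image, command, uid=None, gid=None):
    """Build a `docker run` argument list that runs as a non-root UID:GID.

    Defaults to the UID and GID of the calling process (Unix only).
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        "--user", f"{uid}:{gid}",  # run as this user, not root
        image,
        *command,
    ]


if __name__ == "__main__":
    # "example/image" is a placeholder; `id` shows the effective user inside.
    print(run_as_user_args("example/image", ["id"]))
```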
Flight Software:
- Julie: No path to having anyone other than SLAC supporting flight software
...