
  • This page is meant to organize the discussion around the virtualization of the Fermi Science Analysis Systems software. Some pieces of software now have a long history and there is a clear lack of manpower to keep them running on recent platforms. Some packages are stuck on RHEL5, others on RHEL6, and others run on modern platforms. A detailed status has to be established for each piece to understand the way forward: maintenance, VM, or container.
  • Information on this page was first gathered from a number of reference pages:

Summary

  • Here is a summary table of the main software packages
    • created by Johan on Tuesday 6th 2017

       

      Name     | Platforms | Dependencies | Upgradable? | Existing VM | Existing container | Links                             | Comments                  | Date
      FastCopy | RHEL5     | ?            | (error)     | ?           | (error)            | (error) FASTCopy processing chain | to be reviewed by experts |


    • Halfpipe sounds like a candidate.
      •  No, it runs on RHEL6, but it is unlikely to move beyond that. So yes, virtualize it at RHEL6.
    •  GlastRelease is also stuck on RHEL6
    •  A couple of APIs need Qt, using the commercial version
      •  Release Manager uses the free version of Qt
      •  It is unclear why the commercial version is used
      •  It might be worth exploring a move to the free version
  •  Need to have a discussion about FastCopy, as it requires RHEL5.
  •  ISOC ops boxes are mostly under RHEL5. It has been demonstrated that the tools can be run under RHEL6.
  •  Backup ISOC is no longer supported.

What kind of virtualization? VM or container?

GlastRelease: 

  • GlastRelease needs virtualization
    • RHEL6 is the last release that we have the personnel to support
    • A few people (developers) run GlastRelease, which is a nice use case for Docker; getting GlastRelease to run on your laptop is painful (a minimal Dockerfile sketch is given after this list)
    • GlastRelease carries around geant4
  • Is there a distinction between Users and Developers for GlastRelease? 
    • No
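
As an illustration of the RHEL6-container-for-developers idea above, here is a minimal Dockerfile sketch. It is not the actual GlastRelease recipe: the centos:6 base image (a freely available stand-in for RHEL6), the package list, and the GLAST_EXT path are all assumptions.

      # Sketch of an RHEL6-like image for GlastRelease development.
      # centos:6 stands in for RHEL6; packages and paths are illustrative only.
      FROM centos:6

      # Typical build toolchain; the real dependency list would come from GLAST_EXT
      RUN yum install -y gcc gcc-c++ gcc-gfortran make cmake git \
          libX11-devel libXpm-devel libXft-devel libXext-devel && \
          yum clean all

      # Hypothetical location of the GLAST_EXT externals (python, ROOT, geant4, ...)
      ENV GLAST_EXT=/opt/glast_ext
      COPY glast_ext/ /opt/glast_ext/

      # Developers mount their checkout here at run time
      WORKDIR /workdir
      CMD ["/bin/bash"]

A developer would then run, for example, docker run -it -v $HOME/GlastRelease:/workdir glastrelease-dev:rhel6 (image name hypothetical) to build and test on a laptop.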

 

Science Tools:

 

  • Focus with ScienceTools is just ease of distribution
  • Would it be useful to distribute the tools in VMs? Containers? Both?

    Joris: I found this VM: Virtual Machine version 3


  • Are there external dependencies (like xrootd) that would cause problems with virtualization if the backend changes?
  • We need an automated build system for the ScienceTools: Release Manager vs. manual builds

  • GR uses xrootd; ST does not (Eric)
  • The use of virtualization is for convenience; which is the most useful thing to do? (Richard)

 

    • Don't depend on NFS/AFS if the container is built right; xrootd is stable for data
    • This covers getting input files/libraries and also writing output data
    • A container helps with the diffuse model (see the sketch after this list):
      • it lives on the nodes, not on NFS
      • on the nodes there is low overhead
      • the image is cached on all of the nodes
      • the Fermi ST image will have the diffuse model in it
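
A sketch of how the diffuse model could be baked into a ScienceTools image so that batch jobs read it from the node-local cached image instead of NFS/AFS. The base image name, file names, and the environment variable are illustrative assumptions, not an agreed layout.

      # Illustrative Dockerfile fragment: ship the diffuse model inside the image.
      # fermi/sciencetools:latest is a hypothetical base image name.
      FROM fermi/sciencetools:latest

      # Example diffuse model files copied into the image at build time
      COPY diffuse/gll_iem_v06.fits           /opt/fermi/diffuse/
      COPY diffuse/iso_P8R2_SOURCE_V6_v06.txt /opt/fermi/diffuse/

      # Point the tools at the baked-in copies (variable name is an assumption)
      ENV FERMI_DIFFUSE_DIR=/opt/fermi/diffuse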

Release Manager: Release Manager doesn't talk to Oracle, but it does talk to a database. It is not user friendly.

  • For the SLAC farm: Docker containers for GlastRelease. A Docker registry is needed
  • Docker containers are the right solution for the batch farm (Brian)
  • Run their system in an RHEL6 container, while the batch host is RHEL7 (see the sketch after this list)

    • Carefully build the container (works nicely with xrootd)
  • Need to find out from Warren whether the FT1, FT2 files are included (Richard)
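
A hedged sketch of the batch-farm flow: the image is pushed once to a Docker registry, and the RHEL7 batch hosts pull and run the RHEL6-based container. The registry name, image tag, and job script below are made-up examples.

      # Build the RHEL6-based GlastRelease image and push it to a registry
      # (registry host, repository, and tag are illustrative assumptions)
      docker build -t registry.example.org/fermi/glastrelease:rhel6 .
      docker push registry.example.org/fermi/glastrelease:rhel6

      # On an RHEL7 batch host: pull once (cached on the node), then run jobs.
      # The kernel is RHEL7's, but the user space inside the container is RHEL6.
      docker pull registry.example.org/fermi/glastrelease:rhel6
      docker run --rm registry.example.org/fermi/glastrelease:rhel6 \
          /opt/glast/bin/runL1Job.sh   # hypothetical job script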

What systems need what kinds of containers?

  • Samuel needs to discuss this with the simulations group at Lyon (he is sick today)
  •  What is different for developers vs. users?
  •  The same image for all the GR uses
  •  We don't want to pull a 3 GB image just to pull an FT1 file; GR is 3x bigger. There is just one image at the moment
  •  One giant image, with a good command-line interface installed in that image
  •  Images are built such that the top looks the same between GR and ST. Keep the same image
  •  Separate builds for debugging purposes?
  •  GlastRelease is frozen, ST is constantly evolving; debugging GR is not a problem, debugging ST is important
  • Giacomo (see the sketch after this list)
    • Mount the code at run time; the container doesn't have debugging tools
    • The container provides the environment
    • Compile inside the container
    • Run the debugger inside the container
    • The user image has everything, pre-compiled
  • A lightweight container for developers, who then compile themselves; users get the fully compiled image
  • Debugging in GR and ST is very different
  • The computing center will have a cache of Docker images
  • Every project will be asked which Docker images it wants on the batch nodes
  • Plan for managing cached images; work out allocations for collaborations
  • What is the cost of using Docker?
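
To make the developer/user split concrete, a minimal sketch of the mount-code-at-run-time pattern described above. The image names, the scons build step, and the paths are assumptions for illustration.

      # Developer workflow: a lightweight image provides only the environment;
      # the source tree is mounted from the host, then built and debugged inside.
      docker run -it --rm \
          -v $HOME/ScienceTools:/workdir \
          fermi/st-devel:latest \
          /bin/bash -c "cd /workdir && scons"

      # User workflow: one self-contained image with everything pre-compiled.
      docker run --rm fermi/sciencetools:latest gtlike --help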

 

Pipeline:

 

  • Someone is needed who can be shown the pipeline code and trained to do the heavy lifting when it comes to kicking the pipeline
  • Docker containers for something like the batch system may cause some problems, since for something like the L1 pipeline a number of images would need to be launched simultaneously
  •  Would the size of the software cause problems with deployment?
  • We would need a system that restricts loading images onto the batch farm, to prevent collisions/problems
  • There is probably a precedent for this; however, Matt has no experience deploying at this scale
  • An image size of ~1 GB is best; a few GB is manageable for production
  • The IT department is supportive of Docker at SLAC. There is one machine with RHEL7
  • Lyon is a much larger computing center; likely they will adopt Docker first
    • There is now full support for Docker at Lyon (Fred)

      Joris: Lyon wants to use Singularity because they have security issues with UGE + Docker.
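
For reference, the same Docker image could be converted for use with Singularity roughly as follows; the image name is a made-up example.

      # Build a Singularity image from the (hypothetical) Docker image, then
      # run a tool without needing a privileged Docker daemon on the batch node.
      singularity build sciencetools.sif docker://fermi/sciencetools:latest
      singularity exec sciencetools.sif gtlike --help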


Infrastructure:

  • The last purchase went into the dev cluster
    • many nodes at RHEL6; upgrade them to RHEL7 and run Docker on them
    • Still figuring out how NFS/AFS will be sorted out with RHEL7. GPFS?
  • It's good to come up with a plan because of the security implications if NFS is underneath
    • Use Docker correctly (UID issues with security; see the sketch after this list)
  • SLAC will give us a few nodes for testing Docker. A fallback is to install on user machines (Brian)
    • AFS with RHEL6 Docker
    • Files can be read if they are world readable
    • NFS is the hardest
  • Timeline for RHEL7: 12 months? 2018? (Matt)
    • RHEL7 support is dodgy
    • Configuration is the hard part
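
A sketch of the UID point above: run the container as the submitting user rather than root, so that NFS/AFS permission checks apply to the real user. The mount path and image name are illustrative.

      # Run as the calling user's UID/GID instead of root; world-readable files
      # on the mounted filesystem work, anything stricter depends on UID mapping.
      docker run --rm \
          --user $(id -u):$(id -g) \
          -v /nfs/farm/g/glast:/data:ro \
          fermi/sciencetools:latest \
          ls /data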

Use cases

  • GlastRelease - frozen on RHEL6
    • L1 processing, reprocessing in SLAC batch farm
      • RHEL6 container on a RHEL7 host
      • do FT1, FT2 files go to xrootd? (Warren)
      • separate containers for L1? Maybe not an issue if we can preload batch nodes. We're guessing ~5 GB image.
    • Simulations at Lyon, SLAC, GRID
      • maybe the same as for SLAC - check with Samuel for details
    • Developers & Users
      • maybe separate versions for debug symbols and source for developers. Could be on-demand production of this version.
    • Release Manager or manual builds
  • Science Tools
      • Caching big files (e.g. templates) in the container image. Need a strategy with SCS for how container images are cached.

2. Software dependencies:

  • GPL_TOOLS (staging and logging)
  • REPRO common tools
  • REPRO task scripts
  • GlastRelease
  • ScienceTools
  • GLAST_EXT software (e.g., python, root)
  • Ftools (KIPAC installation)
  • ROOT skimmer
  • FITS skimmer (possibly unnecessary?)
  • evtClassDefs
  • calibration and alignment files
  • diffuse models
  • xroot tools
  • xroot /glast/Scratch space
  • /scratch on local batch machines
  • data catalog query (FT2 file and current version of FITS files)
  • mySQL DB (calibration and alignment)
  • Fermi astrotools (could probably eliminate)

3. Questions

 


