- This page is meant to organize the discussion around the virtualization of Fermi Science Analysis Systems software. Some pieces of software have a long history, and there is a clear lack of manpower to keep them running on recent platforms. Some software is stuck on RHEL5, some on RHEL6, and some runs on modern platforms. A detailed status review is needed for each piece to understand the way forward: maintenance, VM, or container.
- Information on this page was first gathered from a number of reference pages:
- SLACK channel on containers
...
- Many questions to be tackled under this item:
- building vs running
- our software vs 3rd party libraries vs system libraries
- LAT calibrations, connection to mysql db, file transfer via xroot
- Following the June 6th 2017 meeting, we'll initially focus on containers, as it's very unlikely that the SLAC farm will move to virtual machines (Brian's comment)
- For the SLAC farm: Docker containers for GlastRelease. Need a Docker registry.
Use their system to run a RHEL6 container, while the batch host is RHEL7.
- Johan: VMs might still be useful for code development and debugging, for GR in particular.
- indeed, we'll very likely need a RHEL6 VM to build the GR containers at some point (or keep a bare metal node just for building), no?
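When a RHEL6 container runs on a RHEL7 batch host, a job can confirm which userland it actually sees. The following is a minimal sketch, assuming the image ships `/etc/redhat-release` in the usual `... release 6.9 (...)` form; `rhel_major_version` is a hypothetical helper, not part of any existing Fermi tool.

```python
import re
from typing import Optional


def rhel_major_version(release_text: str) -> Optional[int]:
    """Extract the major version from the contents of /etc/redhat-release."""
    match = re.search(r"release\s+(\d+)", release_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    try:
        # Assumption: Red Hat style systems provide this file.
        with open("/etc/redhat-release") as handle:
            print("major release:", rhel_major_version(handle.read()))
    except FileNotFoundError:
        print("not a Red Hat style system")
```

Run inside the container, this should report the container's release rather than the host's, since the file comes from the image.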
Virtualization
- Virtual machines running in a cloud environment or on a farm made of virtual nodes
- just a simple VM for developers or end users
Containers
- a few technologies to consider: Docker, Shifter (NERSC, for HPC) and Singularity
- advantages over a full VM usually are:
- light weight
- better performance
- container has to be carefully built
- Docker: the container
- Shifter: NERSC container optimized for HPC
- Singularity: containers for scientists, compatible with Docker
- Another very interesting link : http://geekyap.blogspot.fr/2016/11/docker-vs-singularity-vs-shifter-in-hpc.html
Tabular Comparison
Name | Docker | Singularity | Shifter |
---|---|---|---|
Main Goal | MicroServices, Enterprise applications | Application portability (one image with all dependencies) | Run Docker containers in HPC environment, Improve Docker security |
UGE compatible | but CC won't use it | | |
LSF compatible | | | |
Security | User running docker commands needs to be in the special docker group to gain elevated system access. LSF improves the Docker security. | User runs Singularity image without special privileges. | User runs Shifter image without special privileges. |
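The Docker entry in the Security row can be checked on a given node with a few lines of Python. This is a minimal sketch, assuming a Unix host and that the privileged group is named `docker` (the conventional default; a site may configure it differently). `user_in_group` is a hypothetical helper.

```python
import grp
import os


def user_in_group(group_name: str) -> bool:
    """Return True if the current process belongs to the named Unix group."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this host (e.g. Docker is not installed).
        return False
    # os.getgroups() lists supplementary groups; the effective gid is checked separately.
    return gid in os.getgroups() or gid == os.getegid()


if __name__ == "__main__":
    # Assumption: "docker" is the group name used on this node.
    print("docker group member:", user_in_group("docker"))
```

A `True` result means the account can issue docker commands, which is the elevated access the table refers to.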
What systems need what kinds of containers?
...
- Needs someone that he could show the pipeline code and train to do heavy lifting when it comes to kicking the pipeline
- Docker containers for something like the batch system may cause some problems, since
- For something like the L1 pipeline, a number of images would need to be launched simultaneously
- Would size of the software cause problems with deployment?
- We would need a system where you restrict loading images to the batch farm to prevent collisions/problems
- There is probably a precedent for this, however, Matt has no experience deploying on this scale
- File size of ~1 GB is best; a few GB is manageable for production.
- IT dept supportive of docker@SLAC. There is 1 machine with RHEL7
- Lyon is a much larger computing center - likely they will upgrade to Docker first
Now full support for Docker at Lyon (Fred)
Joris: Lyon wants to use Singularity because they have security issues with UGE + Docker.
Infrastructure:
- Last purchase went into dev cluster
- many nodes @RHEL6; upgrading to RHEL7 and doing Docker with this
- Still getting NFS/AFS sorted out with RHEL7. GPFS?
- It's good to come up with a plan because of the security implications if NFS is underneath.
- Use the right Docker (UID issues w/ security). SLAC will give us a few nodes for testing Docker. Fallback: install on user machines. (Brian)
- AFS on RHEL6 docker
- read files if world readable. NFS is hardest.
...
- RHEL7 support is dodgy.
- Configuration stuff is hard part
Software dependencies
...
- GPL_TOOLS (staging and logging)
- REPRO common tools
- REPRO task scripts
- GlastRelease
- ScienceTools
- GLAST_EXT software (e.g., python, root)
- Ftools (KIPAC installation)
- ROOT skimmer
- FITS skimmer (possibly unnecessary?)
- evtClassDefs
- calibration and alignment files
- diffuse models
- xroot tools
- xroot /glast/Scratch space
- /scratch on local batch machines
- data catalog query (FT2 file and current version of FITS files)
- mySQL DB (calibration and alignment)
- Fermi astrotools (could probably eliminate)
...
Farm | Node OS | Network FS | VMs | Container |
---|---|---|---|---|
SLAC | RHEL6 | AFS / NFS | No (never, says Brian) | Docker |
CC-IN2P3 | RHEL6 and CentOS7 | AFS to be phased out, CVMFS | An OpenStack cloud is running, but not for production | ?Docker full support, but looking into Singularity |
GRID sites | mostly RHEL6, a few CentOS7 | many have CVMFS | no | ? |
- Notes for the SLAC farm
- Last purchase went into dev cluster
- many nodes @RHEL6; upgrading to RHEL7 and doing Docker with this
- Still getting NFS/AFS sorted out with RHEL7. GPFS?
- It's good to come up with a plan because of the security implications if NFS is underneath.
- Use the right Docker (UID issues w/ security)
- SLAC has a few nodes for testing docker.
- AFS on RHEL6 docker
- read files if world readable.
- NFS is hardest.
- Timeline for RHEL7, 12mo? 2018? (Matt)
- RHEL7 support is dodgy.
- Configuration stuff is hard part
- Notes for CC-IN2P3
- Now full support for Docker at Lyon (Fred)
- Joris: Lyon wants to use Singularity because they have security issues with UGE + Docker.
- Johan: the CentOS7 queue has Singularity available; will run some tests (for CTA...) soon (July 17th 2017).
...
Questions
- Joris: Are there security issues with LSF & Docker? (https://developer.ibm.com/storage/2017/01/09/running-ibm-spectrum-lsf-jobs-in-docker-containers/)
- Joris: We need to verify the compatibility between Singularity (Lyon CC) and Docker
- etc.
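One way to start on the Singularity/Docker compatibility question is to run the same Docker image under both runtimes and compare the results. The sketch below only builds the two command lines. It assumes the `docker run --rm` form and Singularity's `docker://` image URI; the image name `example/glastrelease:latest` is a placeholder, not an existing image, and `container_command` is a hypothetical helper.

```python
from typing import List


def container_command(runtime: str, image: str, command: List[str]) -> List[str]:
    """Build the argv that runs `command` inside `image` under the given runtime."""
    if runtime == "docker":
        # --rm removes the container once the command exits.
        return ["docker", "run", "--rm", image] + command
    if runtime == "singularity":
        # Singularity can pull and run a Docker image through the docker:// URI scheme.
        return ["singularity", "exec", "docker://" + image] + command
    raise ValueError("unsupported runtime: " + runtime)


if __name__ == "__main__":
    image = "example/glastrelease:latest"  # hypothetical image name
    for runtime in ("docker", "singularity"):
        argv = container_command(runtime, image, ["cat", "/etc/redhat-release"])
        print(" ".join(argv))
```

On a node where the runtime is installed, the returned list could be passed to `subprocess.run` and the two outputs compared.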
...