
  • This page is meant to organize the discussion around the virtualization of the Fermi Science Analysis Systems software. Some pieces of software now have a long history and there is a clear lack of manpower to keep them running on recent platforms. Some packages are stuck on RHEL5, others on RHEL6, and others run on modern platforms. A detailed status has to be established for each piece to understand the way forward: maintenance, VM, or container.
  • Information on this page was first gathered from a number of reference pages:

Summary

  • Here is a summary table of the main software packages
    • created by Johan on Tuesday 6th 2017

       

      Name     | Platforms | Dependencies | Upgradable? | Existing VM | Existing container | Links                             | Comments                  | Date
      FastCopy | RHEL5     | ?            | (error)     | ?           | (error)            | (error) FASTCopy processing chain | to be reviewed by experts |


    • Halfpipe sounds like a candidate.
      •  No, it runs on RHEL6, but it is unlikely to move beyond that. So yes, virtualize it at RHEL6.
    •  GlastRelease is also stuck on RHEL6
    •  A couple of APIs need Qt, using the commercial version
      •  Release Manager uses the free version of Qt
      •  It is unclear why the commercial version is used
      •  It might be worth exploring a move to the free version
  •  Need to have a discussion about FastCopy, as it requires RHEL5.
  •  ISOC ops boxes are mostly under RHEL5. It has been demonstrated that the tools can be run under RHEL6.
  •  Backup ISOC is no longer supported.

What kind of virtualization? VM or container?

GlastRelease: 

  • GlastRelease needs virtualization
    • RHEL6 is the last release that we have the personnel to support
    • A few people (developers) run GlastRelease, which is a nice use case for Docker; getting GlastRelease to run on your laptop is painful (a minimal Dockerfile sketch is given after this list)
    • GlastRelease carries around geant4
  • Is there a distinction between Users and Developers for GlastRelease? 
    • No
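
As an illustration of the RHEL6-container-for-developers idea above, here is a minimal Dockerfile sketch. It is not the actual GlastRelease recipe: the centos:6 base image (a freely available stand-in for RHEL6), the package list, and the GLAST_EXT path are all assumptions.

      # Sketch of an RHEL6-like image for GlastRelease development.
      # centos:6 stands in for RHEL6; packages and paths are illustrative only.
      FROM centos:6

      # Typical build toolchain; the real dependency list would come from GLAST_EXT
      RUN yum install -y gcc gcc-c++ gcc-gfortran make cmake git \
          libX11-devel libXpm-devel libXft-devel libXext-devel && \
          yum clean all

      # Hypothetical location of the GLAST_EXT externals (python, ROOT, geant4, ...)
      ENV GLAST_EXT=/opt/glast_ext
      COPY glast_ext/ /opt/glast_ext/

      # Developers mount their checkout here at run time
      WORKDIR /workdir
      CMD ["/bin/bash"]

A developer would then run, for example, docker run -it -v $HOME/GlastRelease:/workdir glastrelease-dev:rhel6 (image name hypothetical) to build and test on a laptop.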

 

Science Tools:

 

  • Focus with ScienceTools is just ease of distribution
  • Would it be useful to distribute the tools in VMs? Containers? Both?

    Joris: I found this VM: Virtual Machine version 3


  • Are there external dependencies (like xrootd) that would cause problems with virtualization if the backend changes?
  • We need an automated build system for the ScienceTools: Release Manager vs. manual builds

  • GR uses xrootd; ST does not (Eric)
  • The use of virtualization is for convenience; which is the most useful thing to do? (Richard)

 

    • Don't depend on NFS/AFS if the container is built right; xrootd is stable for data
    • This covers getting input files/libraries and also writing output data
    • A container helps with the diffuse model (see the sketch after this list):
      • it lives on the nodes, not on NFS
      • on the nodes there is low overhead
      • the image is cached on all of the nodes
      • the Fermi ST image will have the diffuse model in it
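
A sketch of how the diffuse model could be baked into a ScienceTools image so that batch jobs read it from the node-local cached image instead of NFS/AFS. The base image name, file names, and the environment variable are illustrative assumptions, not an agreed layout.

      # Illustrative Dockerfile fragment: ship the diffuse model inside the image.
      # fermi/sciencetools:latest is a hypothetical base image name.
      FROM fermi/sciencetools:latest

      # Example diffuse model files copied into the image at build time
      COPY diffuse/gll_iem_v06.fits           /opt/fermi/diffuse/
      COPY diffuse/iso_P8R2_SOURCE_V6_v06.txt /opt/fermi/diffuse/

      # Point the tools at the baked-in copies (variable name is an assumption)
      ENV FERMI_DIFFUSE_DIR=/opt/fermi/diffuse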

Release Manager: Release Manager doesn't talk to Oracle, but it does talk to a database. It is not user friendly.

  • For the SLAC farm: Docker containers for GlastRelease. A Docker registry is needed
  • Docker containers are the right solution for the batch farm (Brian)
  • Run their system in an RHEL6 container, while the batch host is RHEL7 (see the sketch after this list)

    • Carefully build the container (works nicely with xrootd)
  • Need to find out from Warren whether the FT1, FT2 files are included (Richard)
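
A hedged sketch of the batch-farm flow: the image is pushed once to a Docker registry, and the RHEL7 batch hosts pull and run the RHEL6-based container. The registry name, image tag, and job script below are made-up examples.

      # Build the RHEL6-based GlastRelease image and push it to a registry
      # (registry host, repository, and tag are illustrative assumptions)
      docker build -t registry.example.org/fermi/glastrelease:rhel6 .
      docker push registry.example.org/fermi/glastrelease:rhel6

      # On an RHEL7 batch host: pull once (cached on the node), then run jobs.
      # The kernel is RHEL7's, but the user space inside the container is RHEL6.
      docker pull registry.example.org/fermi/glastrelease:rhel6
      docker run --rm registry.example.org/fermi/glastrelease:rhel6 \
          /opt/glast/bin/runL1Job.sh   # hypothetical job script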

What systems need what kinds of containers?

  • Samuel needs to discuss this with the simulations group at Lyon (he is sick today)
  •  What is different for developers vs. users?
  •  The same image for all the GR uses
  •  We don't want to pull a 3 GB image just to pull an FT1 file; GR is 3x bigger. There is just one image at the moment
  •  One giant image, with a good command-line interface installed in that image
  •  Images are built such that the top looks the same between GR and ST. Keep the same image
  •  Separate builds for debugging purposes?
  •  GlastRelease is frozen, ST is constantly evolving; debugging GR is not a problem, debugging ST is important
  • Giacomo (see the sketch after this list)
    • Mount the code at run time; the container doesn't have debugging tools
    • The container provides the environment
    • Compile inside the container
    • Run the debugger inside the container
    • The user image has everything, pre-compiled
  • A lightweight container for developers, who then compile themselves; users get the fully compiled image
  • Debugging in GR and ST is very different
  • The computing center will have a cache of Docker images
  • Every project will be asked which Docker images it wants on the batch nodes
  • Plan for managing cached images; work out allocations for collaborations
  • What is the cost of using Docker?
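
To make the developer/user split concrete, a minimal sketch of the mount-code-at-run-time pattern described above. The image names, the scons build step, and the paths are assumptions for illustration.

      # Developer workflow: a lightweight image provides only the environment;
      # the source tree is mounted from the host, then built and debugged inside.
      docker run -it --rm \
          -v $HOME/ScienceTools:/workdir \
          fermi/st-devel:latest \
          /bin/bash -c "cd /workdir && scons"

      # User workflow: one self-contained image with everything pre-compiled.
      docker run --rm fermi/sciencetools:latest gtlike --help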

 

Pipeline:

 

  • Someone is needed who can be shown the pipeline code and trained to do the heavy lifting when it comes to kicking the pipeline
  • Docker containers for something like the batch system may cause some problems, since for something like the L1 pipeline a number of images would need to be launched simultaneously
  •  Would the size of the software cause problems with deployment?
  • We would need a system that restricts loading images onto the batch farm, to prevent collisions/problems
  • There is probably a precedent for this; however, Matt has no experience deploying at this scale
  • An image size of ~1 GB is best; a few GB is manageable for production
  • The IT department is supportive of Docker at SLAC. There is one machine with RHEL7
  • Lyon is a much larger computing center; likely they will adopt Docker first
    • There is now full support for Docker at Lyon (Fred)

      Joris: Lyon wants to use Singularity because they have security issues with UGE + Docker.
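
For reference, the same Docker image could be converted for use with Singularity roughly as follows; the image name is a made-up example.

      # Build a Singularity image from the (hypothetical) Docker image, then
      # run a tool without needing a privileged Docker daemon on the batch node.
      singularity build sciencetools.sif docker://fermi/sciencetools:latest
      singularity exec sciencetools.sif gtlike --help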


Infrastructure:

  • The last purchase went into the dev cluster
    • many nodes at RHEL6; upgrade them to RHEL7 and run Docker on them
    • Still figuring out how NFS/AFS will be sorted out with RHEL7. GPFS?
  • It's good to come up with a plan because of the security implications if NFS is underneath
    • Use Docker correctly (UID issues with security; see the sketch after this list)
  • SLAC will give us a few nodes for testing Docker. A fallback is to install on user machines (Brian)
    • AFS with RHEL6 Docker
    • Files can be read if they are world readable
    • NFS is the hardest
  • Timeline for RHEL7: 12 months? 2018? (Matt)
    • RHEL7 support is dodgy
    • Configuration is the hard part
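
A sketch of the UID point above: run the container as the submitting user rather than root, so that NFS/AFS permission checks apply to the real user. The mount path and image name are illustrative.

      # Run as the calling user's UID/GID instead of root; world-readable files
      # on the mounted filesystem work, anything stricter depends on UID mapping.
      docker run --rm \
          --user $(id -u):$(id -g) \
          -v /nfs/farm/g/glast:/data:ro \
          fermi/sciencetools:latest \
          ls /data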

Use cases

  • GlastRelease - frozen on RHEL6
    • L1 processing, reprocessing in SLAC batch farm
      • RHEL6 container on a RHEL7 host
      • do FT1, FT2 files go to xrootd? (Warren)
      • separate containers for L1? Maybe not an issue if we can preload batch nodes. We're guessing ~5 GB image.
    • Simulations at Lyon, SLAC, GRID
      • maybe the same as for SLAC - check with Samuel for details
    • Developers & Users
      • maybe separate versions for debug symbols and source for developers. Could be on-demand production of this version.
    • Release Manager or manual builds
  • Science Tools
      • Caching big files (e.g. templates) in the container image. Need a strategy with SCS for how container images are cached.

2. Software dependencies:

  • GPL_TOOLS (staging and logging)
  • REPRO common tools
  • REPRO task scripts
  • GlastRelease
  • ScienceTools
  • GLAST_EXT software (e.g., python, root)
  • Ftools (KIPAC installation)
  • ROOT skimmer
  • FITS skimmer (possibly unnecessary?)
  • evtClassDefs
  • calibration and alignment files
  • diffuse models
  • xroot tools
  • xroot /glast/Scratch space
  • /scratch on local batch machines
  • data catalog query (FT2 file and current version of FITS files)
  • mySQL DB (calibration and alignment)
  • Fermi astrotools (could probably eliminate)

3. Questions

 


