Infrastructure Maintenance

RHEL5 issues:

 Virtualization: 

     
  • What needs virtualization?
      • Halfpipe sounds like a candidate.
        • No, it runs on RHEL6, but it is unlikely to move beyond that. So yes, virtualize at RHEL6.
      •  GlastRelease is also stuck on RHEL6
      • A couple of APIs need Qt and use the commercial version
        • Release Manager uses the free version of Qt
        • Unsure why the commercial version is used.
        • Might be worth exploring a move to the free version
    •  Need to have a discussion about FastCopy, as it requires RHEL5.
    •  ISOC ops boxes are mostly under RHEL5. Demonstrated that the tools can be run under RHEL6.
    •  Backup ISOC is no longer supported.

    What kind of virtualization?  VM or container?

    GlastRelease: 

    • GlastRelease needs virtualization
      • RHEL6 is the last release that we have the personnel to support
      • A few people run GlastRelease (developers): nice use case for Docker. Getting GlastRelease to run on your laptop is painful.
      • GlastRelease carries around Geant4
    • Is there a distinction between Users and Developers for GlastRelease? 
      • No

     

    Science Tools:

     

    • Focus with ScienceTools is just ease of distribution
    • Would it be useful to distribute the tools in VMs? Containers? Both?
    • Are there external dependencies (like xrootd) that would cause problems with virtualization if the backend changes?
    • We need an automated build system for ST: Release Manager vs. manual builds

    • GR uses xrootd; ST does not (Eric)
    • Use of virtualization is for convenience - which is most useful thing to do? (Richard)

     

      • No dependence on NFS/AFS if the container is built right. xrootd is stable for data
      • Getting files/libraries and also output data.
      • Container helps with diffuse model
        • on nodes not on NFS
        • on nodes there's low overhead. 
        • Caching image on all of the nodes. 
        • Fermi ST image will have the diffuse model in it. 

    Release Manager: Release manager doesn't talk to Oracle - but it does talk to a database. Not user friendly. 

    • For the SLAC farm: Docker containers for GlastRelease. Need a Docker registry
    • Docker containers are the right solution for the batch farm (Brian)
    • Use their system to run a RHEL6 container while the batch host is RHEL7.

      • Carefully build container (nice with xrootd)
    • Need to find out from Warren whether FT1, FT2 files are included (Richard)

    What systems need what kinds of containers?

    • Need Samuel to discuss simulations at Lyon. (He is sick today.)
    •  What is different for developers/users? 
    •  Same image for all the GR uses. 
    • Don't want to pull a 3 GB image just to pull FT1; GR is 3x bigger. Just have 1 image at the moment.
    •  One giant image - good command line interface installed in that image. 
    •  Images built such that the top looks the same between GR and ST. Keep same image. 
    •  Separate builds for debugging purposes? 
    •  GlastRelease is frozen, ST is constantly evolving. Debugging GR is not a problem, debugging ST is important
    • Giacomo
      • Mount code at runtime, container doesn't have debugging tools. 
      • Container provides environment. 
      • Compile inside the container. 
      • run debugger inside container. 
      • User image has everything - compiled. 
    • Lightweight container for developers, who can then compile. Users get the fully compiled image.
    • Debugging in GR and ST is very different
    • The computing center will have a cache of Docker images.
    • Every project will say which Docker images it wants on the batch nodes.
    • Plan for managing cached images. Work out allocations for collaborations.
    • Cost of using docker? 
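
    Sketch of the developer workflow Giacomo described (mount code at runtime, compile inside the container). This is only an illustration, not an agreed setup: the image name `example/st-dev:latest`, the `/work` mount point, and the `make` build command are all placeholders. It only assembles a `docker run` command line using the standard `-v`, `-w`, and `--rm` options.

```python
import shlex
from pathlib import Path


def dev_container_command(source_dir, image, build_command):
    """Build a `docker run` argv that mounts host source code into a container.

    The image name and build command passed in are placeholders chosen by
    the caller; nothing here is specific to GlastRelease or ScienceTools.
    """
    source = Path(source_dir).resolve()
    return [
        "docker", "run", "--rm",
        # Bind-mount the checked-out code so host edits are visible inside.
        "-v", f"{source}:/work",
        # Start in the mounted directory.
        "-w", "/work",
        image,
        # Run the build through a shell in the container's environment.
        "sh", "-c", build_command,
    ]


if __name__ == "__main__":
    # Hypothetical image and build command, printed rather than executed.
    argv = dev_container_command(".", "example/st-dev:latest", "make")
    print(shlex.join(argv))
```

    The container supplies the environment (compilers, externals) while the code stays on the host, which matches the "lightweight container for developers" idea above. A user image would instead ship everything precompiled.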

     

    Pipeline:

     

    • Needs someone that he could show the pipeline code and train to do heavy lifting when it comes to kicking the pipeline
    • Docker containers for something like the batch system may cause some problems, since for something like the L1 pipeline, a number of images would need to be launched simultaneously
    •  Would size of the software cause problems with deployment?
    • We would need a system where you restrict loading images to the batch farm to prevent collisions/problems
    • There is probably a precedent for this, however, Matt has no experience deploying on this scale 
    • File size of ~1 GB is best; a few GB is manageable for production.
    • IT dept supportive of docker@SLAC. There is 1 machine with RHEL7
    • Lyon is a much larger computing center - likely they will upgrade to Docker first
      • Now full support for Docker at Lyon (Fred)
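
    Back-of-the-envelope sketch for the image size question above. It assumes each batch node that does not already hold the image pulls it once when a burst of jobs starts (Docker keeps pulled images in a local store on the node). The node counts are made-up numbers for illustration, not measurements from SLAC or Lyon.

```python
def burst_pull_gb(image_size_gb, nodes_used, nodes_with_image):
    """Estimate data pulled from the registry when a burst of jobs starts.

    Assumes one pull per node that lacks the image; nodes that already
    have it cached pull nothing.
    """
    cold_nodes = max(0, nodes_used - nodes_with_image)
    return image_size_gb * cold_nodes


if __name__ == "__main__":
    # Hypothetical burst over 100 nodes, for a ~1 GB and a ~3 GB image.
    for size_gb in (1.0, 3.0):
        cold = burst_pull_gb(size_gb, nodes_used=100, nodes_with_image=0)
        warm = burst_pull_gb(size_gb, nodes_used=100, nodes_with_image=90)
        print(f"{size_gb:.0f} GB image: {cold:.0f} GB cold, {warm:.0f} GB with 90 nodes cached")
```

    The point is only that registry load scales with image size times the number of cold nodes, so pre-caching images on the batch nodes matters more as images grow.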

    Infrastructure:

    • Last purchase went into dev cluster
      • Many nodes at RHEL6; upgrading to RHEL7 and doing Docker with this
      • Still figuring out how to get NFS/AFS sorted out with RHEL7. GPFS?
    • It's good to come up with a plan because of security implications if NFS is underneath.
      • Use the right Docker (UID issues w/ security)
    • SLAC will give us a few nodes for testing Docker. Fallback is to install on user machines. (Brian)
      • AFS on RHEL6 docker
      • read files if world readable. 
      • NFS is hardest. 
    • Timeline for RHEL7, 12mo? 2018? (Matt)
      • RHEL7 support is dodgy. 
      • Configuration stuff is the hard part
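
    On the UID point above, a minimal sketch of one common precaution: run the container as the submitting user rather than root, and mount shared data read-only. It uses the standard `docker run` options `--user` and `-v ...:ro`. The image name and data path are placeholders, and this does not by itself settle how NFS/AFS permissions or tokens would be handled.

```python
import os


def batch_container_command(image, command, data_dir, uid=None, gid=None):
    """Build a `docker run` argv that runs as a non-root user.

    uid/gid default to the calling user's IDs (Unix only). The data
    directory is mounted read-only at the same path inside the container.
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        # Run the process inside the container as this UID:GID, not root.
        "--user", f"{uid}:{gid}",
        # Read-only bind mount of the shared data area.
        "-v", f"{data_dir}:{data_dir}:ro",
        image,
        *command,
    ]


if __name__ == "__main__":
    # Hypothetical image and path, printed rather than executed.
    print(batch_container_command("example/gr:latest", ["ls", "/data"], "/data"))
```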



    Flight Software:

    •  Julie: No path to having anyone other than SLAC supporting flight software

    ...