Wednesday

Action Items

  • After Brian's discussion with Andy H., it sounds like we have an easy path to move to xroot and eliminate the archiver. We need to contact Wilko or Lance to discuss how

Morning Status

Elizabeth and Joe E will be meeting with Rob over the next 3 afternoons for discussions.

Elizabeth, Don, and friends will continue poring over Steve Tether's Confluence pages to extract a list of software and their locations.

Brian is busily updating repoman.  Later today, he could update the Docker container to use the latest repoman and build GR or ST.

Joanne will be looking at updating the existing GH repo to support SCons builds.

Arash has been in touch with Joanne concerning the dev RM DB.  The current DB contains 50 GB.  Arash has asked if there is anything that can be pruned.  Joanne is looking into that.

Tom G and Nicola had a fruitful meeting covering the reprocessing yesterday.

Alex, Joe A, and Giacomo are continuing to work on conda builds and testing.

A new version of repoman has been released and installed at SLAC on RH6 under $GLAST_EXT/repoman

repoman checkout --in-place ScienceTools-scons repoman_test_2

Developer Workflow

Long discussion concerning the developer workflow, with many ideas bandied about, including a monolithic repo (SuperRepo).  Below is the whiteboard Brian drew up, which outlined development along branches in multiple subpackages (not all of which involved the same developer or purpose), PRs, triggering builds via GitHub hooks, and creation of a new meta package (ST or GR) release.

Brian showed us how we could perform releases via the Jenkins web interface.

After much discussion it appears we agreed to continue down the path we discussed in Feb and earlier this week, where we will have subpackages in GitHub, and we will set up Jenkins to handle builds, reusing bits of RM as needed.

We have agreed that the build log parser logic will be pulled out of RM and added to Jenkins, so that we can continue to pull out the most pertinent parts of the extensive build log, which includes output for all packages, not just the one subpackage a developer might be interested in.
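As a rough illustration of the log extraction idea, here is a minimal Python sketch. It assumes, hypothetically, that every build-log line starts with the subpackage name in square brackets; the real RM log layout is not described in these notes, and the function name and keywords are illustrative only.

```python
import re

# Hypothetical line format: "[Likelihood] error: ...". The actual RM/SCons
# log layout is not described in these notes.
LINE_RE = re.compile(r"^\[(?P<pkg>[^\]]+)\]\s*(?P<msg>.*)$")


def pertinent_lines(log_text, package, keywords=("error", "warning")):
    """Return the messages for one subpackage that mention any keyword."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match or match.group("pkg") != package:
            continue  # skip unparsable lines and other subpackages
        msg = match.group("msg")
        if any(word in msg.lower() for word in keywords):
            hits.append(msg)
    return hits
```

A developer interested in a single subpackage would call `pertinent_lines(log_text, "Likelihood")` and ignore the output for everything else.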

Alternate workflow proposal: GlastTools SuperRepo

An alternate proposal was discussed and did not receive consensus. In this model all package repositories would be collapsed into 1 git repository containing the union of ScienceTools and GlastRelease packages. The two products would be considered different build targets for the build system. SCons would continue to be responsible for building the requisite packages for each target, and differentiating between external dependency versions. This would greatly simplify the Continuous Integration model for Jenkins.

Developers raised objections that moving to this model would be too disruptive to their current workflow, one based on CVS tags shared between distinct repositories.

 

GlastRelease

Matt asked who is going to be taking over GlastRelease.  That is apparently Tom Stephens. Heather showed Tom the version that L1 is currently using.

We briefly discussed handling of GR externals.  Heather will pursue using conda, leveraging the extensive work already done by Giacomo and Alex.  Giacomo showed Heather their repo; she'll fork it and make some additions for GR, and we'll go from there.

Tuesday

Action Items from Tuesday

  • Brian to update repoman to provide an init step that detects whether a user has SSH credentials available for GitHub and reverts to HTTP otherwise.
  • Brian may consider making repoman a conda package
  • Richard will contact the Computing Center to determine the fate of astore and recommended mitigation plans for data access.
  • Steve may contact Andrew May about "store"
  • Don Horner will become the new resident archiver and xroot backup expert and work to craft a replacement
  • Elizabeth and company will be writing up a list of ISOC software and their locations in Confluence.
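The SSH-detection action item could look something like the following Python sketch. It is not repoman's implementation: the heuristic (any `id_*` private key file counts as SSH credentials) and both function names are assumptions made for illustration. The HTTP base is the one used in the HTTP checkout example in these notes.

```python
from pathlib import Path

# HTTP base taken from the HTTP checkout example in these notes.
HTTP_BASE = "https://github.com/fermi-lat"


def has_ssh_key(ssh_dir):
    """Heuristic (assumption): any id_* file that is not a .pub file is a key."""
    ssh_dir = Path(ssh_dir)
    if not ssh_dir.is_dir():
        return False
    return any(p.is_file() and p.suffix != ".pub" for p in ssh_dir.glob("id_*"))


def choose_remote_base(ssh_dir=None):
    """Return None to keep the SSH default, or the HTTP base as a fallback."""
    if ssh_dir is None:
        ssh_dir = Path.home() / ".ssh"
    return None if has_ssh_key(ssh_dir) else HTTP_BASE
```

A key file on disk does not prove that GitHub accepts it, so a real init step would probably also need to test the connection.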

Alex's update from yesterday's afternoon session with Giacomo and Joe A. concerning conda builds for Mac.
Alex was working on building the external dependencies for the Mac.
Giacomo worked on getting Travis-CI builds working.
Joe A worked on getting the tests and environment variable setup fixed.
Once the fermi-lat GH area is set up, they will point to the new repo and use that so that we are all sharing the same code base. 

Tom & Nicola meeting to discuss reprocessing.

Warren, Don, Joe E, Tom S, Elizabeth meeting to discuss calibrations.

Random side conversations concerning the upcoming FSSC ST release and validation.  Matt reminds the FSSC folks of the need for pyLikelihood tests beyond making sure that it imports correctly into Python.  Further validation and unit test upgrade discussion awaits Elizabeth's involvement.
Jeremy is on the hook to convert the FSSC doc to Markdown or rST. 

Repoman

Installed repoman on RHEL6-64 under $GLAST_EXT/repoman/miniconda3 - use by pointing your path to $GLAST_EXT/repoman/miniconda3/bin

pip install --pre -U fermi-repoman

 

GitHub access via SSH is the default connection method for repoman. If using HTTP instead, your repoman checkout command must be modified, as in the HTTP example below.

SSH GitHub access
repoman checkout GlastRelease-scons repoman_test
repoman checkout ScienceTools-scons repoman_test
HTTP GitHub
repoman --remote-base https://github.com/fermi-lat checkout GlastRelease-scons repoman_test
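For scripting the two forms above, a small Python helper can assemble the argument list. This is a sketch only: it uses just the `checkout` subcommand and `--remote-base` option shown in these notes, the helper name is hypothetical, and `target` stands for the second positional argument, whose meaning the notes do not spell out.

```python
def repoman_checkout_argv(package, target, remote_base=None):
    """Build the argument list for a `repoman checkout` call.

    With remote_base=None the SSH default applies; passing an HTTPS base adds
    the --remote-base option shown in the HTTP example.
    """
    argv = ["repoman"]
    if remote_base is not None:
        argv += ["--remote-base", remote_base]
    argv += ["checkout", package, target]
    return argv
```

The resulting list can be handed to `subprocess.run(argv, check=True)` on a machine where repoman is on the path.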

 

Repoman checks out tags of subpackages based on the packageList.txt. For development, one would use standard git commands to update to master, create a branch, make changes, and push into the repo.  Joanne will be using that mechanism to make her changes to the repo to allow SCons building.

A new tag must be applied to all of celestialSources.

Joanne reminded us that for container packages like celestialSources and irfs, the SConscript files reside in each of the sub-sub-packages, and we may need to modify the RM build mechanism to accommodate that structure.

Brian has added in an "auto-bump" mechanism to implement tagging for version, minor, and patches, similar to what we had in place for stag.
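To make the "auto-bump" idea concrete, here is a minimal Python sketch. It is not repoman's code. It assumes, hypothetically, that tags are three hyphen-separated integers such as `11-05-03`, and it treats the "version" level as the leading field; neither detail is confirmed by these notes.

```python
def bump(tag, level):
    """Bump one field of a hypothetical 'NN-NN-NN' tag and reset lower fields."""
    major, minor, patch = (int(field) for field in tag.split("-"))
    if level == "major":
        major, minor, patch = major + 1, 0, 0
    elif level == "minor":
        minor, patch = minor + 1, 0
    elif level == "patch":
        patch += 1
    else:
        raise ValueError(f"unknown bump level: {level!r}")
    # Two-digit zero padding is part of the assumed tag format.
    return f"{major:02d}-{minor:02d}-{patch:02d}"
```

Resetting the lower fields on a major or minor bump follows common versioning practice and may differ from what stag did.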

The Steve Tether Show

ISOC software summary - Steve's summary from Feb 2017

astore reference

Archiver

While in the past it performed many functions, the archiver's remaining task is to move Raw data + FastCopy data to tape via "astore". Due to losing the ability to obtain an AFS token, much of the archiver's functionality has been lost, including the ability to clean up data on wain031 that has already been backed up.  Steve now does this manually.  The erase occurs when both files "astoredone" and "xrootdone" have been created, meaning that both backups have been performed, and the age of the raw data is sufficient.
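The erase condition can be written down as a short Python sketch. The marker file names come from these notes. The location of the markers (assumed here to sit in the data directory), the age threshold, and the function name are all assumptions for illustration, not a description of the existing Perl archiver.

```python
import os

MARKERS = ("astoredone", "xrootdone")  # marker file names from the notes


def ready_to_erase(data_dir, data_age_days, min_age_days):
    """True when both backup markers exist and the raw data is old enough.

    Assumes the markers live directly in data_dir; min_age_days is a
    hypothetical threshold, since the notes give no value.
    """
    both_backed_up = all(
        os.path.isfile(os.path.join(data_dir, name)) for name in MARKERS
    )
    return both_backed_up and data_age_days >= min_age_days
```

Passing the data age in explicitly avoids relying on directory timestamps, which change when the marker files themselves are written.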

on wain031:   Note that there is a replacement disk, and there is a slow copy of this area occurring now.
isoc-flight
  ---> u23
         ----> archive
                 ------> L0
                 ------->fastCopy 

Steve's opinion is that we should replace the archiver rather than attempt to fix it up.

There are dual backups of the LAT Raw data: to xrootd (via L0xrootd) and to "astore", which is an interface to HPSS. "astore" also handles FastCopy archives which, while small (about 1/50th of all the data in xrootd), include packages we receive, scientific data, housekeeping, mission planning, and data that the ISOC sends.

"astore" is deprecated; the Computing Center is planning to discontinue its use at some undetermined time in the future. Brian suggests we may have 2 years, but that needs to be confirmed by the Computing Center.  Andrew May is the current "astore" maintainer. Richard also mentions that Lance is very helpful.

  1. Deal with old data
  2. Provide alternate "astore" capability for future data.

The idea was floated to migrate all the old data, but that would likely require hundreds of hours of I/O.

Richard confirms that we have plenty of tape available.

Steve owns the L0 task.

Rob asks about the DTR task contained in the fastCopy archive. Steve indicates that no changes are required for the DTR task if astore is replaced.  "astore" only handles backups.

The CHS package handles unpacked LAT raw data in lsf and ldd formats.  This is used heavily by the half-pipe, which extracts evt files and passes them to L1.

Issues on some RH6 machines have been noted.

Don Horner was named as the lucky person tasked to learn about the current archiver and craft a replacement.

Monday

Attendees: Joe E, Joe A, Tom S, Regina, Giacomo, Jeremy P, Rob C, Elizabeth F, Don H, Alex, Joanne, Richard, Tom G, Heather, Brian, Matt W, Eric (afternoon), Jim C

Action Items from Monday

  • We will migrate to GitHub without the use of git tools like subrepo as a first step.  Brian will introduce a fresh copy of ST and GR.
  • Joanne will test introducing the --rpath flag in the SCons builds.
  • At Jim's request, Heather will contact Johan about his needs for ST installations and development.
  • Giacomo and Alex will pursue using conda-forge for their conda builds.
  • Giacomo and Alex will contact the maintainers of the current conda-forge cfitsio channel.

 

Topics overview from Richard

Package Manager:  RM vs Repoman, and what is our developer methodology for the next 10 years?  Get the biggest bang for the buck for our efforts.

Need to merge the efforts of the FSSC and SLAC.  The conda distributions may be part of the release management process as one of the build products.

May have limited conversations about containers this week.

Mac OS support - how do we manage that?

 

Start mind-meld with Warren about L1 processing.  Rob will be talking to Joe E and Elizabeth about mission planning and more!

Steve T and the plans for the ISOC - he's hoping not to move past RHEL6.  The archiver, written in Perl, is a complete black box.  Troy Porter and company may look into a rewrite.

Review the skimmer and what we are willing (or not willing) to do to support it.  Command-line and web interfaces.  TSkim and L1Processing interfaces.  Might contact David Chamont.

Tom G and Nicola are our resident reprocessing experts.

Brian presents a StrawMan Release Workflow

         

  • commit/tag 
    • Release  = commit + tag
    • Travis or Jenkins driven 
      • sets ENV vars
  • INIT WS
    • Use repoman to init workspace
      • workspace only works currently for ST and GR due to need for packagelist.  Could use packageLib.py to determine dependencies and requirements to allow per package builds.
    • If including extlibs into containers, every part of this setup would be part of the containers.
  • Code Generation and Doxygen
    • Code Generation (SWIG) but that is done in SCons
    • Really just Doxygen for now
  • Compile
    • In Jenkins or Travis this would do a loop over available systems to build and variants
      • CentOS6, Mac, CentOS7. Where does this info come from? It is currently in the RM DB. Brian would like to put it in the repository, to avoid putting the DB credentials into Travis. Could have an additional matrix for the compilers to use.
    • SCons builds the unit tests too
  • TEST
  • Package
    • package env that the build used, including dependencies and externals, which can ultimately be pulled into a container
    • SCons currently creates the tarball and the setup scripts
  • Validate
    • We don't currently have any system tests for ST. We do have some for GR, which are not currently run as part of RM; it would be nice to reactivate these for releases.
  • Deploy
    • Moving tarballs from workspace into NFS to CVMFS
    • Containers (Singularity) will link to CVMFS and include an xrootd installation
    • Could deploy to conda too
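The Compile step's loop over systems, variants, and compilers amounts to a build matrix. A minimal Python sketch follows. The system names are the ones listed above; the variant and compiler names in the usage example are hypothetical placeholders, and nothing here reflects actual Jenkins or Travis configuration.

```python
from itertools import product


def build_matrix(systems, variants, compilers=("default",)):
    """Return one build configuration per system/variant/compiler combination."""
    return [
        {"system": system, "variant": variant, "compiler": compiler}
        for system, variant, compiler in product(systems, variants, compilers)
    ]


if __name__ == "__main__":
    # Variant names are illustrative, not taken from the RM DB.
    for config in build_matrix(["CentOS6", "CentOS7", "Mac"], ["debug", "optimized"]):
        print(config)
```

Keeping the lists in a file in the repository, as Brian suggests, would let a CI job generate this matrix without access to the RM DB.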

Giacomo reminds us that the conda build system includes most of the above.  Are we going to consider moving everything to the conda system, or running it in parallel?

Richard:  who are the recipients of the builds? Processing, developers, collaboration users...

conda - is agnostic to what you use for the build (hmake, scons, etc)

Giacomo mentions that conda builds require certain things, such as using rpath when compiling.  Joanne volunteers to try using rpath for a build.

Brian:  conda could be another variant in the COMPILE box.

Brian: The release process creates the tags, either by the user or via something like Jenkins. You provide the package name and the tagged version that you want.

unit test - results

build logs - Jenkins stores the output.  Those familiar with the RM wonder about extracting the errors for developers to more easily view build problems, as we already do for the RMII web pages.

Matt: Are we completely replacing the RM?  
Brian: replacement of RM ultimately, but re-use RM for the beginning phases.

Joanne:  RM is tag driven

Alex: Travis is commit driven
$130 for the unlimited time frame for an indeterminate period of time
Modularization will also help.

Brian: WS is semi-persistent. When doing a repoman checkout, if only one subpackage changes, then that's the only one checked out, and SCons is smart enough to build only what is necessary.
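The semi-persistent workspace behaviour boils down to comparing the tags already in the workspace with the tags requested. Below is a minimal Python sketch of that comparison. The notes do not describe how repoman does this internally, so the function name, data layout, and tag values are hypothetical.

```python
def packages_to_update(workspace_tags, wanted_tags):
    """Return the subpackages whose wanted tag differs from the workspace.

    Both arguments map subpackage name -> tag. A package missing from the
    workspace counts as needing a checkout.
    """
    return sorted(
        package
        for package, tag in wanted_tags.items()
        if workspace_tags.get(package) != tag
    )
```

Only the packages returned would be checked out again; the build tool's own dependency tracking then decides what to rebuild.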

 

Monday Afternoon

Discussion about git tools such as subrepo, submodule, etc. occurred during the morning (Someone could insert their recollections).  We discussed it further in the afternoon, where Joanne mentioned that subrepo handles some operations nicely, particularly when branching is utilized.  We ultimately agreed that subrepo could be added later if we find it useful.  That way, setting up GitHub could proceed immediately.  The next step will be to use repoman to check out the code, test a build using SCons, and then feed the RM to start up a build.

Brian has recently gone through the ST and GR dependencies and created 65-70 repos.  This includes some redundant copies of repos for packages that are shared between GR and ST.  He will push the repos but notes that he needs to prune large files.

Conda Discussion with Giacomo and Alex

https://github.com/giacomov/conda-fermi-externals/blob/master/notes/How%20conda%20works.ipynb

Giacomo went over the finer points of setting up conda builds for the externals.  See the detailed notes above. Matt asked about using conda-forge instead, which may be a better long-term solution as they handle the builds for us.  It was noted that conda-forge may be using a later version of gcc and that there is a cfitsio channel already in place. Matt suggests contacting the cfitsio developers and trying out conda-forge.

Testing the conda binaries of ST is a work in progress, with many issues revolving around the initialization script that sets up the environment.

Discussed potential time limit issues with building ST via Travis-CI.  Without externals, Joe A estimates it takes 30 minutes which is under the Travis time limit (~40min?)

Mention of Travis led to a discussion of Mac OS support.  Questions swirled about how to support Mac developers. Eric notes that he has all but given up on Mac support. Jeremy wondered if one could use the conda-build mechanism to support development on the Mac.  It sounds like the build.sh could be reused locally, with tweaks to remove the installation step. Matt suggests that introducing additional compiler versions in the CI may help identify, before the LAT makes a release, many of the problems that the FSSC discovers in the Mac builds. Jim suggests adding clang as one of the options in the CI.

Alex and Giacomo ran off to work on supporting Mac in the conda builds.

 

 
