Intro
Notes about conda that might be useful for other groups moving to conda packaging. Assumes knowledge of conda.
This page: https://confluence.slac.stanford.edu/display/PSDMInternal/Conda+Details
Good reading:
https://www.continuum.io/blog/developer-blog/whats-old-and-new-conda-build
http://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html
Prod & Dev Installations
Updating conda is scary:
- conda has been rapidly changing
- recent: features for multi-user installs
However, conda is itself a python program with a number of dependencies.
- several times a conda update has rendered conda inoperable
- updates have deleted entire installations
- getting better at repairing a broken install (requests==12 and conda)
Defense:
- two installations: prod and dev
- completely separate conda installations
- user-facing is prod
- use dev to update conda and test new packages
- then build in prod
Test conda
- part of https://github.com/slaclab/anarel-test
- integration testing for complete environments
- separate from the per-package testing that one does in each recipe's test section
Still, if conda breaks the dev installation, it is a pain.
Todo: automate creating a new conda installation:
- clone our current root environment
- update conda, test (from a user account)
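A minimal sketch of what that automation might run (the clone name is hypothetical):
    conda create --name root-test --clone root   # snapshot the current root environment
    conda update --name root-test conda          # try the conda update in the clone first
then run the anarel-test integration tests from a user account before touching prod.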
RHEL5 RHEL6 RHEL7 Installations
All together, I have six conda installations:
rhel5/rhel6/rhel7,
in both dev and prod.
This is cumbersome,
but no more so than what we did with RPMs before conda.
PIP vs Conda Packaging
Part of anaconda's success is the fact that pip works.
However, another part is that conda packaging is better,
as discussed in the conda blog linked above:
it handles dependency tracking better,
and makes building a big software stack with numpy, etc., more robust.
For the production, multi-user conda environments that we maintain,
I prefer no pip-installed anything in them. Issues:
- you might end up with two copies of something,
for example numpy: one pulled in as a pip package dependency,
another from a conda package
- things like cloning an environment might not work,
since conda can't track pip packages as well as its own
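One way to spot the duplicate problem (a sketch; conda list of this era marks pip-installed packages with <pip> in the build column):
    conda list | grep numpy
    # numpy    1.11.1    py27_0
    # numpy    1.12.0    <pip>     <-- second copy pulled in by a pip install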
Build Recipes
Roughly the recipes for what you get from the defaults channel:
https://github.com/conda/conda-recipes
conda-forge:
https://github.com/conda-forge/feedstocks/tree/master/feedstocks
for example boost:
https://github.com/conda-forge/boost-feedstock/tree/master/recipe
My recipes https://github.com/slaclab/anarel-manage/tree/master/recipes
Tips
For Python:
- your build.sh will probably run the package's setup.py, or use pip to install from a wheel file
- don't use easy_install: .pth files that trigger site.py sys.path manipulation don't work well; in fact, conda-build may now check to make sure you don't do that
- likewise, no egg files; see the conda google group thread on egg packages
- when you use pip, you don't want it to trigger installing package dependencies like numpy
  - use pip install --no-deps
  - like in this example (tensorflow build script, which uses pip to install the package from a wheel file on the internet)
- with setup.py, you want
  - python setup.py install --single-version-externally-managed --record=record.txt
  - like in this example (keras: no build.sh, an install simple enough to do from meta.yaml)
  - this puts the package in site-packages in a straightforward way, i.e., you get a subdirectory of site-packages that is your python module to import
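Minimal sketches of the two patterns above (the wheel filename is hypothetical):
    # build.sh: install from a wheel without pulling in dependencies
    pip install --no-deps tensorflow-1.0.0-cp27-none-linux_x86_64.whl
and, for a package simple enough to install from meta.yaml with no build.sh:
    build:
      script: python setup.py install --single-version-externally-managed --record=record.txt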
Boost Issues/Cross Platform Issues
boost gets complicated:
you compile it under gcc v4,
and it might create symbol names different from what gcc v7 creates.
If you deliver a package that contains libraries for developers with v4 names
and someone compiles with v7,
they might get undefined symbols;
the v7 developer may need special switches to compile against your boost header files.
Organizing
right now I am organizing recipes
https://github.com/slaclab/anarel-manage/tree/master/recipes
into
- system: packages only to be built at LCLS, which depend on system things like the LSF installed at SLAC, plus packages that depend on these
- external: generally available packages that we nevertheless need to build ourselves, say our own version of guppy
- psana: things related to our code
Would it be better to keep them all in one place?
Templatizing
Until recently, I had many recipe directories for different versions and build variants, i.e.,
hdf5-1.8.15-prod
hdf5-1.8.17-dbg
However, this gets cumbersome.
See the whats-old-and-new-conda-build post linked above for more on templatizing recipes.
{% set version = "1.10.5" %}
...
source:
  fn: openmpi-{{ version }}
COULD DO: develop a release system that creates all build variants and packages from one recipe file, using conda-build's jinja2 features to expand the template.
Conda Recipes
Where to put them?
They really should be part of the software.
Great example: https://github.com/paulscherrerinstitute/cbf/tree/master/conda-recipe
(I had to figure out an issue with installing cbf: how they built against numpy had changed.)
But many recipes are for external packages, or wrappers.
Psana Recipe/Home Grown Install
Many examples do things like
make PREFIX=path/to/conda/prefix
or python setup.py install with the prefix set to the conda location, etc.
However, psana installs into its own directory structure for the RPM release system,
so it is not so simple to use the existing install target.
I did the following:
- the recipe's build.sh calls a new target I added to the SConsTools build system (home grown, like a Makefile; it implements the build logic for scons)
- that code copies built files to conda environment locations, i.e.,
  - arch/x86-gccxx/bin/* --> $CONDA_PREFIX/bin/*
  - details: conda_install.py
- note: you can create new subdirectories in the conda environment, i.e., data/web subdirs;
  you don't have to put everything in bin/lib/include
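A simplified sketch of that copy step (the real logic is in conda_install.py; the arch name is illustrative, and $PREFIX is what conda-build calls the environment prefix inside build.sh):
    # build.sh, after running the SConsTools conda target
    mkdir -p $PREFIX/bin $PREFIX/lib
    cp -r arch/x86_64-rhel7-gcc48-opt/bin/* $PREFIX/bin/
    cp -r arch/x86_64-rhel7-gcc48-opt/lib/* $PREFIX/lib/
    mkdir -p $PREFIX/data/web   # new subdirectories are fine too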
Packages
Moving them Around
conda-build defaults to putting output in a subdir of the central install.
This makes it easy to upload to anaconda cloud.
I put it in a local file channel instead,
and later in anaconda channels (like https://anaconda.org/lcls-rhel7).
This is part of why I wrote ana-rel-admin. I do
ana-rel-admin --cmd pkg-build --recipe path/to/recipe
and it does a number of the plumbing steps:
- maintains package log files
- copies the package to the file channel
- updates the index of the file channel
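Roughly the plumbing being wrapped (a sketch; the package filename and channel directory are illustrative):
    conda build path/to/recipe 2>&1 | tee logs/mypkg-build.log
    cp /path/to/install/conda-bld/linux-64/mypkg-1.0-101.tar.bz2 /reg/g/psdm/sw/conda/channels/external-rhel7/linux-64/
    conda index /reg/g/psdm/sw/conda/channels/external-rhel7/linux-64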
Channels
/reg/g/psdm/sw/conda/channels
then separate dirs for system, external, psana
What we need to Build
openmpi to get LSF support
hdf5 to get parallel support
h5py to get parallel support
mpi4py and tables, which depend on the above
Package Precedence
Channel order ensures that hdf5 comes from our channel.
Channel precedence is a newer conda feature;
it used to be decided by version number,
so people used high build numbers to force conda to use their package
(I still set the build number to 101 in some recipes).
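Setting the build number in a recipe's meta.yaml looks like:
    build:
      number: 101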
Package Naming
I have been doing things like
hdf5-1.8.17-openmpi_101.tar.bz2
hdf5-1.8.17-openmpi_dbg_101.tar.bz2
to name build variants. However, then when you do
conda install hdf5
conda doesn't know which to pick. A user has to be specific, something like
conda install hdf5=1.8.17=openmpi_dbg_101
You can use features. Example for gpu:
- create a gpu feature package
- track the gpu feature in other packages' meta.yaml
- this is done for tensorflow
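A sketch of the feature mechanism (package names are illustrative): a small feature package turns the feature on via track_features, and each gpu build variant lists the feature in its own meta.yaml:
    # meta.yaml of the feature package
    package:
      name: gpu
      version: 1.0
    build:
      track_features:
        - gpu

    # meta.yaml of a package built for the gpu variant
    build:
      features:
        - gpu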
You can also put the variant in the package name, i.e.,
hdf5_dbg==1.8.17
and use the recipe's conflicts mechanism to make sure it is not installed alongside hdf5.
tensorflow now does this for their gpu build, i.e., you can pip install either
tensorflow
tensorflow_gpu
Names like this could be a better solution for multi-host compiling.
Build Matrix
You'll find packages like matplotlib built many times, i.e., against many python versions and many numpy versions.
conda-build has options to specify the python/numpy versions at the command line.
With psana, I'm building against very specific versions; for running, I'm trying to make it more flexible, but I haven't tested with different versions.
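For example, to build one recipe against several python/numpy combinations:
    conda build path/to/recipe --python 2.7 --numpy 1.11
    conda build path/to/recipe --python 3.5 --numpy 1.12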
Conda Build Debugging
Building openmpi has taken 40 minutes,
which is very frustrating when it fails.
In a recent failure, my test section had a typo: instead of (in the openmpi meta.yaml)
test:
  commands:
    - command -v ompi_info
I had
test:
  commands:
    - command -v ompi_infox
It was really not clear from my log what happened (maybe my ana-rel-admin wrapper gets in the way):
TEST START: openmpi-2.0.1-lsf_verbs_1
Deleting work directory, /reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/openmpi-2_1484019610790/work/openmpi-2.0.1
The following packages will be downloaded:
    package                    |            build
    ---------------------------|-----------------
    openmpi-2.0.1              |      lsf_verbs_1         3.0 MB  local
The following NEW packages will be INSTALLED:
    openmpi: 2.0.1-lsf_verbs_1 local
TESTS FAILED: openmpi-2.0.1-lsf_verbs_1
Run conda-build -h: there is a --test flag, but it doesn't work at first.
conda-build's work is in /path/to/install/conda-bld.
If you poke around, you'll see a subdir broken.
You can
- copy broken/openmpi-2.0.1-lsf_verbs_1.tar.bz2 into linux-64
- cd linux-64
- conda index # update the index
- conda-build --test path/to/recipe
When it runs, you'll see things like
(manage) (psreldev) psel701: /reg/g/psdm/sw/conda/manage/recipes/system $ conda build -t openmpi-2
TEST START: openmpi-2.0.1-lsf_verbs_1
Deleting work directory, /reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/openmpi-2_1484022219868/work
updating index in: /reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/linux-64
updating index in: /reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/noarch
The following packages will be downloaded:
    package                    |            build
    ---------------------------|-----------------
    openmpi-2.0.1              |      lsf_verbs_1         3.0 MB  file:///reg/g/psdm/sw/conda/channels/system-rhel7
The following NEW packages will be INSTALLED:
    openmpi: 2.0.1-lsf_verbs_1 file:///reg/g/psdm/sw/conda/channels/system-rhel7
+ source /reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/bin/activate /reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/openmpi-2_1484022219868/_t_env
+ /bin/bash -x -e /reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/openmpi-2_1484022219868/test_tmp/run_test.sh
+ ls
conda_test_runner.sh helloworld.c helloworld.cxx run_test.py run_test.sh
+ pwd
/reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/openmpi-2_1484022219868/test_tmp
+ command -v ompi_infox
TESTS FAILED: openmpi-2.0.1-lsf_verbs_1
This is better. Some things we see:
- conda-build creates these temporary test (and build) environments in uniquely named directories
- you can activate an environment by giving a path, not just a name
- conda-build creates a file run_test.sh from your meta.yaml that you can rerun
- it is hard to find where these things are
When you get into debugging the build step, you'll find the build happening in long directory names, with a placeholder repeated as much as possible; this is to leave as much room as possible in the binaries for the subsequent rewriting of rpaths.
Debugging conda builds is cumbersome, but easier than debugging rpm builds.
Package Dependencies
- specify the exact package version to build against
- use a more lenient package description to run against
- using a patch
- specifying relocation
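A meta.yaml sketch covering those items (the versions, patch filename, and source are illustrative):
    source:
      fn: mypkg-1.0.tar.gz        # illustrative
      patches:
        - fix-build.patch         # using a patch
    build:
      detect_binary_files_with_prefix: true   # one of the relocation controls
    requirements:
      build:
        - numpy 1.11.*            # exact version to build against
      run:
        - numpy >=1.11            # more lenient spec to run against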
Release Management
conda has the tools for release management:
- python/C++ packaging
- flexible package dependency tracking
- environments, both per-user and centrally installed multi-user
But I would not call it a release system. Things I've implemented:
- local file channels for our packages
- keeping logs of builds
- tool to rebuild all local packages
- configuration for what our releases look like, i.e.,
- ana-1.0.x ...
- ana-1.0.x-py3 ..
- what packages are in these releases
- parameterize build/releases based on rhel5/6 etc
- more flexible creation of conda environments
- scheme to keep track of 'current' release
- keep old environments built to easily allow people to go back
Managing Package Upgrades - Don't Break Users' Code
Central Install
Following the way things have been done in the past, we maintain central multi-user installations of the analysis software.
At any time there will be conda environments with names:
ana-1.0.1
ana-1.0.2
etc.
When we upgrade a package, we create a new environment first.
User Management
As an alternative, we could just provide channels with packages.
Users would make their own conda environments, or just update the root environment.
Ideally, users just do
conda update psana
to keep up with our software (if they want/need to)
or they create an environment from a file we provide, something like:
conda create --name ana --file environment-file
This has a lot of advantages, but is maybe not quite seamless enough to put on users.
One issue: psana depends on environment variables like SIT_ROOT.
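The environment file itself could be produced with something like:
    conda list --explicit > environment-file
which users then recreate from with conda create --name ana --file environment-file.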
ana-rel-admin
The repo: https://github.com/slaclab/anarel-manage
It has all my release management code;
the tool's help output gives all the commands. The anarel-manage repo has configuration like:
- all the packages we make use of
- what packages comprise our releases
Then implements tools to
- build a package with logging, move it around (described above)
- build all the packages (from config file)
- index all the channels
- update all six .condarc's from a template
- build releases
- automate the building/testing of all ana releases, i.e:
- variants (p27/py3/gpu)
- hosts(rhel5/rhel6/rhel7)
- prod/dev installs
- integration testing from user account
- This is the kind of thing that is probably better done in buildbot or travis, but I rolled my own.
Building an Environment
This should be as simple as cloning the old one and updating some packages, or maintaining an environment file that we build from. However:
- I ran into issues with environment files and local channels
- rules for channel precedence changed during development
- I had a lot of trouble getting the precise package versions/builds that I wanted
- there were issues with numpy: mkl packages were clobbering certain files
- several packages had bugs with permissions: the admin account could read them but users could not
Environment Defense
To deal with those issues, I build the environment in stages; the config file is here (a yaml file):
https://github.com/slaclab/anarel-manage/blob/master/config/anarel.yaml
Building in stages lets me
- checkpoint environments, to make sure numpy is still working after a new package goes in
- get finer control over which channels packages come from
Problem:
- later stages can undo previous ones
- could solve by pinning (see the sketch below)
- or by doing one stage
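A pinning sketch (conda of this era honors a pinned file in an environment's conda-meta; the spec is illustrative):
    echo "numpy 1.11.*" >> /path/to/env/conda-meta/pinned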
Also of interest, maybe: the .condarc
https://github.com/slaclab/anarel-manage/blob/master/config/condarc_template
The key thing is getting our channels in the search path.
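The essential shape of it (a sketch; the channel names are illustrative, see the template for the real thing):
    channels:
      - file:///reg/g/psdm/sw/conda/channels/system-rhel7
      - file:///reg/g/psdm/sw/conda/channels/external-rhel7
      - defaults
      - conda-forge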
Staying up to Date with External Packages
The packages in the rpm release system are old.
Updating them is cumbersome.
There is no integration testing of them.
Backing out a new version that broke something is expensive.
In the anarel.yaml, you'll notice I specify 'latest' for many packages, while others are pinned.
Whenever we build a new set of environments, we are not quite sure which versions we'll pick up, or whether they will all work together.
For instance, see the pydateutil line in anarel.yaml:
it wasn't until my testing step that I found that 2.6.0 of pydateutil broke a unit test in pandas.
It was relatively easy to clean up;
see testing below.
Channel Precedence
I'd like to use conda-forge for everything, but numpy from conda-forge doesn't work on rhel5 (see the conda-forge numpy issue on github).
However, numpy works from defaults, so my condarc's list defaults first, then conda-forge.
Testing
here is the repo
https://github.com/slaclab/anarel-test
- test from a user account, not the admin account
- verify I can run things like
  - psana -h
- for many packages:
  - import the package
  - check that I can load many .so files
- use nose to run tests for many packages:
  - scipy
  - numpy
  - pandas
- run psana tests
- run separate tests for
  - h5py
  - conda
  - hdf5
  - mpi4py
  - openmpi
- the conda tests are very important
Features
- uses paramiko to run processes on rhel5/rhel6/rhel7 machines
- prompts for credentials for a tester account
- builds environments in dev first, and tests
- then builds environments in prod, and tests again
- includes the integration tests mentioned above
- reports on package updates
- the tester writes files to a world-writable central directory
- emails when done
HTML Report
https://pswww.slac.stanford.edu/user/psreldev/builds/auto-1.1.0/
Development Environments
One Instance of Package
- make a new environment
- python:
  - conda develop path/to/pkg/with/setup.py
  - creates pkg.pth in the conda env's site-packages
  - can uninstall with the same tool
  - similar to pip install --editable
- for C++, not sure:
  - make install into the conda environment?
  - or soft links from conda to your build?
Two Instances of Package
- Python
  - use PYTHONPATH to get the development package
  - this is what we do for psana python packages
  - more awkward for external packages like scikit-beam (you need to install it somewhere)
  - important: a clean conda env with conda packages only;
    no pip/easy_install that did site.py sys.path manipulation
- C++
  - use LD_LIBRARY_PATH and PATH
  - issue: RUNPATH vs. RPATH
    - RPATH doesn't look at LD_LIBRARY_PATH, RUNPATH does
    - everything in conda that uses your C++ needs to be built with RUNPATH
    - -Wl,--enable-new-dtags (see the sketch after this list)
  - issue: manipulating PATH takes some care
    - activating and deactivating conda environments manipulates PATH
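A quick way to select and then verify RUNPATH (the library name and rpath are illustrative):
    gcc -shared -fPIC -o libmine.so mine.c -Wl,-rpath,$PREFIX/lib -Wl,--enable-new-dtags
    readelf -d libmine.so | grep -E 'RPATH|RUNPATH'
    # a RUNPATH entry means LD_LIBRARY_PATH can still override the baked-in path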