Note:  An "appliance" is any prepackaged virtual machine image designed for a specific purpose.  For example, a MySQL appliance is an OS with MySQL already installed, so you can start it and begin using the database right away; an e-mail appliance is a preconfigured e-mail server, and so on.

Initially these are just some notes to begin with; things will get more fleshed out as we work more on this.

Pros/Cons of running ST in a VM appliance

Pros:

Cons:

Virtualization Software

We'll be using VirtualBox from Oracle as the virtualization software.  Why:

VirtualBox must be installed on your system in order to run the VM.  It can be obtained here: http://www.virtualbox.org.  As far as I know (at least on Linux), you need root access, or an administrator must install the software for you.  I don't believe it can be installed and run by a normal user, unlike VMware Player, which can be (this should be checked).

One Possible Usage Scenario

This is the usage scenario I (Tom S) see as the most likely/useful/plausible.  Others may have other ideas and suggestions; please add them.

The ScienceTools Appliance is designed to be a bare-bones installation of the OS and a specific version of the Science Tools.  Each version of the appliance has only a single version of the Science Tools installed, along with the minimum OS capabilities needed to run them.  Thus upgrading to a new version of the Science Tools is as simple as downloading and running a new version of the appliance.

To use the Science Tools installed on the VM, a user would do the following (need to check that all of these are actually manual processes):

  1. Download and install the specific version of the appliance desired
  2. Configure the allocated memory, processors, etc. if something different from the defaults is desired or needed.
  3. Configure the shared data directory
  4. Start the VM
  5. Log in as the science tool user
  6. Mount the shared data directory
  7. Work using the installed tools from the VM plus any scripts and data in the shared data directory.
  8. Log off and shut down the VM when done.
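
The host-side parts of the steps above could be scripted rather than done through the VirtualBox GUI.  What follows is only a minimal sketch: it builds the VBoxManage command lines for steps 2, 3, 4 and 8 without running them.  The VM name, shared folder name, host path and default memory/CPU values are made up for illustration, and step 1 is left out because it depends on how the appliance is packaged.

```python
import shlex


def appliance_commands(vm_name, shared_dir, memory_mb=2048, cpus=2,
                       share_name="stdata"):
    """Return the host-side VBoxManage commands for steps 2, 3, 4 and 8.

    All argument values are hypothetical; nothing is executed here.
    """
    return [
        # Step 2: set the memory (in MB) and CPU count if the defaults do not fit.
        ["VBoxManage", "modifyvm", vm_name,
         "--memory", str(memory_mb), "--cpus", str(cpus)],
        # Step 3: expose a host directory to the guest as a shared folder.
        ["VBoxManage", "sharedfolder", "add", vm_name,
         "--name", share_name, "--hostpath", shared_dir],
        # Step 4: start the VM.
        ["VBoxManage", "startvm", vm_name],
        # Step 8: ask the guest to shut down via the ACPI power button.
        ["VBoxManage", "controlvm", vm_name, "acpipowerbutton"],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for cmd in appliance_commands("ST-appliance", "/home/user/st_data"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Step 6 happens inside the guest.  With the VirtualBox Guest Additions installed in the appliance, it would typically be a "mount -t vboxsf" of the shared folder name onto a mount point, run as root; whether that can be automated at login is one of the things still to be checked.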

If the user wanted to work across multiple machines, the virtual machine appliance could be installed on each one, and only the relevant scripts and data would have to be moved between the systems.

Demo virtual machine

Version 1

I've uploaded a demo virtual machine to u35.  The Linux path to the file is /nfs/farm/g/glast/u35/VM/SL5_32-bit-09-24-00.tar.bz2.  It can be accessed via the web at ftp://ftp-glast.slac.stanford.edu/glast.u35/VM/SL5_32-bit-09-24-00.tar.bz2 .  The file is 1.4 GB compressed and expands to about 4.8 GB when uncompressed.  The virtual machine has the following characteristics:

Notes:

Version 2

I've made a new version and uploaded it to u35.  The Linux path to the file is /nfs/farm/g/glast/u35/VM/ST_SL5_64b_09-24-00.tar.bzip2.  It can be accessed via the web at ftp://ftp-glast.slac.stanford.edu/glast.u35/VM/ST_SL5_64b_09-24-00.tar.bzip2 .  The file is 1.3 GB compressed and expands to about 4.6 GB when uncompressed.  The virtual machine has the following characteristics:

Notes:

Try it and provide feedback.
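
Since the demo archives expand to several GB, it may be worth checking free disk space before unpacking.  Here is a minimal sketch of one way to do that; the function name and arguments are illustrative and not part of any existing tool.  The bzip2 mode is given explicitly so that both the .tar.bz2 and .tar.bzip2 file names are handled the same way.

```python
import shutil
import tarfile
from pathlib import Path


def unpack_appliance(archive, dest, needed_bytes):
    """Extract a bzip2-compressed tar archive into dest if enough space is free.

    needed_bytes is the caller's estimate of the uncompressed size.
    Returns the sorted names of the top-level entries now in dest.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    free = shutil.disk_usage(dest).free
    if free < needed_bytes:
        raise OSError(f"need {needed_bytes} bytes in {dest}, only {free} free")
    # "r:bz2" forces bzip2 decoding regardless of the file extension.
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    return sorted(p.name for p in dest.iterdir())
```

For the Version 2 archive, something like unpack_appliance("ST_SL5_64b_09-24-00.tar.bzip2", "vm", 5 * 10**9) would leave a little headroom over the quoted 4.6 GB.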

Configurations

What OS configurations do we want/need?  Personally, I (Tom S) propose that we provide the VM for only a single OS, preferably the most recent Red Hat OS we support (currently RHEL 5).  However, I don't have access to actual RHEL install media, so I'd have to use a clone (namely Scientific Linux); the two are effectively the same thing.

Another related issue is whether we provide both 32-bit and 64-bit versions.  In theory, VirtualBox can run 64-bit guest OSes on a 32-bit host OS as long as the underlying architecture is 64-bit, but it takes a bit more setup on the part of the end user, and there is bound to be a performance hit.  The real question is whether we need to supply a 32-bit appliance at all.  There are fewer and fewer 32-bit machines out there, although there are still a lot of 32-bit versions of the OSes running on 64-bit architectures.
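
To help a user decide which appliance their machine can run, the CPU flags on a Linux x86 host can be inspected.  The sketch below assumes the usual /proc/cpuinfo flag names: "lm" (long mode, i.e. a 64-bit capable CPU) and "vmx" or "svm" (Intel or AMD hardware virtualization, which as far as I know VirtualBox needs for 64-bit guests).  It does not detect whether virtualization has been disabled in the BIOS, and the function name is made up.

```python
def cpu_capabilities(cpuinfo_text):
    """Report 64-bit and hardware-virtualization support from cpuinfo text.

    Only the "flags" lines are examined; this is specific to Linux on x86.
    """
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return {
        "64bit_cpu": "lm" in flags,
        "hw_virt": "vmx" in flags or "svm" in flags,
    }
```

On a Linux host this would be called as cpu_capabilities(open("/proc/cpuinfo").read()).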

The next issue is which versions of the ScienceTools we provide appliances for.  Obviously we will want to provide them for the Release versions.  Is there any reason to provide HEAD and LATEST versions?  Personally, I (Tom S) don't think so, but others may disagree.

If we supply only a single OS, only Release builds, and the two architecture options (32-bit and 64-bit), this can easily be done by hand.  Providing more OSes, and especially providing HEAD and LATEST builds, would require some investment of time to figure out how to create the appliances automatically as part of the build process.

Building the ST appliance

The basic plan would be to have a virtual machine with a bare-bones OS installation.  This would get updated as needed for OS patches but for the most part would be fairly static.

When a new build of ST is ready to be deployed, we would make a copy of the base OS VM, install the new ST version in the clone, update anything that needs to be changed, and deploy this updated version.

This would give us a consistent starting point for all the ST installations and also save time by not having to reinstall the OS each time we do a new distribution.

Things to be done to create the release image:

1) Clone base OS image

2) Install ST version

3) Add bin directory to path

4) Add CALDB variable to point to the correct place

5) Install extFiles directory in GLAST_EXT
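
Steps 3 to 5 amount to setting a few environment variables for the Science Tools user, so they could be generated from the install locations instead of being edited by hand each time.  A minimal sketch follows, assuming the settings go into a shell profile fragment (for example a file under /etc/profile.d on the guest) and that GLAST_EXT is an environment variable naming the directory that holds extFiles.  The function name and all paths passed to it are hypothetical.

```python
def profile_script(st_root, caldb_dir, ext_dir):
    """Return the text of a shell profile fragment for the ST environment.

    st_root, caldb_dir and ext_dir are install locations chosen by
    whoever builds the appliance; none of them is fixed by this sketch.
    """
    lines = [
        "# Science Tools environment (generated when building the appliance)",
        f'export PATH="{st_root}/bin:$PATH"',   # step 3: add bin directory to path
        f'export CALDB="{caldb_dir}"',          # step 4: point CALDB at the right place
        f'export GLAST_EXT="{ext_dir}"',        # step 5: location of extFiles
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text into the cloned image would then be one scripted step of the build, alongside the ST install itself.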