We should organise the folders in the catalogue to avoid total chaos. Here is a straw proposal for a top level layout:

 MC

   DC2
   Service Challenge
      Gleam
      Obssim
   Pass5 (?)
   BeamTest
   User
   Test

 Data

   ETE
     1
     2 etc
   Flight
   I&T
   Test

Questions

  • How do we specify data reprocessings? Or MC reprocessings, for that matter?

4 Comments

  1. From Tom:

    Seems fine to me.  The current scheme organizes data under "Service Challenge" by task name, something which makes sense
    to me but perhaps not to Joe User.  Can there not also be aliases?  If so, then we could have a parallel scheme, one
    which was meaningful to pipeline task operators, and another by project and/or physics category intended for the
    collaboration at large. For example, there will likely be many more ServiceChallenge sponsored tasks than most users
    will ever be interested in, e.g., old v9r20 data.  I would like to retain the task name as part of the organization if
    possible - and am not certain the "Gleam" - "Obssim" separation would be needed in that view.

       - Tom
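
The parallel scheme Tom describes could be sketched as a lookup table that maps user-facing alias paths onto the task-organized paths. This is only a minimal sketch under stated assumptions: the table, the `resolve` helper, and every folder and task name in it are hypothetical, and it does not reflect the catalogue's actual interface.

```python
# Hypothetical alias table: user-facing path -> task-organized path.
# All folder and task names here are invented for illustration.
ALIASES = {
    "/MC/Physics/AllGamma": "/MC/Service Challenge/allGamma-task",
}


def resolve(path, aliases=ALIASES):
    """Return the task-organized path for `path`.

    The longest aliased prefix wins; paths with no aliased prefix
    are returned unchanged.
    """
    parts = path.rstrip("/").split("/")
    # Try the longest prefix first, then progressively shorter ones.
    for end in range(len(parts), 1, -1):
        prefix = "/".join(parts[:end])
        if prefix in aliases:
            return "/".join([aliases[prefix]] + parts[end:])
    return path
```

With this sketch, a user could browse by physics category while the task operator's folder stays the single place where the files actually live.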

  2.  From Anders:

    Hmmm .... I'm not sure we want reprocessings at the (near) top
    level. It would be more suitable at the run/file level.

    In general I prefer to minimize the number of top levels, but
    maybe we should split LCI and Physics, i.e. Flight/LCI, Flight/Physics (and
    maybe add a Flight/LEO)?

    anders

  3. I have a few comments on the folders in MC. I think that Service Challenge may not be a good identifier (because it will cut across a few versions of the CT analysis). Maybe something like:

     MC
       LAT (in orbit)
          pass3/DC2
               AllGamma
               Bkg
               Sky
                    Gleam
                    Obssim
               ...
          pass4/handoff
               AllGamma
               Bkg
               Sky
                    Gleam
                    Obssim
               ...
          pass5/SC/final?
               AllGamma
               Bkg
               Sky
                    Gleam
                    Obssim
               ...
          BeamTest
          I&T (if we still have some)
          User
          Test

    The disadvantage of this is that a given task will move from pass4 to pass5 when we reprocess with the new classification trees. It would also leave obsolete branches (pass3/DC2 and pass4/handoff) at the top level, which would not be great. However, it has the advantage of clearly collecting together datasets which have a high chance of playing well together (Service Challenge would otherwise be a mix of pass4 and pass5). Pass3, pass4 and pass5 each correspond to a labeled set of IRFs, so they make sense for collecting obssim runs too.

    I don't have any suggestions at the moment for the DATA tree. 

  4. In general I think we should try to arrange folders from a user-centric point of view: making it obvious to the user which top-level folder holds the data they are looking for is probably more important than making it obvious to the data producer. So items like Service Challenge/DC2 and Gleam/Obssim all seem to make sense. (Or perhaps something like full simulation/fast simulation – will end users necessarily know what Gleam is?) For MC running, task name is an obvious component of the path name, although I would think it should come below the Gleam/Obssim level (and thus may be repeated). For real data I doubt that task name will be an important part of the path.

    We are developing tools which should make it easy to move folders around, especially at the top level, so the organization does not have to be fixed for all time. So right now ETEn is important and should live near the top, but in a year's time it can be moved further down. It is not possible right now to have the same file/dataset appear in two different places in the tree, but I have been discussing with Dan how to add this, so I think it will appear later.
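
The ordering suggested in this comment, with the task name below the Gleam/Obssim level, could be sketched as a small path-building helper. This is only an illustration: `mc_path` and the task name in the usage note are hypothetical, and the folder names are taken from the proposal above.

```python
# Hypothetical helper: build an MC folder path with the simulation type
# placed above the task name, so a task name may repeat under each type.
SIM_TYPES = ("Gleam", "Obssim")


def mc_path(campaign, sim_type, task):
    """Return /MC/<campaign>/<sim_type>/<task> as a string."""
    if sim_type not in SIM_TYPES:
        raise ValueError(f"unknown simulation type: {sim_type!r}")
    return "/".join(["", "MC", campaign, sim_type, task])
```

For example, `mc_path("DC2", "Gleam", "allGamma")` and `mc_path("DC2", "Obssim", "allGamma")` give two sibling folders for the same (invented) task name, one per simulation type.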