Proposed behavior

  • Users will specify cuts in gtselect by giving the IRF name, e.g., P7REP_SOURCE_V15, instead of giving evclass:
    gtselect irfs=P7REP_SOURCE_V15
    
  • The bit mask to apply for this choice of irfs is found from CALDB/data/glast/lat/bcf/irfs_index.fits
  • The irfs choice is written to the EVENTS header of the FT1 file written by gtselect as DSS keywords:
    DSTYP3: IRF_VERSION
    DSUNI3: DIMENSIONLESS
    DSVAL3: P7REP_SOURCE_V15
    
    Writing this information as DSS keywords ensures that it will be propagated to downstream data products such as counts maps, gtdiffrsp output files, etc.
  • For all downstream tools, irfs(=INDEF) is a hidden parameter. If unspecified, then the tools will determine the irfs to use from the DSS keywords of the input files.
  • The user can override the irfs for any tool by specifying the irfs option at the command line:
    gtdiffrsp irfs=P7REP_SOURCE_V10
    
    If this choice differs from the one given in the DSS keywords of the input files, then a warning is issued.
  • These changes are in ScienceTools-LATEST-1-3965 (the current LATEST). The only par file that has been changed is gtselect.par in which irfs=INDEF is added as a hidden parameter. In order to enable these changes fully, we need only make irfs(=INDEF) a hidden parameter in all downstream tools and make irfs a required parameter in gtselect. The current LATEST should be compatible with existing scripts.
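
The default-and-override behavior described above can be sketched in a few lines of Python. This is only an illustration, not the Science Tools implementation: the header is modeled as a plain dict, the function names `irfs_from_dss` and `resolve_irfs` are hypothetical, and raising an error when irfs=INDEF finds no DSS entry is an assumption (the proposal does not say what should happen in that case).

```python
import re
import warnings


def irfs_from_dss(header):
    """Return the DSVALn value whose DSTYPn is IRF_VERSION, or None if absent."""
    for key, value in header.items():
        match = re.fullmatch(r"DSTYP(\d+)", key)
        if match and str(value).strip() == "IRF_VERSION":
            return header.get("DSVAL" + match.group(1))
    return None


def resolve_irfs(header, irfs="INDEF"):
    """Pick the irfs for a downstream tool (hypothetical helper)."""
    recorded = irfs_from_dss(header)
    if irfs == "INDEF":
        # Hidden-parameter default: follow what the input file recorded.
        if recorded is None:
            # Assumption: the proposal does not define this case.
            raise ValueError("irfs=INDEF but no IRF_VERSION DSS keyword found")
        return recorded
    # Explicit override on the command line: warn if it disagrees with the file.
    if recorded is not None and recorded != irfs:
        warnings.warn(
            f"requested irfs {irfs} differs from {recorded} in the input file"
        )
    return irfs
```

With a header containing the three DSS keywords shown earlier, `resolve_irfs(header)` returns P7REP_SOURCE_V15, while `resolve_irfs(header, "P7REP_SOURCE_V10")` returns the override and emits a warning.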

To Be Resolved

  • How should the irfs be determined initially in gtselect?
    • User supplies full irfs name, e.g., irfs=P7REP_SOURCE_V15. The bit mask is found by look up in irfs_index.fits.
    • User supplies bit mask, e.g., evclass=2, and the most recent set of irfs is inferred from caldb.indx and irfs_index.fits.
    • User is not required to give anything, and the most recent "source" class irfs and event selection are used.

13 Comments

  1. Hi Jim,

    I met with Mike C. and Dave. and discussed this proposal with them.  Here's what they propose. Please let me know if this would work.  It goes without saying that I don't know the inner workings of the Science Tools so I don't know how easy/hard any of this is.

    • Users will specify cuts in gtselect by specifying the event class (evclass=2).  This behavior will remain unchanged from the current behavior. CALDB interaction is not necessary at this point in the analysis.  gtselect writes the event class selection into the DSS keywords as it currently does:
    DSTYP1  = 'BIT_MASK(EVENT_CLASS,4,P7V6)'
    DSUNI1  = 'DIMENSIONLESS'
    DSVAL1  = '1:1'
    
    • When the user gets to a downstream tool that requires an IRF name and IRF location (e.g. gtltcube), the user can either input a specific IRF name (e.g. P7REP_SOURCE_V10) or specify CALDB.  The default will be CALDB.   If CALDB is specified, the tool would
      • extract the event class from the DSTYP1 keyword value and
      • look up in the irfs_index.fits file in CALDB the appropriate IRF to use. 
    • That tool will then write the irf choice into the header if it doesn't already exist.
    DSTYP3: IRF_VERSION
    DSUNI3: DIMENSIONLESS
    DSVAL3: P7REP_SOURCE_V15
    
    • If the irf keywords exist in the input file and do not match the ones returned by CALDB (or one selected by the user), the tool will issue a warning about irf mismatch.  
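
The lookup step in this plan could look roughly like the Python sketch below. It assumes the DSTYP value has the form shown above, BIT_MASK(column, number, pass tag), and it stands in for irfs_index.fits with a caller-supplied dict keyed by (pass tag, number). That dict, its example entry, and the function names are hypothetical; the real layout of irfs_index.fits is not described here.

```python
import re

# Matches values such as BIT_MASK(EVENT_CLASS,4,P7V6).
_BIT_MASK = re.compile(r"BIT_MASK\(\s*(\w+)\s*,\s*(\d+)\s*,\s*(\w+)\s*\)")


def parse_bit_mask(dstyp):
    """Split a BIT_MASK DSTYP value into (column, number, pass_tag), or None."""
    match = _BIT_MASK.fullmatch(dstyp.strip())
    if match is None:
        return None
    column, number, pass_tag = match.groups()
    return column, int(number), pass_tag


def lookup_irfs(header, table):
    """Find the BIT_MASK DSS entry in a header dict and map it to an IRF name.

    `table` is a hypothetical stand-in for irfs_index.fits.
    """
    for key, value in header.items():
        if not key.startswith("DSTYP"):
            continue
        parsed = parse_bit_mask(str(value))
        if parsed is not None:
            _, number, pass_tag = parsed
            if (pass_tag, number) not in table:
                raise LookupError(f"no irfs listed for {pass_tag}, {number}")
            return table[(pass_tag, number)]
    raise ValueError("no BIT_MASK DSS keyword found in the header")
```

For example, `lookup_irfs({"DSTYP1": "BIT_MASK(EVENT_CLASS,4,P7V6)"}, {("P7V6", 4): "EXAMPLE_IRFS"})` returns the placeholder name EXAMPLE_IRFS.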

    Reasoning:

    • The behavior of gtselect is unchanged and remains agnostic about the irfs (ie. it's only a selection tool).
    • We don't have to add an irf parameter to any tool that doesn't already have one, just make an option be 'CALDB' and make it the default.
    • The irf choice is written into the header at the point of use and can be referenced by tools that need to look at that.
    • The current behavior of downstream tools (gtltcube etc) requires the user to specify the new IRF name if the IRFs are updated, which requires analysis scripts to be re-written.  Allowing the user to specify irf=CALDB means that the irfs can be updated and the analysis scripts re-run without the need for re-writing the scripts.
    1. I can implement this plan, but it has a couple drawbacks in my view compared to my proposal:

      • The user has to know what irfs to use regardless. Specifying the irfs at the gtselect step means that they don't also need to know the event class bit that is associated with that irfs choice. The argument that gtselect is only a "selection tool" sort of misses the point. If it were a generally applicable FTOOL, I would agree, but it is intended as the starting point for all LAT analyses, and no downstream analysis can be performed without the irfs being specified, so why not do so at the outset?
      • I would prefer that the tools look at the existing IRF choice in the input file's DSS keywords and use that as the default rather than looking up in CALDB and retrieving the most recent version (and printing a warning). I think consistency should be more valued than just getting the most recent version of something, especially if it is only warnings that are to be issued, which are likely to be ignored if someone is running a script in batch. This is why I would prefer that irfs=INDEF be the default value, since it doesn't imply that the CALDB will definitely be inspected to find the irfs to be used.
      1. Just for clarification...

        • with respect to point one, can you clarify what the user needs to know about the IRFs?  I think that we want the user to be aware about the source class but the user doesn't need to know what pass (i.e. P7, P7Rep or P8) or what version (V15 etc.) is being used.  A good user will know all of the details but shouldn't need to.  
        • in point two, would you have three options then for the irf function?  'INDEF', 'CALDB' or '<irf_name>'?  Then 'INDEF' would look up in the header, 'CALDB' would look in caldb and '<irf_name>' would just use the specified IRF?
          • The user will specify in the paper presenting the analysis which irfs they used, e.g., "P7REP_SOURCE_V15". That completely describes the pass version, event bit selection (which may be pass version specific) and irfs version. So I don't think there is any way to avoid requiring the user to be aware of all three of those items if they intend to make their work public. As I noted above, if you wanted to have the evclass=<bit pos> irfs=INDEF option for gtselect in order to maintain backwards compatibility, then the most recent irfs for that event selection and pass ver would be used, written to the DSS keywords and output to the user.
          • I would think just two options are needed: irfs=INDEF would use the existing selection in the input files' DSS keywords and irfs=<irf_name> would be used to make explicit that specific irfs are desired. If you want to include an irfs=CALDB option to mean that the most recent set of irfs for that event selection and pass version will be used, that's fine with me, but I still worry about users setting irfs=CALDB reflexively and then not seeing warnings about conflicts if new irfs are made available.
            • Thanks for clarifying the first point.  We're ok with the way this would be handled.  The user should be aware of what IRFs are being used (and be able to find out by looking at the header) but shouldn't have to input it if they don't want to.
            • Having an 'irfs=CALDB' option is desired so that there is a coherent method across missions (ie. this is how other missions do this).  I agree that there is a danger there about users not seeing warnings.  Should that warning be made a fatal error instead?
  2. A few comments from a third party:

    1. By far the most important point is traceability. It is a serious problem that in the current version of the Science Tools it is impossible to know from the file which IRFs version was used to generate an exposure map, for example. This must be solved ABSOLUTELY, and in an explicit way (so that a human user will know at first glance which IRFs was used just from looking at the header). The automatic system that is being discussed is a nice to have, but it is NOT essential.
    2. I agree with Jim that consistency is paramount. If you have used V14 in gtexpcube2 and gtsrcmaps (which define the diffuse sources for gtlike), you don't want to use V15 for point sources (at gtlike level), just because it has suddenly become current.
    3. I do not think it is critical that the user should define the IRFs version explicitly in the first place. It is OK if that is defined implicitly by the Pass, event class and Science Tools version. I support having an automatic system whereby the user does not have to care about this.
    4. I agree with Jeremy that it is more logical to specify only the class in gtselect rather than a full IRF name, since that selection has nothing to do with the version (ie it will be the same for V14 and V15). It is a problem that this class number is not constant (for example at Pass8 Source class is evclass=4, up from evclass=2 in Pass7), but an independent one.
    5. gtltcube is a bad example. That tool does not use the IRFs at all!

    From those "first principles", I favor something like this:

    1. Leave evclass and no irfs parameter in gtselect (ie do not change gtselect at all).
    2. For all Science Tools that currently have an irfs parameter, allow three options: <irf name>, 'CALDB' or 'FOLLOW' (I don't like 'INDEF').
    3. The default would be 'FOLLOW', which uses the IRFs previously defined in the input file headers. We must foresee the case where several headers do not agree; I suggest ending in error in that case, or requesting user input. If there are no previously defined IRFs, also end in error or request user input.
    4. 'CALDB' would read the appropriate version automatically, issuing warnings in case of disagreement with IRFS in input files. Note that at present there is no EVCLASS keyword (neither in the data subspace nor elsewhere) in the files served by the Astro server, so that would not work on those files.
    5. <irf name> would work as before, issuing warnings in case of disagreement with IRFS in input files.
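
A minimal Python sketch of the three-option scheme in points 2 to 5, for illustration only. The function name `choose_irfs` and its arguments are hypothetical: `recorded` is the list of IRF names found in the input file headers, and `caldb_latest` is whatever a CALDB lookup would return, supplied by the caller because the lookup itself is outside the scope of the sketch. It ends in error for the two 'FOLLOW' failure cases rather than requesting user input.

```python
import warnings


def choose_irfs(irfs, recorded, caldb_latest=None):
    """Resolve irfs given 'FOLLOW', 'CALDB', or an explicit IRF name."""
    distinct = sorted(set(recorded))
    if irfs == "FOLLOW":
        # Default: reuse what the input files say, and refuse to guess.
        if not distinct:
            raise ValueError("no IRFs recorded in the input files")
        if len(distinct) > 1:
            raise ValueError("input files disagree: " + ", ".join(distinct))
        return distinct[0]
    if irfs == "CALDB":
        if caldb_latest is None:
            raise ValueError("irfs=CALDB requires a CALDB lookup result")
        chosen = caldb_latest
    else:
        chosen = irfs  # explicit IRF name
    # Both 'CALDB' and an explicit name only warn on disagreement.
    for name in distinct:
        if name != chosen:
            warnings.warn(f"irfs {chosen} disagrees with {name} in input files")
    return chosen
```

Keeping 'FOLLOW' strict while the other two modes only warn mirrors the split suggested above: the default favors consistency, and a mismatch can occur only when the user explicitly asks for something else.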
    1. Thanks for the comments Jean and sorry about the gtltcube example.  I think your proposed method is very similar to the one we are moving towards.  I don't really like 'INDEF' either but this is the behavior that most ftools use.  In the tools if you input 'INDEF' it uses the default value for that parameter (either the default in the file or in a par file).  

      1. (Oops, I think I 'removed' your comment by accident Jim.  It was the one where you were asking for a concrete proposal within a week).

        1. Here it is again:

          Jeremy,

          Can you put together a concrete proposal? As I said, I will implement whatever the FSSC wants, but I need a finalized description of the changes before I can proceed. If that can be available within the next week (or sooner) that would be great.

        2. We are happy with our initial reply above (comment-144507568).  We don't need to implement the 'INDEF' parameter.  Is this good enough or would you like a more formal document?  Happy either way.  

          1. Can you provide the text of the warning message(s) to be issued in the case of a CALDB or user-requested mismatch with the irfs specified in the input file DSS keywords?

            I think the above is sufficient (with the addition of the warning text). If I have questions, I will post and let you know.

            1. How about this:

              IRF version mismatch detected.  IRF version in HEADER: <irf_name>, IRF version provided (by CALDB/on command line): <irf_name>
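
As a sketch, the message above could be produced by a small Python helper. The function name and the `source` argument are hypothetical, and reading "(by CALDB/on command line)" as two alternative phrasings, one per origin of the provided IRFs, is an interpretation of the template.

```python
def mismatch_warning(header_irfs, provided_irfs, source):
    """Format the mismatch warning; source is 'CALDB' or 'command line'."""
    phrases = {"CALDB": "by CALDB", "command line": "on command line"}
    if source not in phrases:
        raise ValueError("source must be 'CALDB' or 'command line'")
    return (
        f"IRF version mismatch detected.  IRF version in HEADER: {header_irfs}, "
        f"IRF version provided ({phrases[source]}): {provided_irfs}"
    )
```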
              
    2. I've been in contact with Brian and Tony about updating the astroserver output to use the correct DSS keywords for Pass 7. They are very busy, but I am hoping this can be fixed before the P7REP release, if not sooner.