Definition of the Contents of the LAT Source Catalog
Note: This page was written in March 2005 and extensively re-arranged in October 2005
The current draft definition of the contents of the source catalog is in the draft of the Science Data Products Interface Control Document, available in Word and (in a document prepared for the review of the plans for the science tools in 2002) in HTML.
The draft is in need of updating in several respects. Some of the issues were discussed during a splinter session of the LAT collaboration meeting in September 2004. This presentation also includes the high-level requirements for producing the catalog.
This Confluence page presents a summary of the issues and some suggested resolutions. Broadly speaking, the issues are related either to what should be included in the catalog or to how best to represent the included items in FITS. The issues and proposed resolutions, or any other aspect of the definition of the contents, are open for comment (probably most effectively by editing this page rather than inserting Confluence comments).
Issues
Text in red shows what was adopted for the revised draft presented further below.
1. Source_Name
(Ballet) I do not believe it is appropriate to define TLMIN/MAX for a string column (or is it?). In any case, most likely we will number the versions of our catalog as everybody does, so that will be 1GL, 2GL, ... Are those registered with the IAU already?
(Digel) I agree. The LAT catalog designation is not registered yet. I bet that the SSAC or SWG would like to have the responsibility of picking the letters to use, although there are not very many obvious choices. GL looks good to me. I imagine that GBM sources mostly will be given GRB ######-type names.
1GL J123456-012345 naming scheme has been assumed.
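As an illustration of the assumed naming scheme, here is a minimal sketch that builds a designation from a position. It assumes the digits encode truncated (not rounded) HHMMSS for right ascension and DDMMSS for declination, which is the usual convention for J-names but is not spelled out on this page; the function name `source_name` is hypothetical.

```python
def source_name(ra_deg, dec_deg, prefix="1GL"):
    """Return e.g. '1GL J123456-012345' for a position in degrees (J2000).

    Assumption: coordinates are truncated, not rounded, to whole seconds.
    """
    ra_sec = int((ra_deg % 360.0) * 240.0)   # 1 degree of RA = 240 s of time
    hh, rem = divmod(ra_sec, 3600)
    mm, ss = divmod(rem, 60)

    sign = "-" if dec_deg < 0 else "+"
    dec_arcsec = int(abs(dec_deg) * 3600.0)  # truncate to whole arcseconds
    dd, rem = divmod(dec_arcsec, 3600)
    dm, ds = divmod(rem, 60)

    return f"{prefix} J{hh:02d}{mm:02d}{ss:02d}{sign}{dd:02d}{dm:02d}{ds:02d}"
```

A name built this way is 18 characters long, which matches the '18A' format used for Source_Name in the draft header.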
2. Conf_68_Region and Conf_95_Region
(Ballet) I think we had better split those fields into individual scalar columns. This is much easier to use in searches with standard tools.
The fields have been split.
3. Spectral_Index
(Ballet) We should add an uncertainty (1 sigma) to it.
(Digel) I agree.
Done.
4. Energy bands
(Ballet) I suggest we add the source count rates or fluxes (plus uncertainty) in several broad energy bands. This can be very useful when looking for unusual spectra (very different from a power law). I imagine the way to get that would be to run likelihood after event selection on energy in each energy band, then converting the prefactor value to a flux in that band for the spectral index of the source. Running likelihood directly on the image accumulated in that band would be another option, but this unfortunately requires an assumption on the spectral index as well (to build the PSF).
A simple choice of bands would be logarithmically spaced, with boundaries at log(EMeV) = 1.5, 2, 2.5, 3, 3.5 and maximum.
(Digel) This sounds sensible to me, although I'd prefer to use integral fluxes.
I stuck with integral fluxes, for >100 MeV, >300 MeV, >1 GeV, and >3 GeV energy ranges, with uncertainties included for each flux. This will still be useful for finding unusual spectra.
(Ballet) The advantage I see in band-limited fluxes is that they are approximately independent.
In addition, your scheme introduces a confusion in the user's mind, because the main flux > 100 MeV that we provide (FLUX100) is NOT directly comparable to those integral fluxes. It is NOT obtained by selecting photons above 100 MeV, but is simply a representation of the source flux obtained by fitting the entire spectral range, adjusting the spectral index.
If we adopt your scheme, then we must go all the way, add a FLUX30 column which will be used for the full range, obtain FLUX100 from events > 100 MeV only, and adjust the spectral index (when it is possible) in all 'bands'. Then all the 'bands' will have the same meaning.
(Digel) Well, I actually did have in mind that the main flux >100 MeV could be compared with the fluxes for the other integral energy ranges, but I can see that the only use of presenting fluxes for different energy ranges is if they are fit separately. Otherwise a power-law fit contains all of the information.
The disadvantage of band-limited fluxes is that for many (or maybe most) sources they will also be quite statistics limited. I have not tried simulating spectra and making fits for these narrow energy ranges to see how limited we will be, but I guess that somebody should.
So, how about 30-100, 100-300, 300-1000, 1000-3000, and >3000 MeV? (I'm revealing some EGRET heritage in preferring these ranges to ranges of 0.5 in log10(E).) What about going down to 20 MeV?
(Ballet) I have nothing against splitting at 300 MeV rather than 3162 MeV.
I don't think it is useful to extend down to 20 MeV. The events measured below 30 MeV are for a large part coming from above 30 MeV (the effective area increases steeply with energy there) and they have very bad spatial resolution.
On the other hand, it may be useful to add another band above 3 GeV (split 3 to 10 GeV and > 10 GeV). This will add another piece of information for bright hard sources and will not really change anything for faint sources (most photons above 3 GeV will be in the 3 to 10 GeV band).
It is true that for many sources the only useful bands will be 300 MeV to 1 GeV and 1 to 3 GeV, but I don't see that as a problem.
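To make the comparison between fitted and band-limited fluxes concrete, the sketch below computes the band fluxes that a pure power law would predict from Flux100 and the spectral index. A separately fitted band flux that departs strongly from these values would flag an unusual spectrum. The sketch assumes dN/dE proportional to E**(-index) with index > 1 and uses the band edges proposed above; the function names are hypothetical and nothing here is an adopted procedure.

```python
import math

# Band edges in MeV as proposed in the discussion (last band is open-ended).
BAND_EDGES_MEV = [30.0, 100.0, 300.0, 1000.0, 3000.0, math.inf]

def integral_flux(flux100, index, e_min_mev):
    """Photon flux above e_min_mev for dN/dE ~ E**(-index), index > 1."""
    return flux100 * (e_min_mev / 100.0) ** (1.0 - index)

def band_fluxes(flux100, index, edges=BAND_EDGES_MEV):
    """Power-law prediction of the photon flux in each band."""
    fluxes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        above_hi = 0.0 if math.isinf(hi) else integral_flux(flux100, index, hi)
        fluxes.append(integral_flux(flux100, index, lo) - above_hi)
    return fluxes
```

By construction the bands above 100 MeV sum to Flux100, while the 30-100 MeV band adds the extrapolated flux below 100 MeV.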
5. Flux_History
(Ballet) Did anybody study what the best energy interval is (in particular the lower boundary) to look for variability? Is 100 MeV indeed the best trade-off between number of source photons and number of background photons plus contamination by other sources?
(Digel) This has not been studied, to my knowledge. Certainly 100 MeV is not the best in terms of being able to distinguish sources. Quoting fluxes for the range >100 MeV, though, is common.
The variability index can apply to whatever energy interval we choose, presumably specified along with the definition of the index.
6. Hist_Start/End
(Ballet) If I understood correctly the ideas developed in the September 2002 review of how to obtain a light curve, this will be done by running likelihood in each time bin (fixing the diffuse emission and the spectral indices). That document implied that this would be done in smaller time bins for brighter sources.
It would be much more homogeneous to do that in the same time intervals for all sources. It would ensure that the likelihood results for nearby sources would be obtained consistently, and would also be easier to use for systematic studies.
I would like to propose that we do that in large time intervals (like one month) for all sources. So the time interval (not the number of bins) would be fixed. In addition, we could add specific files (one per source) for bright sources where much more detailed information can be obtained (including spectral variability for example, or going beyond 100 bins).
(Digel) I like the idea of homogeneous time intervals in the catalog, but I am not sure what is best for sources that are variable on other time ranges. We'll certainly have ancillary information for the catalog (like images of the confidence contours, I think) that probably won't go in the FITS file for the catalog. I have to be careful about going too far down this path because then it starts to sound like an additional data product that needs defining.
The updated version has fixed one-month time intervals and includes integral fluxes and flux uncertainties (>100 MeV). As written, a fixed size array (12 elements) is specified. We may want to convert these back to variable length arrays so that the specification does not need to change when the time range gets longer.
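As a sketch of what the fixed intervals could look like, the following builds the start times of 12 consecutive intervals. It assumes that "one month" means a calendar month and that times are stored in days as MJD; the start date in the usage note is purely hypothetical, as are the function names.

```python
from datetime import date

MJD_EPOCH = date(1858, 11, 17)  # MJD 0.0 is 1858-11-17 00:00

def to_mjd(day):
    """Modified Julian Date at 00:00 of the given calendar date."""
    return float((day - MJD_EPOCH).days)

def monthly_hist_start(first_year, first_month, n_months=12):
    """Start times (MJD) of n_months consecutive calendar months."""
    starts = []
    year, month = first_year, first_month
    for _ in range(n_months):
        starts.append(to_mjd(date(year, month, 1)))
        month += 1
        if month > 12:           # roll over into the next year
            year, month = year + 1, 1
    return starts
```

For example, `monthly_hist_start(2008, 1)` gives 12 start times beginning on 1 January 2008. Because `n_months` is a parameter, the same function serves a variable-length array if the specification moves that way.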
7. Flags
(Ballet) 1 Byte (8 binary flags) is not much. Let's use I (2 Bytes) instead. This is a negligible size increase anyway.
2-byte integers are now used.
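For illustration, here is a minimal sketch of packing binary flags into a 2-byte integer column. The individual flag meanings are still TBD, so the names and bit positions below are hypothetical. A FITS 'I' column is a signed 16-bit integer, so the sketch restricts itself to the 15 non-sign bits.

```python
from enum import IntFlag

class CatalogFlag(IntFlag):
    """Hypothetical bit assignments; the real flag definitions are TBD."""
    EXTENDED = 1 << 0            # source appears resolved
    CONFUSED = 1 << 1            # possible source confusion
    NEAR_BRIGHT_SOURCE = 1 << 2  # close to a much brighter source

def pack_flags(flags):
    """Return the integer stored in the Flags column, checking the range."""
    value = int(flags)
    if not 0 <= value <= 0x7FFF:  # keep clear of the sign bit of a 16-bit int
        raise ValueError("flags do not fit in 15 bits")
    return value

def has_flag(stored_value, flag):
    """True if the given flag bit is set in a stored Flags value."""
    return bool(CatalogFlag(stored_value) & flag)
```

With this layout, `pack_flags(CatalogFlag.EXTENDED | CatalogFlag.CONFUSED)` stores the value 3, and a search for resolved sources tests only the lowest bit.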
8. Extended Sources
(Digel) Should we expect the catalog to include extended sources in addition to point sources? If so, we should define an 'extendedness' parameter and possibly also list an angular extent. This can be kind of a slippery slope, getting into semimajor and semiminor axes and position angles. I'd at least include a flag indicating whether a source is resolved.
The catalog does not include an extendedness parameter. If we decide we need one, we can use some of the flag bits defined above.
9. Source Identification
(Digel) For some sources, e.g., bright pulsars, identifications will be possible with high confidence. For most of the sources, though, the best that will be possible is a list of candidate sources. Is it realistic to assume that we will be able to assign confidence levels for associations with counterpart sources? How about a Sowards-Emmerd-type "figure of merit"? If not, should we at least include angular offsets of the prospective counterparts from the maximum likelihood position of the source?
The updated definition includes space for just one counterpart source and for a flag value that defines the degree of confidence (1 for Figure-of-Merit above some threshold, 2 for correlated variability).
10. HDUCLASS Keywords (26 October 2005)
(Digel) I added placeholders for these. We need to figure out whether we are conforming enough to the HEASARC 'SRCLIST' definition to be able to use it.
11. Peak vs. Average (26 October 2005)
(Digel) It has gone unspecified until now, but the fluxes and flux history values described above are assumed to apply for the entire time range of the catalog (or of the time interval for a flux history evaluation). The catalog also includes Flux_Peak, Unc_Peak_Flux, Signif_Peak, Time_Peak, and Peak_Interval to allow for specification of properties of flaring sources that we see only once for a short period of time. Defining significances for a specific time interval may be problematic, as the time interval has been selected to be when the detection significance is greatest.
Draft header (Last update: 26 October 2005)
Here is the first extension header of the FITS version of the catalog, with the changes specified above adopted. Actually, this is pseudo-FITS, as the comments are not properly set off, but all of the columns and keywords are present.
XTENSION = 'BINTABLE' / binary table extension
BITPIX = 8 / 8-bit bytes
NAXIS = 2 / 2-dimensional binary table
PCOUNT = / size of special data area
GCOUNT = 1 / one data group (required keyword)
TFIELDS = 32 / number of fields in each row
CHECKSUM = / checksum for entire HDU
DATASUM = / checksum for data table
TELESCOP = 'GLAST' / name of telescope generating data
INSTRUME = 'LAT' / name of instrument generating data
EQUINOX = 2000.0 / equinox for ra and dec
RADECSYS = 'FK5' / world coord. system for this file (FK5 or FK4)
EXTNAME = 'LAT_Point_Source_Catalog' / name of this binary table extension
HDUCLASS = 'OGIP' / format conforms to OGIP standard
HDUCLAS1 = 'EVENTS' / extension contains events
HDUCLAS2 = 'ALL' / extension contains all events detected
TSTART = / mission time of the start of the observation
TSTOP = / mission time of the end of the observation
TIMEUNIT = 'd' / units for the time related keywords
TIMEZERO = 0.0 / clock correction
TIMESYS = 'MJD' / type of time system that is used
TIMEREF = 'LOCAL' / reference frame used for times
DATE = / file creation date (YYYY-MM-DDThh:mm:ss UT)
DATE-OBS = / start date and time of the observation (UTC)
DATE-END = / end date and time of the observation (UTC)
NDSKEYS = 0 / number of data subspace keywords in header
HDUCLASS = 'OGIP ' / format conforms to OGIP standard
HDUDOC = '?' / document describing the format
HDUVERS = '1.0.0 ' / version of the format
HDUCLAS1 = 'SRCLIST' / an OGIP standard class
TTYPE1 = 'Source_Name' / e.g., 1GL J123456-012345
TFORM1 = '18A ' / character string
TUNIT1 = 'none' / units of field
TTYPE2 = 'RA' / right ascension of source
TFORM2 = 'E' / floating point
TUNIT2 = 'deg' / units of field
TLMIN2 = 0.0 / minimum value
TLMAX2 = 360.0 / maximum value
TTYPE3 = 'DEC' / declination of source
TFORM3 = 'E' / floating point
TUNIT3 = 'deg' / units of field
TLMIN3 = -90.0 / minimum value
TLMAX3 = 90.0 / maximum value
TTYPE4 = 'Conf_68_SemiMajor' / semimajor axis, 68% containment confidence region
TFORM4 = 'E' / floating point
TUNIT4 = 'deg' / units of field
TLMIN4 = 0.0 / minimum value
TLMAX4 = 360.0 / maximum value
TTYPE5 = 'Conf_68_SemiMinor' / semiminor axis, 68% containment confidence region
TFORM5 = 'E' / floating point
TUNIT5 = 'deg' / units of field
TLMIN5 = 0.0 / minimum value
TLMAX5 = 360.0 / maximum value
TTYPE6 = 'Conf_68_PosAng' / position angle, 68% containment confidence region, E of N
TFORM6 = 'E' / floating point
TUNIT6 = 'deg' / units of field
TLMIN6 = 0.0 / minimum value
TLMAX6 = 360.0 / maximum value
TTYPE7 = 'Conf_95_SemiMajor' / semimajor axis, 95% containment confidence region
TFORM7 = 'E' / floating point
TUNIT7 = 'deg' / units of field
TLMIN7 = 0.0 / minimum value
TLMAX7 = 360.0 / maximum value
TTYPE8 = 'Conf_95_SemiMinor' / semiminor axis, 95% containment confidence region
TFORM8 = 'E' / floating point
TUNIT8 = 'deg' / units of field
TLMIN8 = 0.0 / minimum value
TLMAX8 = 360.0 / maximum value
TTYPE9 = 'Conf_95_PosAng' / position angle, 95% containment confidence region, E of N
TFORM9 = 'E' / floating point
TUNIT9 = 'deg' / units of field
TLMIN9 = 0.0 / minimum value
TLMAX9 = 360.0 / maximum value
TTYPE10 = 'Flux100' / average photon flux >100 MeV
TFORM10 = 'E' / floating point
TUNIT10 = 'cm**(-2) s**(-1)' / units of field
TLMIN10 = 0.0 / minimum value
TLMAX10 = 1.0 / maximum value
TTYPE11 = 'Unc_Flux100' / uncertainty (1-sigma) in average flux >100 MeV
TFORM11 = 'E' / floating point
TUNIT11 = 'cm**(-2) s**(-1)' / units of field
TLMIN11 = 0.0 / minimum value
TLMAX11 = 1.0 / maximum value
TTYPE12 = 'Flux300' / average photon flux >300 MeV
TFORM12 = 'E' / floating point
TUNIT12 = 'cm**(-2) s**(-1)' / units of field
TLMIN12 = 0.0 / minimum value
TLMAX12 = 1.0 / maximum value
TTYPE13 = 'Unc_Flux300' / uncertainty (1-sigma) in average flux >300 MeV
TFORM13 = 'E' / floating point
TUNIT13 = 'cm**(-2) s**(-1)' / units of field
TLMIN13 = 0.0 / minimum value
TLMAX13 = 1.0 / maximum value
TTYPE14 = 'Flux1000' / average photon flux >1000 MeV
TFORM14 = 'E' / floating point
TUNIT14 = 'cm**(-2) s**(-1)' / units of field
TLMIN14 = 0.0 / minimum value
TLMAX14 = 1.0 / maximum value
TTYPE15 = 'Unc_Flux1000' / uncertainty (1-sigma) in average flux >1000 MeV
TFORM15 = 'E' / floating point
TUNIT15 = 'cm**(-2) s**(-1)' / units of field
TLMIN15 = 0.0 / minimum value
TLMAX15 = 1.0 / maximum value
TTYPE16 = 'Flux3000' / average photon flux >3000 MeV
TFORM16 = 'E' / floating point
TUNIT16 = 'cm**(-2) s**(-1)' / units of field
TLMIN16 = 0.0 / minimum value
TLMAX16 = 1.0 / maximum value
TTYPE17 = 'Unc_Flux3000' / uncertainty (1-sigma) in average flux >3000 MeV
TFORM17 = 'E' / floating point
TUNIT17 = 'cm**(-2) s**(-1)' / units of field
TLMIN17 = 0.0 / minimum value
TLMAX17 = 1.0 / maximum value
TTYPE18 = 'Spectral_Index' / photon spectral index, >100 MeV
TFORM18 = 'E' / floating point
TUNIT18 = 'none' / dimensionless
TLMIN18 = -10.0 / minimum value
TLMAX18 = 10.0 / maximum value
TTYPE19 = 'Unc_Spectral_Index' / 1-sigma uncertainty, photon spectral index
TFORM19 = 'E' / floating point
TUNIT19 = 'none' / dimensionless
TLMIN19 = 0.0 / minimum value
TLMAX19 = 10.0 / maximum value
TTYPE20 = 'Variability_Index' / flux variability index (TBD)
TFORM20 = 'E' / floating point
TUNIT20 = 'none' / dimensionless
TLMIN20 = ### / minimum value (TBD)
TLMAX20 = ### / maximum value (TBD)
TTYPE21 = 'Signif_Avg' / detection significance (whole time interval)
TFORM21 = 'E' / floating point
TUNIT21 = 'none' / dimensionless (sigmas)
TLMIN21 = 0.0 / minimum value
TLMAX21 = 1.0E9 / maximum value
TTYPE22 = 'Signif_Peak' / detection significance (peak)
TFORM22 = 'E' / floating point
TUNIT22 = 'none' / dimensionless (sigmas)
TLMIN22 = 0.0 / minimum value
TLMAX22 = 1.0E9 / maximum value
TTYPE23 = 'Flux_Peak' / peak flux (>100 MeV) for time interval above
TFORM23 = 'E' / floating point
TUNIT23 = 'cm**(-2) s**(-1)' / units of field
TLMIN23 = 0.0 / minimum value
TLMAX23 = 1.0 / maximum value
TTYPE24 = 'Unc_Peak_Flux' / uncertainty (1-sigma) in peak flux >100 MeV
TFORM24 = 'E' / floating point
TUNIT24 = 'cm**(-2) s**(-1)' / units of field
TLMIN24 = 0.0 / minimum value
TLMAX24 = 1.0 / maximum value
TTYPE25 = 'Time_Peak' / center of time interval of peak significance
TFORM25 = 'D' / double precision
TUNIT25 = 'd' / units of field
TLMIN25 = 0.0 / minimum value
TLMAX25 = 1.0D5 / maximum value
TTYPE26 = 'Peak_Interval' / duration of time interval of peak significance
TFORM26 = 'D' / double precision
TUNIT26 = 's' / units of field
TLMIN26 = 0.0 / minimum value
TLMAX26 = 3.0D7 / maximum value
TTYPE27 = 'Flux_History' / flux (>100 MeV) history (monthly)
TFORM27 = '12E' / floating point array, 12 months (number TBR)
TUNIT27 = 'cm**(-2) s**(-1)' / units of field
TLMIN27 = 0.0 / minimum value
TLMAX27 = 1.0 / maximum value
TTYPE28 = 'Flux_Unc_History' / flux uncertainty (1-sigma, >100 MeV) history
TFORM28 = '12E' / floating point array, 12 months (number TBR)
TUNIT28 = 'cm**(-2) s**(-1)' / units of field
TLMIN28 = 0.0 / minimum value
TLMAX28 = 1.0 / maximum value
TTYPE29 = 'Hist_Start' / start of time intervals of flux history
TFORM29 = '12E' / floating point array, 12 months (number TBR)
TUNIT29 = 'd' / units of field
TLMIN29 = 0.0 / minimum value
TLMAX29 = 1.0D5 / maximum value
TTYPE30 = 'ID_Counterpart' / source counterpart (if any)
TFORM30 = '20A' / character string
TUNIT30 = 'none' / dimensionless
TTYPE31 = 'Conf_Counterpart' / confidence of association of counterpart with source
TFORM31 = 'I' / index, 1 = Figure of Merit, 2 = Correlated variability
TUNIT31 = 'none' / dimensionless
TLMIN31 = 0 / minimum value
TLMAX31 = 2 / maximum value
TTYPE32 = 'Flags' / flags (TBD) for catalog entry
TFORM32 = 'I' / integer
TUNIT32 = 'none' / dimensionless
END
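The draft header does not list NAXIS1, which for a binary table is the width of one row in bytes. As a quick consistency check, the sketch below derives that width from the TFORM codes in the draft. It handles only the four type codes that appear here (A, I, E, D), and the helper name `tform_width` is hypothetical.

```python
import re

# Bytes per element for the TFORM type codes used in the draft header.
TFORM_BYTES = {"A": 1, "I": 2, "E": 4, "D": 8}

def tform_width(tform):
    """Width in bytes of one field, e.g. '12E' -> 48, 'D' -> 8."""
    match = re.fullmatch(r"(\d*)([AIED])", tform.strip())
    if match is None:
        raise ValueError(f"unsupported TFORM: {tform!r}")
    repeat = int(match.group(1) or 1)  # a missing repeat count means 1
    return repeat * TFORM_BYTES[match.group(2)]

# TFORM values of columns 1..32, in the order given in the draft header.
DRAFT_TFORMS = (
    ["18A"]            # 1: Source_Name
    + ["E"] * 23       # 2-24: positions, fluxes, indices, significances
    + ["D", "D"]       # 25-26: Time_Peak, Peak_Interval
    + ["12E"] * 3      # 27-29: Flux_History, Flux_Unc_History, Hist_Start
    + ["20A", "I", "I"]  # 30-32: ID_Counterpart, Conf_Counterpart, Flags
)

row_bytes = sum(tform_width(t) for t in DRAFT_TFORMS)
```

If the flux history columns are converted to variable-length arrays, their contribution to the row width changes to the size of the array descriptors and the data move to the heap counted by PCOUNT.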
Comments on the draft of 26 October 2005
(Luigi Foschini)
Keywords to add:
EXTREL: release number of the template for the FITS header, to take into account for future developments and changes in the header.
CREATOR: the name and version of the executable that generated the FITS file.
CONFIGUR: name and version of the software system under which the executable runs (e.g. SAE v X.x).
DATE: date of the creation of the FITS file.
TIMEREF: time reference frame (LOCAL, SOLAR SYSTEM, etc...).
TIMEUNIT: I suggest changing to days (JD), so that MJDREF can be used as TZERO and huge numbers can be avoided; TSTART and TSTOP should be updated accordingly.
VERSION: version of the catalog.
RADECSYS: FK5 default; stellar reference frame.
EQUINOX: 2000.0 default; coordinate system equinox.
I would also add a new column "NOTES" (character string) in which to place comments such as other names of the sources (e.g. the corresponding name in the 3rd EGRET catalog, etc.).
(Seth Digel, 30 December 2005)
I have updated the draft header above to take into account Luigi's comments. I also reformatted it to make it more like an actual template FITS header.
The updated draft includes DATE, TIMEREF, RADECSYS, and EQUINOX.
I omitted EXTREL because I believe the same information would be conveyed by HDUVERS.
CREATOR and VERSION are assumed to be in the main header for the file, which is not shown. To the extent possible we will have a common format for the primary headers of all our FITS data products; I intend to post a template for comment. CONFIGUR is assumed to be in the main header as well, with the name SOFTWARE.
TIMEUNIT is changed to days, TIMESYS to 'MJD' and MJDREF is omitted. These changes, I think, permit the dates in the flux histories to be represented as MJD values. I think that we want these times in MJD rather than seconds of MET (as I had originally proposed) or as days with respect to January 1, 2001 (as Luigi proposed). Also I think that the description is correctly expressed in column 26 so that the duration of the interval used for the peak flux evaluation is in days. A detailed description of representing time in Chandra FITS files is available here (see section 2). I need to study it some more.
NOTES is omitted; the proposed use is good - especially for providing other names for identified sources - but how to make a NOTES column conveniently searchable or even figuring out how large a field to reserve is not clear. We'll have to revisit this.
Also, regarding flux histories, I am assuming that in columns 27 & 28, for intervals during which a source was not detected we'll have its flux entry as 0 and its flux uncertainty should be interpreted as a (2 sigma?) upper limit.
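The convention just described for non-detections can be sketched as follows: a flux entry of exactly 0 marks an interval in which the source was not detected, and the matching uncertainty entry is then read as an upper limit. The confidence level of that limit is still open, and the type and function names below are hypothetical.

```python
from typing import NamedTuple, Optional

class HistoryPoint(NamedTuple):
    flux: Optional[float]         # measured flux, None if not detected
    uncertainty: Optional[float]  # 1-sigma uncertainty, None if not detected
    upper_limit: Optional[float]  # flux upper limit, None if detected

def decode_history(flux_history, flux_unc_history):
    """Interpret paired Flux_History / Flux_Unc_History arrays."""
    if len(flux_history) != len(flux_unc_history):
        raise ValueError("history arrays must have the same length")
    points = []
    for flux, unc in zip(flux_history, flux_unc_history):
        if flux == 0.0:
            # Non-detection: the uncertainty slot carries the upper limit.
            points.append(HistoryPoint(None, None, unc))
        else:
            points.append(HistoryPoint(flux, unc, None))
    return points
```

One consequence of this encoding is that a genuine measurement can never be stored as exactly zero, so readers must test the flux entry before using the uncertainty.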