The GLAST Data Catalog is a virtual file system maintained in an Oracle database. GLAST data may be stored at several locations at SLAC, Lyon and elsewhere. The files themselves may be stored on disk in AFS, NFS, or XROOTD managed servers, in one of several tape archive systems, or any combination of these. The Data Catalog simplifies access to data by providing a uniform view of files irrespective of their physical location. The Data Catalog provides features that are not available in standard file systems. These include:
Access to the Data Catalog is provided via a Java API. This API is under continued development and features are regularly added. Any Java program running within the SLAC firewall may use this API to take advantage of the full Data Catalog feature set. The Java API is documented in a child page for reference (see the link at the bottom of this page). Currently there are two methods of accessing the API:
The Line-mode client is available from the UNIX command line at SLAC. It represents a subset of the full Data Catalog API.
The Data Catalog Line-mode executable is available at:
/afs/slac.stanford.edu/g/glast/ground/bin/datacat
Invoking the executable with no parameters will display the help screen. One may obtain command-specific help explicitly by executing:
/afs/slac.stanford.edu/g/glast/ground/bin/datacat -h <command>
The usage is similar to that of CVS. The following commands are currently available:
Adds a new dataset to the catalog
datacat registerDataset [-options] <data type> <logical folder> <file path>
<data type> Type of data in the file (merit, MC, DIGI, RECON, etc.) See Java API child page for a full list.
<logical folder> Dataset Folder Path under which to create the new dataset.
<file path> Physical location of file to add to Data Catalog.
Long Form | Short Form | Parameter | Default Value | Description |
---|---|---|---|---|
--name | -n | dataset name | file name | Name to give new dataset in the catalog |
--group | -G | group name | none | Group under which to store the dataset |
--format | -F | file format | file extension | Format of the file (root, fits, etc.) |
--site | -S | site name | SLAC | Site where dataset physically exists (SLAC, SLAC_XROOT, etc.) |
--define | -D | "name=value" | none | Define a meta data name/value pair for the new dataset. This option may be used more than once. For naming rules, see the Java API child page. |
datacat registerDataset -n 000002 -G merit -D nEvt=2500 -S SLAC -F root merit /ServiceChallenge/Interleave3h-GR-v11r17/runs /nfs/farm/g/glast/u43/MC-tasks/Interleave3h-GR-v11r17/data/merit/Interleave3h-GR-v11r17-000002-merit.root
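The option table and example above can be sketched as a small Python helper that assembles the argument list for `datacat registerDataset`. This is an illustration only: the executable path and option names come from this page, while the helper name `register_dataset_argv` and its keyword arguments are hypothetical and not part of the Data Catalog itself.

```python
# Sketch only: builds the argument list for `datacat registerDataset`.
# The executable path and option names are taken from this page; the helper
# name and its keyword arguments are hypothetical.
DATACAT = "/afs/slac.stanford.edu/g/glast/ground/bin/datacat"


def register_dataset_argv(data_type, logical_folder, file_path,
                          name=None, group=None, file_format=None,
                          site=None, metadata=None):
    """Return the argv list for one registerDataset call."""
    argv = [DATACAT, "registerDataset"]
    # Optional flags are emitted only when a value is supplied, so the
    # client's own defaults (file name, file extension, SLAC) still apply.
    for flag, value in (("-n", name), ("-G", group),
                        ("-F", file_format), ("-S", site)):
        if value is not None:
            argv += [flag, str(value)]
    # -D may be repeated, once per meta data name/value pair.
    for key, value in (metadata or {}).items():
        argv += ["-D", "%s=%s" % (key, value)]
    # Positional parameters come last, in the documented order.
    argv += [data_type, logical_folder, file_path]
    return argv


example = register_dataset_argv(
    "merit",
    "/ServiceChallenge/Interleave3h-GR-v11r17/runs",
    "/nfs/farm/g/glast/u43/MC-tasks/Interleave3h-GR-v11r17/data/merit/"
    "Interleave3h-GR-v11r17-000002-merit.root",
    name="000002", group="merit", file_format="root", site="SLAC",
    metadata={"nEvt": 2500},
)
print(" ".join(example))
```

On a SLAC host where the executable is available, the resulting list could be handed to `subprocess.run(example, check=True)`. Building a list rather than a single string avoids shell quoting problems in file paths.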
Adds an additional physical location to an existing dataset. Use this routine to specify that a dataset exists in more than one physical location (i.e., it is on SLAC NFS and in SLAC XROOT). Except for <file path>, all of the parameters and options are used to identify the existing dataset entry to which you want to add an additional physical location.
datacat addLocation [-options] <dataset name> <logical folder> <file path>
<dataset name> Name of existing dataset
<logical folder> Data Catalog Folder Path under which the dataset lives.
<file path> Additional physical location of file to add to the dataset entry.
Long Form | Short Form | Parameter | Default Value | Description |
---|---|---|---|---|
--group | -G | group name | none | Dataset Group in the Data Catalog under which the dataset lives. |
--site | -S | site name | SLAC | Site at which the additional physical location exists. |
datacat addLocation -G merit -S SLAC_XROOT 000002 /ServiceChallenge/Interleave3h-GR-v11r17/runs root://glast-rdr//glast/mc/ServiceChallenge/Interleave3h-GR-v11r17/merit/Interleave3h-GR-v11r17-000002-merit.root
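When a dataset has copies at several sites, one `addLocation` call is needed per copy. The sketch below generates those command lines from a list of (site, path) pairs. The function name `add_location_commands` and the shape of its `locations` argument are assumptions made for illustration; only the executable path, the command name, and the `-G` and `-S` options come from this page.

```python
import shlex

# Path of the line-mode client as given on this page.
DATACAT = "/afs/slac.stanford.edu/g/glast/ground/bin/datacat"


def add_location_commands(dataset_name, logical_folder, locations, group=None):
    """Return one `datacat addLocation` command line per (site, path) pair.

    `locations` is a list of (site name, file path) tuples (hypothetical
    input format chosen for this sketch).
    """
    commands = []
    for site, file_path in locations:
        argv = [DATACAT, "addLocation"]
        if group is not None:
            argv += ["-G", group]
        # Options first, then the three positional parameters.
        argv += ["-S", site, dataset_name, logical_folder, file_path]
        # shlex.join quotes each argument so the line can be pasted
        # into a shell safely (requires Python 3.8 or later).
        commands.append(shlex.join(argv))
    return commands


commands = add_location_commands(
    "000002",
    "/ServiceChallenge/Interleave3h-GR-v11r17/runs",
    [("SLAC_XROOT",
      "root://glast-rdr//glast/mc/ServiceChallenge/Interleave3h-GR-v11r17/"
      "merit/Interleave3h-GR-v11r17-000002-merit.root")],
    group="merit",
)
for line in commands:
    print(line)
```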
Adds one or more meta data entries to an existing dataset.
datacat addMetaData [-options] <logical folder>
Required Parameters:
<logical folder> Logical Folder Path where the group or dataset lives. If no dataset or group is specified, the meta data is attached to the folder itself.
Long Form | Short Form | Parameter | Default Value | Description |
---|---|---|---|---|
--dataset | -n | dataset name | file name | Name of existing dataset |
--group | -G | group name | none | Dataset Group in the Data Catalog under which the dataset lives. |
--define | -D | "name=value" | none | Define a new meta data name/value pair for the dataset. This option may be used more than once, and must be used at least once. For naming rules, see the Java API child page. |
datacat addMetaData -n 000002 -G merit -D nEvt=2500 /ServiceChallenge/Interleave3h-GR-v11r17/runs
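Because `--define` may be repeated and must appear at least once for `addMetaData`, a script that drives the client may want to check this before invoking it. The following is a minimal sketch under that assumption; the function name `define_options` is hypothetical, and the naming rules from the Java API child page are not checked here.

```python
def define_options(metadata):
    """Turn a dict of meta data into repeated -D name=value options.

    Raises ValueError for an empty dict, mirroring the rule that
    addMetaData needs at least one -D option. Meta data naming rules
    are NOT validated in this sketch.
    """
    if not metadata:
        raise ValueError("at least one meta data name/value pair is required")
    options = []
    # Sort the names so the generated command line is reproducible.
    for name in sorted(metadata):
        options += ["-D", "%s=%s" % (name, metadata[name])]
    return options


# Options for the addMetaData example on this page.
print(define_options({"nEvt": 2500}))
```

The returned list can be spliced into an argument list between the command name and the `<logical folder>` parameter.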
Jython scriptlet processes within the pipeline enjoy access to the full Java API. Access to the Data Catalog is provided via an object named "datacatalog".
As an example, dataset registration is performed by calling:
datacatalog.registerDataset(DATA_TYPE, DATA_CATALOG_LOCATION, DISK_LOCATION [, META_DATA])
where:
Below is an example. The parameters are interpreted as follows:
datacatalog.registerDataset("merit","/ServiceChallenge/Interleave3h-GR-v11r17/runs/merit:000002","/nfs/farm/g/glast/u43/MC-tasks/Interleave3h-GR-v11r17/data/merit/Interleave3h-GR-v11r17-000002-merit.root")
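In the example above, the second argument appears to combine the logical folder, the group, and the dataset name as `<logical folder>/<group>:<dataset name>`. The sketch below composes such a string. That layout is an assumption inferred from this single example, not a documented format, and the helper name `catalog_location` is hypothetical.

```python
def catalog_location(logical_folder, group, dataset_name):
    """Compose '<logical folder>/<group>:<dataset name>'.

    Assumption: this layout is inferred from the one example on this
    page; it is not a documented format.
    """
    return "%s/%s:%s" % (logical_folder.rstrip("/"), group, dataset_name)


location = catalog_location(
    "/ServiceChallenge/Interleave3h-GR-v11r17/runs", "merit", "000002")
disk_location = ("/nfs/farm/g/glast/u43/MC-tasks/Interleave3h-GR-v11r17/"
                 "data/merit/Interleave3h-GR-v11r17-000002-merit.root")

# Inside a pipeline scriptlet the call would then read as follows. It is
# left commented out because `datacatalog` exists only in the pipeline.
# datacatalog.registerDataset("merit", location, disk_location)
print(location)
```

The helper uses only string formatting, so it should behave the same in a Jython scriptlet as in standard Python.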