The easiest way to get a list of files is to use the linemode client's 'find' command.  Here is its help screen:

noric13:dflath> /afs/slac/g/glast/ground/bin/datacat -h find
Command-specific help for command find

Usage: datacat find [-options] <logical folder>

parameters:
  <logical folder>   Logical Folder Path at which to begin performing the search.

options:
  --recurse                              Recurse sub-folders
  --search-folders                    Search for datasets inside folder(s)
  --search-groups                   Search in groups.  This option is superseded by the -G (--group) option if they are both supplied.
  --group <group name>          Dataset Group under which to search for datasets.
  --site <site name>                Name of Site to search.  May be used multiple times to specify a list of sites in which case order is taken as preference.  Defaults to the Master-location if not provided.
  --filter <filter expression>     Criteria by which to filter datasets.  ie: 'DataType=="MERIT" && nMetStart>=257731220 && nMetStop <=257731580'
  --display <meta name>         Name of meta-data field to display in output.  Default is to display only the file location.  May be used multiple times to specify an ordered list of fields to display.
  --sort <meta name>             Name of meta-data field to sort on.  May be used multiple times to specify a list of fields to sort on.  Order determines precedence.
  --show-unscanned-locations  If no "OK" (ie: verified by file crawler) location exists, display first location (if any) which has not yet been scanned.  If this option and '--show-non-ok-locations' are both specified, an unscanned location will be returned before a non-ok location regardless of their sequence in the ordered site list.
  --show-non-ok-locations        If no "OK" (ie: verified by file crawler) location exists, display first location (if any) which exists in the list of sites.

More detail:

Arguments to 'find' command: 

Argument

Explanation

<logical folder>

This parameter is required and comes after all options are specified.  Replace it with the Data Catalog folder path where you want to begin your search.
The more specific you are, the faster your search will be.

--recurse

If you specify this option, the find command will traverse the entire folder tree under <logical folder> searching for datasets that meet your criteria.

--search-folders

Tells find to look inside the folder (or folders, if --recurse is specified) and consider datasets that live there.  In general, datasets live in groups, and this option is not used.

--search-groups

Tells find to look inside all dataset groups in the specified folder (or folder tree if using --recurse) for your files.  May be combined with --search-folders and --recurse.
Has no meaning if --group is also specified.

--group <group name>

Tells find to look inside groups only if they have the name specified by <group name>.  May be used with --recurse to search in groups of the given name in a folder tree.

--site <site name>

Specifies the site from which you want to get datasets.  May be used multiple times to specify a list of sites to search, where order indicates preference.
If no sites are specified, the 'master' location is returned.  This will generally be a file in XROOT.  Valid site names are listed below.

--filter <filter expression>

Also known as "search criteria", this is an expression built from logical operators, meta-data fields and constant values, used to filter the output results.  See below for details.

--display <meta name>

Causes the meta-data value associated with 'meta name' to be displayed in the output. May be used multiple times. Columns are tab-separated.

--sort <meta name>

Specifies the name of the meta-data field on which to sort the output results.  May be used multiple times to specify a list of fields to sort on, where order indicates precedence.
Sorting may add significant overhead to the time it takes to start getting results, as the entire output set must be computed and sorted before anything is displayed.
Ascending order (smallest first) is the default for each field.  You may override this on a field by field basis by prefixing a field name with '-' (minus) for descending order, or '+' (plus) for ascending order.

--show-unscanned-locations

If a verified disk location cannot be found in the specified site list, the first location (in site-preference order) which has not been scanned yet will be returned.

--show-non-ok-locations

Similar to --show-unscanned-locations, but returns the first location in the ordered site list that has a disk location, regardless of the file's scan status. The file may be missing, 'bad', or otherwise. (Caveat emptor.) If you specify this option together with --show-unscanned-locations, an unscanned location will be returned before a non-ok location if both exist.

*Important note:  At least one of --search-folders, --search-groups, --group <group name> must be specified.
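
The rules above (folder last, repeatable options, at least one search mode) can be captured in a small wrapper. The following is only a sketch: `build_find_command` is a hypothetical helper, not part of the datacat client, and it assumes `datacat` is on your PATH. It only builds the argument list; it does not run anything.

```python
from typing import List, Optional, Sequence


def build_find_command(
    folder: str,
    *,
    datacat: str = "datacat",          # assumption: client is on the PATH
    recurse: bool = False,
    search_folders: bool = False,
    search_groups: bool = False,
    group: Optional[str] = None,
    sites: Sequence[str] = (),
    filter_expr: Optional[str] = None,
    display: Sequence[str] = (),
    sort: Sequence[str] = (),
) -> List[str]:
    """Return an argv list for 'datacat find' (hypothetical helper)."""
    # At least one of the three search modes must be given.
    if not (search_folders or search_groups or group):
        raise ValueError(
            "specify at least one of --search-folders, --search-groups, --group"
        )
    argv = [datacat, "find"]
    if recurse:
        argv.append("--recurse")
    if search_folders:
        argv.append("--search-folders")
    # --search-groups has no meaning once --group is given, so leave it out.
    if search_groups and not group:
        argv.append("--search-groups")
    if group:
        argv += ["--group", group]
    for site in sites:                 # order expresses site preference
        argv += ["--site", site]
    if filter_expr:
        argv += ["--filter", filter_expr]
    for field in display:              # order sets the output column order
        argv += ["--display", field]
    for field in sort:                 # order sets sort precedence
        argv += ["--sort", field]
    argv.append(folder)                # the logical folder comes after all options
    return argv


cmd = build_find_command(
    "/Data/Flight/Level1/LPA/",
    group="FT1",
    sites=["SLAC", "SLAC_XROOT"],
    filter_expr="RunMin>=236191699 && RunMax<=236211846",
    sort=["nMetStart"],
)
print(cmd)
```

Because the result is a list rather than a single string, it can be handed to `subprocess.run(cmd)` without any shell quoting of the filter expression.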

<site name> Valid values:

SLAC_XROOT

XROOT servers at SLAC.  Almost everything lives here.

SLAC

NFS (or AFS) at SLAC.  Some FT1 and FT2 data are duplicated here until the ftools learn to read from XROOT.

IN2P3

Some Monte Carlo data are produced and stored at Lyon.

IN2P3_HPSS

Lyon Monte Carlo backups.

UW

University of Washington.

<filter expression> Specifics:

An expression composed of logical, arithmetic, and comparison operators along with meta-data fields used to select datasets that meet specific criteria. 

Here's a loose grammar which defines the filter expressions:

Expr ::= Expr
Expr ::= '(' Expr ')'
Expr ::= '!' Expr
Expr ::= Expr LogOp Expr
LogOp ::= '&&' | '||'
Expr ::= Comparable CmpOp Comparable
CmpOp ::= '==' | '!=' | '>=' | '<='
Comparable ::= "String"                --> a String constant must be enclosed in double quotes
Comparable ::= Number
Comparable ::= Identifier
Comparable ::= UnOp Comparable
UnOp ::= '-' | '+'
Comparable ::= Comparable BinOp Comparable
BinOp ::= '/' | '*' | '+' | '-'
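
If you generate filters from a script, it can help to compose them from small pieces that follow the grammar above. This is a minimal sketch: the helper names (`literal`, `compare`, `all_of`, `any_of`) are made up for illustration, and only the four comparison operators listed above are accepted, although the client itself may support more.

```python
from typing import Union

Value = Union[str, int, float]


def literal(value: Value) -> str:
    """Render a constant: strings get double quotes, numbers stay bare."""
    if isinstance(value, str):
        # Assumption: no escaping rule is documented, so embedded quotes are rejected.
        if '"' in value:
            raise ValueError("double quotes inside string constants are not handled")
        return f'"{value}"'
    return str(value)


def compare(field: str, op: str, value: Value) -> str:
    """Expr ::= Comparable CmpOp Comparable"""
    if op not in ("==", "!=", ">=", "<="):
        raise ValueError(f"operator {op!r} is not in the grammar above")
    return f"{field}{op}{literal(value)}"


def all_of(*exprs: str) -> str:
    """Join expressions with '&&'."""
    return " && ".join(exprs)


def any_of(*exprs: str) -> str:
    """Join expressions with '||', parenthesised so the result nests safely."""
    return "(" + " || ".join(exprs) + ")"


expr = all_of(
    any_of(compare("DataType", "==", "FT1"), compare("DataType", "==", "FT2")),
    compare("RunMin", ">=", 236191699),
    compare("RunMax", "<=", 236211846),
)
print(expr)
# (DataType=="FT1" || DataType=="FT2") && RunMin>=236191699 && RunMax<=236211846
```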

Meta Data fields you can use in your expressions:

System-maintained (built-in) meta-data:

Name           Type          Description

Name           String        Dataset Name
FileFormat     String        File encoding. ex: "root", "fits"
DataType       String        Type of data in file. Always uppercase. ex: "RECON"
VersionID      Integer       Version of the Dataset this file represents.
CreateDate     Timestamp     Date this Version of the Dataset was created. Example formats of to_date(...)
Source         String        What created this Version of the Dataset. ex: "PIPELINE", "LINEMODE CLIENT"
TaskName       String        If Source=="PIPELINE" this will contain the name of the Task which created this Version of the Dataset.
RunMin         Long Integer  Smallest Run Identifier found in this file, if applicable.
RunMax         Long Integer  Largest Run Identifier found in this file, if applicable.
NumberEvents   Long Integer  Number of events in the file, if applicable.
FileSizeBytes  Long Integer  Size of this file on disk, in bytes.
RootVersion    String        If FileFormat=="root", the version of root which wrote this file.
SOLibVersion   String        If FileFormat=="root", the version of the shared object library that the events correspond to, if applicable.
TTreeName      String        If FileFormat=="root", the name of the first TTree in the file, if one exists.

User-defined meta-data tags. In order of most used (first) to least used (last):

(Feel free to fill in the description and DataType field for those you are responsible for.)

Name            Type    Description    Data-Type(s) generally tagged

sDatasource     STRING
nMetStop        NUMBER
nMetStart       NUMBER
sOrigFilename   STRING
nOrigBytes      NUMBER
nOrigCkSum      NUMBER
sBTRversion     STRING
sPhysList       STRING
nBtRunId        NUMBER
sDataSource     STRING
nDownlink       NUMBER
nRun            NUMBER
sRunStatus      STRING
sCreator        STRING
sIntent         STRING
nMootKey        NUMBER
type            STRING
packetTime      STRING
packetApid      STRING
startAddress    STRING
functionCode    STRING
stopAddress     STRING
transactionId   STRING
tstop           STRING
tstart          STRING
nMootKey        STRING
startedAt       STRING
firstTimeStamp  STRING
counterType     STRING
lastTimeStamp   STRING
nDatasetId      NUMBER
TCut            STRING

Examples:

All the FT1 files in a given run-range, sorted by nMetStart:

noric13:dflath> /afs/slac/g/glast/ground/bin/datacat find --filter 'RunMin>=236191699 && RunMax<=236211846' \
    --sort nMetStart --group FT1 /Data/Flight/Level1/LPA/

root://glast-rdr.slac.stanford.edu//glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236191699_v002.fit
root://glast-rdr.slac.stanford.edu//glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236197643_v001.fit
root://glast-rdr.slac.stanford.edu//glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236198321_v001.fit
root://glast-rdr.slac.stanford.edu//glast/Data/Flight/Level1/LPA/prod/1.56/ft1/gll_ph_r0236209517_v001.fit
root://glast-rdr.slac.stanford.edu//glast/Data/Flight/Level1/LPA/prod/1.56/ft1/gll_ph_r0236211846_v001.fit

The same search, but retrieving their SLAC NFS location rather than their master (default) location:

noric13:dflath> /afs/slac/g/glast/ground/bin/datacat find --filter 'RunMin>=236191699 && RunMax<=236211846' \
    --sort nMetStart --group FT1 --site SLAC /Data/Flight/Level1/LPA/

/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236191699_v002.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236197643_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236198321_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft1/gll_ph_r0236209517_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft1/gll_ph_r0236211846_v001.fit

SLAC NFS locations of all the FT1 and FT2 datasets in a given run-range, grouped by dataset name (perhaps you have a tool that wants the ft1 and its corresponding ft2 file listed on consecutive lines):

noric13:dflath> /afs/slac/g/glast/ground/bin/datacat find \
    --filter '(DataType=="FT1" || DataType=="FT2") && RunMin>=236191699 && RunMax<=236211846' \
    --sort Name --sort DataType --search-groups --site SLAC /Data/Flight/Level1/LPA/

/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236191699_v002.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236191699_v002.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236197643_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236197643_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236198321_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236198321_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft1/gll_ph_r0236209517_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft2/gll_pt_r0236209517_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft1/gll_ph_r0236211846_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft2/gll_pt_r0236211846_v001.fit
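
If a downstream tool needs the two files of each run together, a listing like the one above can be split into pairs. The sketch below is an illustration only: `pair_consecutive` is a hypothetical helper, and it assumes that file names carry the run number as `_r<digits>_`, as they do in the listing above.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: the run number appears in the file name as "_r<digits>_",
# for example gll_ph_r0236191699_v002.fit.
RUN_RE = re.compile(r"_r(\d+)_")


def pair_consecutive(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Group consecutive output lines into pairs and check they share a run."""
    paths = [line.strip() for line in lines if line.strip()]
    if len(paths) % 2:
        raise ValueError("odd number of paths: a dataset has no partner")
    pairs = []
    for first, second in zip(paths[0::2], paths[1::2]):
        run_a, run_b = RUN_RE.search(first), RUN_RE.search(second)
        if not run_a or not run_b or run_a.group(1) != run_b.group(1):
            raise ValueError(f"lines are not from the same run: {first} | {second}")
        pairs.append((first, second))
    return pairs


base = "/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57"
listing = [
    base + "/ft1/gll_ph_r0236191699_v002.fit\n",
    base + "/ft2/gll_pt_r0236191699_v002.fit\n",
]
pairs = pair_consecutive(listing)
print(pairs)
```

The run-number check catches a listing where one of the two files is missing, which would otherwise silently shift every later pair.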

The same search, but with 'nMetStart' and 'nMetStop' as columns in the output:

noric13:dflath> /afs/slac/g/glast/ground/bin/datacat find \
    --filter '(DataType=="FT1" || DataType=="FT2") && RunMin>=236191699 && RunMax<=236211846' \
    --display nMetStart --display nMetStop --sort Name --sort DataType \
    --search-groups --site SLAC /Data/Flight/Level1/LPA/

/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236191699_v002.fit        236191701.95599103      236195764.0891509
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236191699_v002.fit        236191701.95599103      236195764.0891509
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236197643_v001.fit        236197645.96239495      236198115.084378
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236197643_v001.fit        236197645.96239495      236198115.084378
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/gll_ph_r0236198321_v001.fit        236198324.12405705      236201952.08423495
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236198321_v001.fit        236198324.12405705      236201952.08423495
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft1/gll_ph_r0236209517_v001.fit        236209519.9578979       236211816.08435512
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft2/gll_pt_r0236209517_v001.fit        236209519.9578979       236211816.08435512
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft1/gll_ph_r0236211846_v001.fit        236211848.96936107      236214145.084764
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft2/gll_pt_r0236211846_v001.fit        236211848.96936107      236214145.084764
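
Since the columns are tab-separated, output like the above is easy to turn into records. Here is a rough sketch, assuming (as in the listing above) that the file location comes first and is followed by one column per --display field, in the order given. `parse_find_output` is a hypothetical helper name.

```python
from typing import Dict, Iterable, List, Sequence


def parse_find_output(lines: Iterable[str], fields: Sequence[str]) -> List[Dict[str, str]]:
    """Split tab-separated 'find' output into one dict per dataset."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue                     # skip blank lines
        cols = line.split("\t")
        # Assumption: location first, then one column per --display field.
        if len(cols) != len(fields) + 1:
            raise ValueError(f"expected {len(fields) + 1} columns, got {len(cols)}")
        record = {"location": cols[0]}
        record.update(zip(fields, cols[1:]))
        records.append(record)
    return records


sample = [
    "/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft1/"
    "gll_ph_r0236191699_v002.fit\t236191701.95599103\t236195764.0891509\n"
]
records = parse_find_output(sample, ["nMetStart", "nMetStop"])
for rec in records:
    # Values are kept as strings; convert them where you need numbers.
    duration = float(rec["nMetStop"]) - float(rec["nMetStart"])
    print(rec["location"], duration)
```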

All the FT2 files available from SLAC NFS in no particular order:

noric15:dflath> datacat find --site SLAC --group FT2 /Data/Flight/Level1/LPA/

/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.58/ft2/gll_pt_r0236511638_v003.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236339577_v000.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236345681_v000.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236409925_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.58/ft2/gll_pt_r0236443723_v002.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236271733_v000.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.56/ft2/gll_pt_r0236090205_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/1.57/ft2/gll_pt_r0236351742_v002.fit
... etc ...

All the FT1 files generated after a specific date, sorted by nMetStart:

noric13:dflath> /afs/slac/g/glast/ground/bin/datacat find --filter 'CreateDate >= to_date("20110411","yyyymmdd")' --sort nMetStart --group FT1 --site SLAC /Data/Flight/Level1/LPA/

/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324173700_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324179659_v000.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324185343_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324190937_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324196664_v000.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324202392_v000.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324208120_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324213847_v001.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324219575_v000.fit
/nfs/farm/g/glast/u20/FT1-2copies/glast/Data/Flight/Level1/LPA/prod/2.6/ft1/gll_ph_p116_r0324223503_v000.fit
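
When the cut-off date comes from a script rather than being typed by hand, the filter string can be generated. A minimal sketch, mirroring only the `to_date("20110411","yyyymmdd")` form used in the command above; `created_since` is a hypothetical helper, and other to_date formats are not covered here.

```python
from datetime import date


def created_since(day: date) -> str:
    """Build a filter selecting dataset versions created on or after 'day'."""
    # Same pattern as the command above: an 8-digit date plus the "yyyymmdd" mask.
    return f'CreateDate >= to_date("{day:%Y%m%d}","yyyymmdd")'


print(created_since(date(2011, 4, 11)))
# CreateDate >= to_date("20110411","yyyymmdd")
```

The resulting string is what you would pass as the value of --filter.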