...

  • The directory /nfs/farm/g/glast/u26/MC-tasks/obssim-ST-v7r6p1/output contains approximately 1500 runs

Experiment 1

No Format
[~tonyj:glastlnx07] /nfs/farm/g/glast/u26/MC-tasks/obssim-ST-v7r6p1/output > /usr/bin/time grep cob0343 */logFile.txt
0.16user 0.13system 0:00.73elapsed 39%CPU

Note: This does not scale to a large number of directories, since */logFile.txt is expanded by the shell and the expanded command line eventually becomes too long.
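A minimal sketch of one way around the length limit, assuming a shell in which printf is a builtin (bash and dash both qualify). The three-run directory built here is a made-up stand-in for the real output area, and the host names are illustrative only.

```shell
# Hypothetical stand-in for the output directory: one logFile.txt per run.
demo=$(mktemp -d)
for run in 001 002 003; do
    mkdir "$demo/$run"
    echo "host: cob0001" > "$demo/$run/logFile.txt"
done
echo "host: cob0343" > "$demo/002/logFile.txt"

# Upper bound on the size of the arguments plus environment of one exec call.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"

# printf is a builtin, so the expanded glob is never passed through exec;
# xargs then splits the list into as many grep invocations as are needed.
matches=$(cd "$demo" && printf '%s\n' */logFile.txt | xargs grep -l cob0343)
echo "$matches"

rm -rf "$demo"
```

Newline-separated input to xargs is adequate only while the run directory names contain no whitespace or quote characters.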

Experiment 2

No Format
[~tonyj:glastlnx07] /nfs/farm/g/glast/u26/MC-tasks/obssim-ST-v7r6p1/output > /usr/bin/time find */logFile.txt -exec grep cob0343 \{\} \;
0.44user 1.95system 0:03.41elapsed 69%CPU

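Note: grep is invoked about 1500 times here (once per file), but that does not seem to introduce a huge overhead. Letting find locate the files itself scales much better than a shell glob; as written, however, */logFile.txt is still expanded by the shell before find runs, so the command-line length limit from Experiment 1 still applies.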
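The per-file process cost can also be avoided: the standard "{} +" terminator of -exec lets find pass many paths to each grep. The sketch below compares both forms on a made-up three-run directory (the layout and host names are assumptions), and uses find . -name so that no glob is expanded by the shell.

```shell
# Hypothetical stand-in for the output directory: one logFile.txt per run.
demo=$(mktemp -d)
for run in 001 002 003; do
    mkdir "$demo/$run"
    echo "host: cob0001" > "$demo/$run/logFile.txt"
done
echo "host: cob0343" > "$demo/002/logFile.txt"

# One grep process per file, as in Experiment 2. -H (GNU and BSD grep) forces
# the file name to be printed even though each grep sees a single file.
per_file=$(cd "$demo" && find . -name logFile.txt -exec grep -H cob0343 {} \;)

# Batched form: find appends as many paths as fit to each grep command line.
batched=$(cd "$demo" && find . -name logFile.txt -exec grep -H cob0343 {} +)

echo "$per_file"
echo "$batched"

rm -rf "$demo"
```

Both forms report the same match; only the number of grep processes differs.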

Experiment 3

No Format
[~tonyj:glastlnx07] /nfs/farm/g/glast/u26/MC-tasks/obssim-ST-v7r6p1/output > ls -1 */logFile.txt > /tmp/file.list
cat /tmp/file.list | /usr/bin/time xargs -i grep cob0343 \{\}
0.50user 1.80system 0:03.58elapsed 64%CPU
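With -i (deprecated in GNU xargs in favour of -I), xargs starts one grep per input line, which is why the timing resembles Experiment 2. A hedged sketch of the two modes, run on a made-up three-run directory (the layout and host names are assumptions):

```shell
# Hypothetical stand-in for the output directory: one logFile.txt per run.
demo=$(mktemp -d)
for run in 001 002 003; do
    mkdir "$demo/$run"
    echo "host: cob0001" > "$demo/$run/logFile.txt"
done
echo "host: cob0343" > "$demo/002/logFile.txt"

# Save the file list first, as in Experiment 3.
(cd "$demo" && find . -name logFile.txt > file.list)

# -I{} is the portable spelling of the older -i: one grep per input line.
# xargs exits with status 123 when some grep finds no match, hence "|| true".
one_each=$(cd "$demo" && xargs -I{} grep -H cob0343 {} < file.list) || true

# Without -I, xargs packs many file names into each grep invocation.
packed=$(cd "$demo" && xargs grep -H cob0343 < file.list)

echo "$one_each"
echo "$packed"

rm -rf "$demo"
```

The packed form needs far fewer processes while still keeping each command line below the system limit.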

...

Proposal

Define a new command "pipeline find" that returns a list of files or directories. The list can then be piped to xargs (see Experiment 3 above). To search all log files, for example, we could use the command:

No Format
pipeline find obssim-ST-v7r6p1 obssim logFile | xargs -i grep cob0343 \{\}

or to delete all obsolete working directories we could use

No Format
pipeline find --obsolete obssim-ST-v7r6p1 obssim workingDir | xargs -i rm -r \{\}

pipeline find arguments (work in progress)

Syntax

pipeline find <options> <task-name> <process-name> [<output>, <output>...]

<task-name>
  The task on which to operate. Could allow wildcards or lists for multiple tasks. Can include version and subtasks, e.g. parent(1.0)/child.

<process-name>
  The process name. Could allow wildcards.

--logfile
  Produce a list of log files.

--workingDir
  Produce a list of working directories.

<output>
  An item to output. Defaults to workingDir. See the supported output items below.

--latest
  Show only the "latest" items.

--all
  Show all items, not only the latest.

--obsolete
  Show only obsolete items, i.e. all items minus the latest.

--stream <run-range-list>
  List of stream ranges (not yet implemented).

--filter <filter-spec>
  Filter the results (e.g. exitCode != 0). Filters can use any of the supported output items, including meta-data.
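The relation between --all, --latest and --obsolete is a plain set difference. A small sketch using comm; the directory names are hypothetical, and nothing here reflects how the command itself would compute the result.

```shell
work=$(mktemp -d)

# Hypothetical working directories: two executions of run 001, one of run 002.
printf '%s\n' 001/exec1 001/exec2 002/exec1 | sort > "$work/all.txt"
# Only the most recent execution of each run counts as "latest".
printf '%s\n' 001/exec2 002/exec1 | sort > "$work/latest.txt"

# comm -23 keeps the lines found only in the first file: all minus latest.
obsolete=$(comm -23 "$work/all.txt" "$work/latest.txt")
echo "$obsolete"

rm -rf "$work"
```

comm requires both inputs to be sorted, which is why each list passes through sort first.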

Supported output items

  • workingDir
  • exitCode
  • stream
  • createDate
  • submitDate
  • endDate
  • cpuSecondsUsed
  • host
  • logFile
  • jobId
  • executionNumber
  • isLatest
  • streamPath

or any meta-data item associated with the task.

Example

No Format

pipeline find backgndSC-GR-v10r4 runMonteCarlo -s logFile exitCode stream evtsSim evtsOut --filter "evtsOut>200"
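To illustrate what the filter in this example is meant to select, the sketch below applies the same condition with awk to made-up rows. It assumes tab-separated output with the columns in the order requested (logFile, exitCode, stream, evtsSim, evtsOut). The real output format is not specified here, and every value is invented for illustration.

```shell
# Made-up sample rows: logFile, exitCode, stream, evtsSim, evtsOut.
rows=$(printf '%s\t%s\t%s\t%s\t%s\n' \
    run1/logFile.txt 0 1 1000 250 \
    run2/logFile.txt 0 2 1000 150 \
    run3/logFile.txt 1 3 1000 0)

# Counterpart of --filter "evtsOut>200": keep rows whose fifth column exceeds 200.
kept=$(printf '%s\n' "$rows" | awk -F '\t' '$5 > 200 { print $1 }')
echo "$kept"
```

Only the first row passes, since its evtsOut value is the only one above 200.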