This tutorial explains how to run org.lcsim in a batch computing environment like a Unix command line or from a shell script that could be run on the Grid or your local batch computing system. The user provides what is typically called a "steering" file in HEP. It specifies all the parameters of the batch job. These steering files may have the extension .xml, but it is recommended to use .lcsim instead, to avoid ambiguity with other markup formats.

Setup

Follow the instructions for building lcsim software using maven2. This should result in a working lcsim setup on your local system.

You can now run lcsim from the command-line using the java command from the lcsim directory.

No Format
cd /my/lcsim/dir # where is your lcsim?
java -server -jar ./target/lcsim-[VERSION]-bin.jar myjob.lcsim

The VERSION is replaced by your lcsim build version to point to the actual "bin" file in your target directory.

The myjob.lcsim argument is an example name of a file in the lcsim reconstruction XML format.

For example...

...

...

Simple Job Example

The JobManager class processes your job, which is written in an xml format.

Here is a simple example which will print the event number.

...

The inputFiles section is a list of one or more LCIO input file paths that will be processed. There are actually multiple ways to specify input files (covered below).

The control section sets the job's run parameters. Here we set the maximum numberOfEvents to 100.

The execute section is a list of drivers to be executed in order. The name field of the driver element must correspond with a valid driver.

Finally, the drivers section describes the drivers that will be run on the input file. Certain types of Driver parameters can be set in this section. Here the interval for event printing is set as eventInterval, which is an integer.

The signature for this Driver method looks like this.

No Format
public void setEventInterval(int eventInterval);

LCSim is able to convert XML parameters to method calls on Drivers.
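The conversion from an XML parameter to a setter call can be sketched with plain Java reflection. This is only an illustration under stated assumptions, not LCSim's actual implementation: the PrintDriver and SetterSketch classes and the applyIntParameter helper are hypothetical names, and only int parameters are handled.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a Driver with one int setter (not an LCSim class).
class PrintDriver {
    private int eventInterval = 1;

    public void setEventInterval(int eventInterval) {
        this.eventInterval = eventInterval;
    }

    public int getEventInterval() {
        return eventInterval;
    }
}

class SetterSketch {
    // Turn an element name such as "eventInterval" into "setEventInterval",
    // then call that setter with the element text parsed as an int.
    static void applyIntParameter(Object target, String name, String text) {
        String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        try {
            Method method = target.getClass().getMethod(setter, int.class);
            method.invoke(target, Integer.parseInt(text.trim()));
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No usable setter " + setter, e);
        }
    }

    public static void main(String[] args) {
        PrintDriver driver = new PrintDriver();
        // Plays the role of <eventInterval>10</eventInterval> in a steering file.
        applyIntParameter(driver, "eventInterval", "10");
        System.out.println(driver.getEventInterval());
    }
}
```

A name that has no matching public setter ends in an IllegalArgumentException in this sketch; how LCSim itself reports such a mismatch is not covered here.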

LCSim XML Format

The pseudo-XML below shows all possible XML elements in the LCSim format.

...

The <inputFiles> section contains a list of local or remote files to be processed. It may contain a mixture of any of the elements described below, but it may not be empty, and it must result in at least one input file being found or the job will fail.

file

A <file> element contains a relative or absolute path to a file on the local file system.

No Format
<inputFiles>
    <file>/path/to/local/datafile.slcio</file>
</inputFiles>

fileUrl

Remote files that are accessible via a public URL can be accessed using a <fileUrl> element.

No Format
<inputFiles>
    <fileUrl>ftp://example.org/datafile.slcio</fileUrl>
</inputFiles>

Some batch systems may not support remote file access via a URL. Check with your administrator.

These remote files will be downloaded to the cache directory, which is ~/.cache by default. A different local cache directory can be specified using the <cacheDirectory> tag in the <control> section (covered below).

The <inputFiles> section can contain a mixture of <file> and <fileUrl> objects.

fileSet

Sets of files on the local filesystem with the same base directory can be specified by using the <fileSet> element.

No Format

<fileSet baseDir="/my/data/dir">
    <file>events1.slcio</file>
    <file>events2.slcio</file>
</fileSet>

When processing these files, the base directory "/my/data/dir" will be prepended to each file name to make a complete file path.
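The prepending step can be sketched in a few lines of Java. The FileSetSketch class and its resolve method are hypothetical names used only for illustration; LCSim's own handling may differ in details such as whitespace trimming.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FileSetSketch {
    // Prepend the base directory to each file name to build complete paths.
    static List<Path> resolve(String baseDir, List<String> fileNames) {
        List<Path> resolved = new ArrayList<>();
        for (String name : fileNames) {
            resolved.add(Paths.get(baseDir).resolve(name.trim()));
        }
        return resolved;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("events1.slcio", "events2.slcio");
        for (Path path : resolve("/my/data/dir", names)) {
            System.out.println(path);
        }
    }
}
```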

fileList

The <fileList> element should point to a text file containing a list of files, one per line.

For instance, say that you had a local text file at /example/mylciofiles.txt containing paths to local LCIO files.

No Format

/my/data/dir/events1.slcio
/my/data/dir/events2.slcio

This can be fed into LCSim using this XML code.

No Format

<fileList>/example/mylciofiles.txt</fileList>
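Reading such a list file amounts to taking one path per line. The sketch below is an assumption-laden illustration rather than LCSim code: the FileListSketch class is hypothetical, and the choice to trim whitespace and skip blank lines is an assumption that the source does not state.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class FileListSketch {
    // Keep one path per line, trimming whitespace and dropping blank lines
    // (both behaviours are assumptions of this sketch).
    static List<String> parseFileList(List<String> lines) {
        List<String> paths = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the list file, for example /example/mylciofiles.txt
        for (String path : parseFileList(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(path);
        }
    }
}
```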

fileUrlList

The <fileUrlList> is similar to the <fileList> except it contains URLs to online data instead of paths on the local file system. For instance, the fileUrlList could point to files available via the http or ftp protocols. Some batch systems may not support remote file access via URL. Check with your administrator.

Job Control

The <control> section contains parameters that control the batch job, including the number of events to run and whether various debugging output should be printed.

...

The <skipEvents> argument tells the job manager to skip a number of events up-front before processing the rest.

These tags can also be set to true to print out additional information about the job: <printDriverStatistics>, <printSystemProperties>, <printUserClassPath>, and <printDriversDetailed>.

The <verbose> tag should be set to true for verbose debugging output. This turns on all of the "print" elements described above, which can still be turned off individually by setting them to false after verbose has been turned on.
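The order dependence between verbose and the individual print elements can be sketched as follows. This is a hypothetical model, not the job manager's code: the ControlFlagsSketch class is invented for illustration, and the assumption that every flag defaults to false is not stated in the source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ControlFlagsSketch {
    static final String[] PRINT_FLAGS = {
        "printDriverStatistics", "printSystemProperties",
        "printUserClassPath", "printDriversDetailed"
    };

    // Apply {name, value} pairs in document order. A true "verbose" switches
    // every print flag on; a later element can switch a single flag off again.
    static Map<String, Boolean> apply(String[][] elementsInOrder) {
        Map<String, Boolean> flags = new LinkedHashMap<>();
        flags.put("verbose", false); // assumed default
        for (String flag : PRINT_FLAGS) {
            flags.put(flag, false);  // assumed default
        }
        for (String[] element : elementsInOrder) {
            String name = element[0];
            boolean value = Boolean.parseBoolean(element[1]);
            if (!flags.containsKey(name)) {
                throw new IllegalArgumentException("Unknown control element: " + name);
            }
            flags.put(name, value);
            if (name.equals("verbose") && value) {
                for (String flag : PRINT_FLAGS) {
                    flags.put(flag, true);
                }
            }
        }
        return flags;
    }
}
```

With the elements verbose=true followed by printUserClassPath=false, this sketch leaves every print flag on except printUserClassPath.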

Variable Definitions

The job manager has very limited support for "free" variable definitions, using the <define> block.

...

No Format
<define>
    <aDoubleParam1>1.1</aDoubleParam1>
    <aDoubleParam2>2.2</aDoubleParam2>
    <aDoubleParam3>aDoubleParam1 + aDoubleParam2</aDoubleParam3>
</define>

Variables defined here are also available when passing values to Drivers (covered in the next section).

Driver Execution

The <execute> section specifies the order in which the drivers will be called for each event. Each <driver> tag must have a unique name attribute value that matches the name of a driver defined in the <drivers> section (see next section).

...
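The two constraints on the <execute> section (each name is unique, and each name refers to a defined driver) can be expressed as a small check. The ExecuteOrderSketch class below is a hypothetical illustration; it does not show how LCSim itself validates or reports these conditions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ExecuteOrderSketch {
    // Check that every name listed in <execute> is unique and is also
    // defined in the <drivers> section.
    static void validate(List<String> executeNames, Set<String> definedDrivers) {
        Set<String> seen = new HashSet<>();
        for (String name : executeNames) {
            if (!definedDrivers.contains(name)) {
                throw new IllegalArgumentException("Undefined driver: " + name);
            }
            if (!seen.add(name)) {
                throw new IllegalArgumentException("Duplicate driver in execute: " + name);
            }
        }
    }
}
```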

The <drivers> section contains definitions for all drivers that will be called in the job. These drivers need to be defined in the LCSim package jar or any of the jars in the <classpath>.

Driver Arguments

LCSim can convert XML text into arguments for Driver methods. Only simple method signatures with a single argument are supported, and only a limited number of types are included in this binding.

Here is a table of supported parameter types.

type       | array1d | array2d | expression
int        | yes     | yes     | yes
String     | yes     | no      | no
double     | yes     | yes     | yes
float      | yes     | no      | yes
boolean    | yes     | no      | no
Hep3Vector | no      | no      | no
File       | no      | no      | no
URL        | no      | no      | no

Types with a "yes" in the array1d or array2d columns support arrays of those dimensions. Arrays beyond two dimensions are not supported and would need to be read in manually by user code, perhaps using a method with a File or URL argument. Types that support expression evaluation have a "yes" in the expression column.

Driver Example

The easiest way to understand how the driver parameter conversion works is to study a simple example.

Here is an example Driver class with a number of setter methods.

No Format
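package org.lcsim.example;

import java.io.File;
import java.net.URL;

import hep.physics.vec.Hep3Vector;
import org.lcsim.util.Driver;

public class MyDriver extends Driver
{
    // Setter bodies are left empty for brevity.
    public void setX(int x) { }
    public void setX1(int[] x1) { }
    public void setX2(int[][] x2) { }

    public void setFile(File f) { }
    public void setUrl(URL url) { }
    public void setVector(Hep3Vector vec) { }
}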

The implementation of these methods, which would set private variables to the passed arguments, is left out of the example for brevity.

This is the corresponding XML code in <drivers> that would pass values to each of these methods.

No Format
<driver name="MyDriver" type="org.lcsim.example.MyDriver">
    <x>1</x>
    <x1>1 2 3</x1>
    <x2>1 2 3; 4 5 6</x2>

    <file>/path/to/a/file.txt</file>
    <url>http://example.org/file.txt</url>
    <vector>1.0 2.0 3.0</vector>
</driver>

There are several important things to notice in this example.

The set methods are matched to parameter names by removing the "set" string from the method name and making the first letter lower case. The Driver set methods must begin with "set", or they will be ignored and not matched with any input parameters.

Array arguments are space delimited, so String arguments should not contain spaces.

The rows in 2D arrays are separated by semicolons.

In the above example, integers are used for the 1D and 2D arrays, but other types support arrays, also. See the types table above for specifics.
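The array notation described above (values separated by spaces, rows separated by semicolons) can be parsed as in the following sketch. The ArrayParamSketch class is a hypothetical illustration for int arrays only and is not taken from LCSim.

```java
import java.util.Arrays;

class ArrayParamSketch {
    // Parse "1 2 3" into a 1D int array; values are separated by whitespace.
    static int[] parse1d(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new int[0];
        }
        String[] tokens = trimmed.split("\\s+");
        int[] values = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Integer.parseInt(tokens[i]);
        }
        return values;
    }

    // Parse "1 2 3; 4 5 6" into a 2D int array; rows are separated by semicolons.
    static int[][] parse2d(String text) {
        String[] rows = text.split(";");
        int[][] values = new int[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            values[i] = parse1d(rows[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        // Prints [[1, 2, 3], [4, 5, 6]]
        System.out.println(Arrays.deepToString(parse2d("1 2 3; 4 5 6")));
    }
}
```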

Expression Evaluation

Simple expression evaluation is supported for a limited set of the supported parameter types, including int, double, and float, plus 1D or 2D arrays of these types. Supported symbols include *, /, +, (, ), and -, which have their usual mathematical meaning, plus trig functions like sin and cos. Variables created in <define> can also be accessed by their name. Expressions may also have units (see next section).

The GNU JEL library provides this capability. Refer to its documentation for further information on the expression format.
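To make the expression rules concrete, here is a small stand-in evaluator written as a recursive-descent parser. It is explicitly not JEL and does not use the JEL API; the ExpressionSketch class is hypothetical, supports only the operators listed above plus sin and cos, and takes named variables (such as those from <define>) from a plain map. Units are not handled.

```java
import java.util.Map;

// Stand-in recursive-descent evaluator for illustration; this is NOT the JEL API.
class ExpressionSketch {
    private final String text;
    private final Map<String, Double> variables;
    private int pos = 0;

    private ExpressionSketch(String text, Map<String, Double> variables) {
        this.text = text;
        this.variables = variables;
    }

    // Evaluate an expression, looking up names in the given variable map.
    static double evaluate(String text, Map<String, Double> variables) {
        ExpressionSketch parser = new ExpressionSketch(text, variables);
        double value = parser.sum();
        parser.skipSpaces();
        if (parser.pos != text.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return value;
    }

    private void skipSpaces() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) pos++;
    }

    // Consume the character c if it comes next (after optional spaces).
    private boolean accept(char c) {
        skipSpaces();
        if (pos < text.length() && text.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void expect(char c) {
        if (!accept(c)) throw new IllegalArgumentException("Expected '" + c + "' at position " + pos);
    }

    // sum := product (('+' | '-') product)*
    private double sum() {
        double value = product();
        while (true) {
            if (accept('+')) value += product();
            else if (accept('-')) value -= product();
            else return value;
        }
    }

    // product := factor (('*' | '/') factor)*
    private double product() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := '-' factor | '(' sum ')' | number | name | name '(' sum ')'
    private double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {
            double inner = sum();
            expect(')');
            return inner;
        }
        int start = pos; // accept() has already skipped leading spaces
        while (pos < text.length()
                && (Character.isLetterOrDigit(text.charAt(pos)) || text.charAt(pos) == '.')) {
            pos++;
        }
        String token = text.substring(start, pos);
        if (token.isEmpty()) throw new IllegalArgumentException("Expected a value at position " + pos);
        if (Character.isDigit(token.charAt(0)) || token.charAt(0) == '.') {
            return Double.parseDouble(token);
        }
        if (accept('(')) {
            double arg = sum();
            expect(')');
            if (token.equals("sin")) return Math.sin(arg);
            if (token.equals("cos")) return Math.cos(arg);
            throw new IllegalArgumentException("Unknown function " + token);
        }
        Double value = variables.get(token);
        if (value == null) throw new IllegalArgumentException("Unknown variable " + token);
        return value;
    }
}
```

For example, with a map holding aDoubleParam1 = 1.1 and aDoubleParam2 = 2.2, calling ExpressionSketch.evaluate("aDoubleParam1 + aDoubleParam2", map) adds the two defined values. The grammar in the comments gives * and / higher precedence than + and -, which matches their usual mathematical meaning.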

Units

LCSim supports the named units defined by CLHEP's SystemOfUnits.

...

Guidelines for Creating Compatible Drivers

Drivers that will be accessed via an LCSim XML file need to follow these guidelines.

  • The Driver class must be public.
  • The Driver class must have a public constructor that takes no arguments.
  • The Driver's constructor should not do any initialization. It should instead use the detectorChanged() or startOfData() methods, which are called after all input parameters are processed.
  • The driver's set methods to be accessed in the XML should always be of the form
    No Format
    public void set[ParameterName]([type] [varName])
    Set methods not of this form will not be accessible as XML parameters.
  • The use of sub-drivers is discouraged because they are inaccessible to the job manager, though it is still possible to use them. Any dependence of child Drivers on XML input parameters can be handled by using the startOfData() method to add a new child Driver instance.
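The guidelines above can be illustrated with a self-contained sketch of the lifecycle: construct with no arguments, apply setters, then initialize in startOfData(). Everything here is hypothetical and for illustration only: SketchDriver is a stand-in for the real LCSim Driver base class so that the example compiles on its own, and BinningDriver and createAndStart are invented names.

```java
class DriverLifecycleSketch {
    // Hypothetical stand-in for the LCSim Driver base class.
    public abstract static class SketchDriver {
        protected void startOfData() { }
    }

    // A public Driver class with a public no-argument constructor.
    public static class BinningDriver extends SketchDriver {
        private int bins = 10;   // default value, may be overridden from XML
        private double[] counts; // depends on bins, so it is allocated later

        public BinningDriver() {
            // Intentionally empty: parameters have not been set yet.
        }

        // Setter of the form: public void set[ParameterName]([type] [varName])
        public void setBins(int bins) {
            this.bins = bins;
        }

        @Override
        protected void startOfData() {
            // All input parameters have been processed by this point.
            counts = new double[bins];
        }

        public int countsLength() {
            return counts.length;
        }
    }

    // Mimics the order described above: construct, set parameters, then start.
    static BinningDriver createAndStart(int bins) {
        try {
            BinningDriver driver = BinningDriver.class.getDeclaredConstructor().newInstance();
            driver.setBins(bins);
            driver.startOfData();
            return driver;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate driver", e);
        }
    }
}
```

Had the counts array been allocated in the constructor, it would always use the default of 10 bins, because the setter runs only after construction. That is the reason initialization belongs in startOfData() or detectorChanged().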

Running a Specific LCSim Release

Running your job with the lcsim jar is straightforward. Download the bin jar from the repository and use the java command to execute your steering file.

No Format
wget http://www.lcsim.org/maven2/org/lcsim/lcsim/1.14-SNAPSHOT/lcsim-1.14-SNAPSHOT-bin.jar
java -jar ./lcsim-1.14-SNAPSHOT-bin.jar mySteeringFile.xml

As always, the steering file must be provided by the user.