Processing Data in Batch Mode using LCSim XML

Overview

If you did not get here by following the LCSim Tutorials, you may want to go back and review them as necessary.

...

Running from the Command Line

Before starting, you need a working lcsim installation on your local machine. Follow the instructions for building the lcsim software using maven2.

You can now run lcsim from the command-line using the java command from the lcsim directory.

No Format

cd trunk/distribution # adjust to the location of your lcsim checkout
java -server -jar ./target/lcsim-distribution-[VERSION]-bin.jar myjob.lcsim

...

The myjob.lcsim argument is an example name of a file in the lcsim reconstruction XML format.

LCSim Command Line Options

Running the jar without any arguments will print the usage instructions.

In the rest of this documentation, the runnable jar is referred to as lcsim-distribution-bin.jar, but the actual jar will have the version number in its name.

No Format

java -jar lcsim-distribution-bin.jar [options] steeringFile.xml
usage:
 -D    Define a variable with form [name]=[value]
 -n    Set the max number of events to process.
 -p    Load a properties file containing variable definitions
 -q    Turn on quiet mode.
 -s    Set the number of events to skip.
 -v    Turn on verbose mode
 -w    Rewrite the XML file with variables resolved
 -x    Perform a dry run which does not process events

...

For instance, an LCIO input file could be defined using a variable.

No Format

<file>${inputFile}</file>

Then this file could be specified at the command line.

No Format

java -jar lcsim-distribution-bin.jar -DinputFile=myInputFile.slcio steeringFile.xml

This variable could also be set in a properties file.

No Format

java -jar lcsim-distribution-bin.jar -pmySettings.prop steeringFile.xml

The file mySettings.prop could contain the following.

No Format

inputFile=myInputFile.slcio
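The substitution described above can be sketched in a few lines of Java. This is only an illustration of the idea, not LCSim's own resolver: the class name VariableSketch and the method resolve are hypothetical, and leaving unknown variables untouched is an assumption.

```java
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical stand-in, not LCSim's own variable resolver.
final class VariableSketch {
    // Matches ${name} and captures the name.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} in text with its value from vars; unknown names are left as-is (an assumption). */
    static String resolve(String text, Properties vars) {
        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.getProperty(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Same content as the mySettings.prop example.
        Properties vars = new Properties();
        vars.load(new StringReader("inputFile=myInputFile.slcio\n"));
        System.out.println(resolve("<file>${inputFile}</file>", vars));
    }
}
```

Run as is, the sketch prints the `<file>` element with the variable replaced by myInputFile.slcio.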

...

Here is a simple example which will print the event number.

No Format

<lcsim xmlns:xs="http://www.w3.org/2001/XMLSchema-instance" 
       xs:noNamespaceSchemaLocation="http://www.lcsim.org/schemas/lcsim/1.0/lcsim.xsd">
    <inputFiles>
        <file>./myEvents.slcio</file>
    </inputFiles>
    <control>
        <numberOfEvents>100</numberOfEvents>
    </control>
    <execute>
        <driver name="EventMarkerDriver"/>
    </execute>
    <drivers>
        <driver name="EventMarkerDriver"
                type="org.lcsim.job.EventMarkerDriver">
            <eventInterval>1</eventInterval>
        </driver>
    </drivers>
</lcsim>

...

The signature for this Driver method looks like this.

No Format

public void setEventInterval(int eventInterval);

...

The pseudo-XML below shows all possible elements in the LCSim format.

No Format

<lcsim xmlns:xs="http://www.w3.org/2001/XMLSchema-instance" 
       xs:noNamespaceSchemaLocation="http://www.lcsim.org/schemas/lcsim/1.0/lcsim.xsd">
    <inputFiles>
        <fileUrl />
        <file />
        <fileSet>
            <file />
        </fileSet>
        <fileList />
        <fileUrlList />
        <fileRegExp />
    </inputFiles>
    <control>
        <dryRun>true</dryRun>
        <logFile>/path/to/mylog.txt</logFile>
        <cacheDirectory>/path/to/mycache/</cacheDirectory>
        <skipEvents>1</skipEvents>
        <numberOfEvents>1000</numberOfEvents>
        <verbose>true</verbose>
        <printDriverStatistics>true</printDriverStatistics>
        <printSystemProperties>true</printSystemProperties>
        <printUserClassPath>true</printUserClassPath>
        <printDriversDetailed>true</printDriversDetailed>
    </control>
    <classpath>
        <jarUrl />
        <jar />
        <directory />
    </classpath>
    <define>
        <anExampleVariable>1234</anExampleVariable>
    </define>
    <execute>
        <driver name="ExampleDriver" />
    </execute>
    <drivers>
        <driver name="ExampleDriver" type="org.lcsim.example.ExampleDriver">
            <exampleParam>1234</exampleParam>
            <exampleArrayParam>1 2 3 4</exampleArrayParam>
            <exampleArray2DParam>1 2 3 4; 5 6 7 8</exampleArray2DParam>
        </driver>
    </drivers>
</lcsim>

...

The <inputFiles> section contains a list of local or remote files to be processed. It may contain a mixture of any of the elements described below, but it may not be empty, and it must resolve to at least one input file or the job will fail.

file

The <file> element contains a relative or absolute path to a file on the local file system.

No Format

<inputFiles>
    <file>/path/to/local/datafile.slcio</file>
</inputFiles>

fileUrl

Remote files that are accessible via a public URL can be specified using a <fileUrl> element.

No Format

<inputFiles>
    <fileUrl>ftp://example.org/datafile.slcio</fileUrl>
</inputFiles>

Some batch systems may not support remote file access via a URL. Check with your administrator.

...

Sets of files on the local filesystem with the same base directory can be specified by using the <fileSet> element.

No Format

<fileSet baseDir="/my/data/dir">
    <file>events1.slcio</file>
    <file>events2.slcio</file>
</fileSet>

...

For instance, say that you had a local text file at /example/mylciofiles.txt containing paths to local LCIO files.

No Format

/my/data/dir/events1.slcio
/my/data/dir/events2.slcio

This can be fed into LCSim using this XML code.

No Format

<fileList>/example/mylciofiles.txt</fileList>
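As a rough illustration of what a file list amounts to, the following Java sketch reads a text file with one path per line. The class FileListSketch is hypothetical and is not part of LCSim. Trimming whitespace and skipping blank lines are assumptions made here for robustness, not documented LCSim behavior.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: a hypothetical reader for a one-path-per-line list file.
final class FileListSketch {
    /** Trims each line and drops blank ones (an assumption), preserving order. */
    static List<String> parse(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    /** Reads listFile from disk and returns the paths it contains. */
    static List<String> readPaths(Path listFile) throws IOException {
        return parse(Files.readAllLines(listFile));
    }

    public static void main(String[] args) throws IOException {
        // Write a temporary list file with the two paths from the example above.
        Path list = Files.createTempFile("mylciofiles", ".txt");
        Files.write(list, Arrays.asList("/my/data/dir/events1.slcio", "/my/data/dir/events2.slcio"));
        for (String path : readPaths(list)) {
            System.out.println(path);
        }
        Files.delete(list);
    }
}
```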

...

The <fileUrlList> is similar to the <fileList> except it contains URLs to online data instead of paths on the local file system. For instance, the fileUrlList could point to files available via the http or ftp protocols.

fileRegExp

The <fileRegExp> element will include files that match a regular expression.

Here is an example that would match files similar to input1.slcio, input2.slcio, etc. in the current directory.

No Format

<fileRegExp baseDir=".">input*[0-9].slcio</fileRegExp>

See http://docs.oracle.com/javase/tutorial/essential/regex/ for more information about regular expressions in Java.
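Java regular expressions differ from shell wildcards, so it can help to test an expression before using it. The sketch below reads the expression from the example above as a Java regular expression, using the standard java.util.regex.Pattern class. In this syntax `*` repeats the preceding `t` and the unescaped `.` matches any single character. The class FileRegExpSketch is hypothetical, and whether LCSim requires the whole file name to match is an assumption here.

```java
import java.util.regex.Pattern;

// Illustrative only: checks file names against the example expression.
final class FileRegExpSketch {
    // The expression from the example above, read as a Java regular expression.
    static final Pattern EXAMPLE = Pattern.compile("input*[0-9].slcio");

    /** True when the whole file name matches the example expression. */
    static boolean matches(String fileName) {
        return EXAMPLE.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        String[] names = {"input1.slcio", "input12.slcio", "inpu1.slcio", "input1Xslcio", "inputA.slcio"};
        for (String name : names) {
            System.out.println(name + " -> " + matches(name));
        }
    }
}
```

Under these assumptions, input1.slcio, inpu1.slcio and input1Xslcio match, while input12.slcio and inputA.slcio do not. If multi-digit file numbers and a literal dot are wanted, an expression such as `input[0-9]+\.slcio` may be closer to the intent.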

Job Control

The <control> section contains parameters that control the batch job, including the number of events to run and whether various debugging output should be printed.

...

The following will turn on all verbose output but turn off the printing of the system properties.

No Format

<control>
    <verbose>true</verbose>
    <printSystemProperties>false</printSystemProperties>
</control>

...

Here is an example of a simple double parameter.

No Format

<define>
    <aDoubleParam>1.1</aDoubleParam>
</define>

Variables defined here can be included in expressions by using their name.

No Format

<define>
    <aDoubleParam1>1.1</aDoubleParam1>
    <aDoubleParam2>2.2</aDoubleParam2>
    <aDoubleParam3>aDoubleParam1 + aDoubleParam2</aDoubleParam3>
</define>

...

Here is an example pointing to a (nonexistent) jar at a URL.

No Format

<classpath>
    <jarUrl>http://www.example.org/something/myjar.jar</jarUrl>
</classpath>

The same thing can be done with local jar files and directories.

No Format

<classpath>
    <jar>/path/to/myjar.jar</jar>
    <directory>/path/to/myclassfiles</directory>
</classpath>

...

Here is an example Driver class with a number of setter methods.

No Format

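package org.lcsim.example;

import java.io.File;
import java.net.URL;

import hep.physics.vec.Hep3Vector;
import org.lcsim.util.Driver;

// Setter bodies are left empty; a real Driver would store the values.
public class MyDriver extends Driver
{
    public void setX(int x) {}
    public void setX1(int[] x1) {}
    public void setX2(int[][] x2) {}

    public void setFile(File f) {}
    public void setUrl(URL url) {}
    public void setVector(Hep3Vector vec) {}
}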

...

This is the corresponding XML code in <drivers> that would pass values to each of these methods.

No Format

<driver name="MyDriver" type="org.lcsim.example.MyDriver">
    <x>1</x>
    <x1>1 2 3</x1>
    <x2>1 2 3; 4 5 6</x2>
    <file>/path/to/a/file.txt</file>
    <url>http://example.org/file.txt</url>
    <vector>1.0 2.0 3.0</vector>
</driver>
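The array notation in the XML above (spaces between values, `;` between rows) can be illustrated with a small parser. This is a sketch of the text format only, not LCSim's actual conversion code. The class ArrayParamSketch and its methods are hypothetical.

```java
import java.util.Arrays;

// Illustrative only: parses the space- and semicolon-separated array notation.
final class ArrayParamSketch {
    /** Parses "1 2 3" into {1, 2, 3}; values are separated by whitespace. */
    static int[] parse1D(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    /** Parses "1 2 3; 4 5 6" into {{1, 2, 3}, {4, 5, 6}}; ';' separates rows. */
    static int[][] parse2D(String text) {
        return Arrays.stream(text.split(";"))
                .map(ArrayParamSketch::parse1D)
                .toArray(int[][]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse1D("1 2 3")));            // value of <x1>
        System.out.println(Arrays.deepToString(parse2D("1 2 3; 4 5 6"))); // value of <x2>
    }
}
```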

...

  • The Driver class must be public.
  • The Driver class must have a public constructor that takes no arguments.
  • The Driver's constructor should not do any initialization. It should instead use the detectorChanged() or startOfData() methods, which are called after all input parameters are processed.
  • The set methods to be accessed in the XML should always be of the form

    No Format
    public void set[ParameterName]([type] [varName])

    Set methods not of this form will not be accessible as XML parameters.

  • The use of sub-drivers is discouraged because they cannot be configured through the XML format, though it is still possible to use them. If a child Driver depends on its parent's XML input parameters, the parent can create and add the child Driver instance in its startOfData() method, after those parameters have been set.
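The naming rule for set methods in the list above can be illustrated with plain Java reflection. The sketch below is an assumption about the general mechanism, not LCSim's actual implementation. SetterLookupSketch and ExampleDriver are hypothetical, and ExampleDriver does not extend the real Driver class so that the example stays self-contained.

```java
import java.lang.reflect.Method;

// Illustrative only: maps an XML element name to a set method by reflection.
final class SetterLookupSketch {
    /** Hypothetical stand-in for a Driver: public, no-argument constructor, standard setter. */
    public static class ExampleDriver {
        private int eventInterval;

        public ExampleDriver() {}

        public void setEventInterval(int eventInterval) { this.eventInterval = eventInterval; }

        public int getEventInterval() { return eventInterval; }
    }

    /** Maps an element name such as "eventInterval" to "setEventInterval". */
    static String setterName(String elementName) {
        return "set" + Character.toUpperCase(elementName.charAt(0)) + elementName.substring(1);
    }

    /** Finds the public set method taking one int and calls it with the parsed element text. */
    static void applyIntParameter(Object driver, String elementName, String text) {
        try {
            Method setter = driver.getClass().getMethod(setterName(elementName), int.class);
            setter.invoke(driver, Integer.parseInt(text.trim()));
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No usable set method for <" + elementName + ">", e);
        }
    }

    public static void main(String[] args) {
        ExampleDriver driver = new ExampleDriver();
        applyIntParameter(driver, "eventInterval", "1"); // <eventInterval>1</eventInterval>
        System.out.println(driver.getEventInterval());
    }
}
```

A method that does not follow the set[ParameterName] pattern would not be found by such a lookup, which is consistent with the rule that only methods of that form are accessible as XML parameters.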

How to Run a Specific Release

You do not need to build lcsim yourself in order to run a specific LCSim release. Download the bin jar from the lcsim repository, and then use the java command to execute your steering file.

No Format

wget http://www.lcsim.org/maven2/org/lcsim/lcsim/1.18-SNAPSHOT/lcsim-1.18-SNAPSHOT-bin.jar
java -jar ./lcsim-1.18-SNAPSHOT-bin.jar mySteeringFile.xml

Running LCSim this way can cause errors, for example if the steering file was written for a different version in which Driver methods have been changed, removed, or renamed. The SLAC Nexus Repository can be searched for all lcsim-distribution releases; the results are displayed in a table with download links. The bin.jar links are the runnable jars, which can be downloaded to your machine and run as described above.