h1. Processing Data in Batch Mode using LCSim XML

h2. Basics

This tutorial explains how to run [org.lcsim] in a batch computing environment, such as from a Unix command line or from a shell script that could be submitted to the grid.

If you did not arrive here by following the [LCSim Tutorials], go back and read or review them as necessary.

h2. Setup

Follow the instructions for [building lcsim software using maven2].

You can now run _lcsim_ from the command-line using the _java_ command.

{noformat}
java -server -jar ./target/lib/lcsim-[VERSION].jar [XML]
{noformat}

Replace *VERSION* with your lcsim build version and *XML* with the path to a file in the lcsim recon XML format, for example:

{noformat}
java -server -jar ./target/lib/lcsim-1.11-SNAPSHOT.jar ./myJob.xml
{noformat}

h2. LCSim XML Format

The following skeleton shows all possible XML elements in the LCSim format.

{noformat}
<lcsim>
    <inputFiles>
        <fileUrl />
        <file />
    </inputFiles>
    <control>
        <logFile />
        <cacheDirectory />
        <numberOfEvents />
        <verbose />
        <printDriverStatistics />
        <printSystemProperties />
        <printUserClassPath />
        <printDriversDetailed />
    </control>
    <classpath>
        <jarUrl />
        <jar />
    </classpath>
    <execute>
        <driver name="ExampleDriver" />
    </execute>
    <drivers>
        <driver name="ExampleDriver" type="org.lcsim.example.ExampleDriver">
            <exampleParam />
        </driver>
    </drivers>
</lcsim>
{noformat}

Each of these XML sections will be explained in greater detail below.

h3. Input Files

The *<inputFiles>* section contains a list of local or remote files to be processed.

These can be *<file>* elements which contain a relative or absolute path to a file on the local file system.

{noformat}
<inputFiles>
    <file>/path/to/local/datafile.slcio</file>
</inputFiles>
{noformat}

Remote files that are accessible via a public URL can be read using a *<fileUrl>* element.

{noformat}
<inputFiles>
    <fileUrl>ftp://example.org/datafile.slcio</fileUrl>
</inputFiles>
{noformat}

These remote files will be downloaded to the cache directory, which is *~/.cache* by default.  A different local cache directory can be specified using the *<cacheDirectory>* tag (covered below).

The *<inputFiles>* section can contain a mixture of *<file>* and *<fileUrl>* elements.

Some batch systems may not support remote file access via URL.  Check with your administrator.

h2. Simple Example

The [JobManager|http://www.lcsim.org/software/lcsim/apidocs/org/lcsim/job/JobControlManager.html] class processes your job, which is written in an XML format.

Here is a simple example that prints the event number.

{noformat}
<lcsim>
    <inputFiles>
        <file>./myEvents.slcio</file>
    </inputFiles>
    <control>
        <numberOfEvents>100</numberOfEvents>
    </control>
    <execute>
        <driver name="EventMarkerDriver"/>
    </execute>
    <drivers>
        <driver name="EventMarkerDriver"
                type="org.lcsim.job.EventMarkerDriver">
            <eventInterval>1</eventInterval>
        </driver>
    </drivers>
</lcsim>
{noformat}

The *inputFiles* section is a list of one or more [LCIO] input *file* paths that will be processed.

The *control* section sets the job's run parameters.  Here we set the maximum *numberOfEvents*.

The *execute* section is a list of drivers to be executed _in order_.  The *name* field of the *driver* element must correspond with a valid driver.
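The name-matching rule can be checked before a job is submitted. The following is a minimal sketch, not part of lcsim: the class {{ExecuteOrderSketch}} and its methods are hypothetical names, and it only uses the standard Java DOM parser to list the *execute* entries in order and report any that have no matching definition in *drivers*.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; this is not an lcsim class.
class ExecuteOrderSketch {

    // Names of the <driver> elements inside the given section, in document order.
    static List<String> driverNames(Document doc, String section) {
        List<String> names = new ArrayList<>();
        Element parent = (Element) doc.getDocumentElement()
                .getElementsByTagName(section).item(0);
        if (parent == null) {
            return names; // section is absent
        }
        NodeList drivers = parent.getElementsByTagName("driver");
        for (int i = 0; i < drivers.getLength(); i++) {
            names.add(((Element) drivers.item(i)).getAttribute("name"));
        }
        return names;
    }

    // Names listed in <execute> that have no definition in <drivers>.
    static List<String> undefinedDrivers(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Set<String> defined = new HashSet<>(driverNames(doc, "drivers"));
        List<String> missing = new ArrayList<>();
        for (String name : driverNames(doc, "execute")) {
            if (!defined.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<lcsim>"
                + "<execute><driver name=\"EventMarkerDriver\"/></execute>"
                + "<drivers><driver name=\"EventMarkerDriver\""
                + " type=\"org.lcsim.job.EventMarkerDriver\"/></drivers>"
                + "</lcsim>";
        // An empty list means every executed driver has a definition.
        System.out.println(undefinedDrivers(xml));
    }
}
```

A check like this only validates names; whether the *type* attribute resolves to a real class is still determined when the job runs.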

Finally, the *drivers* section describes the drivers that will be run on the input file.  Certain types of Driver parameters can be set in this section.  Here the interval for event printing is set as *eventInterval*, which is an integer.

The signature of the corresponding setter method looks like this.

{noformat}
public void setEventInterval(int eventInterval);
{noformat}

The JobManager is able to convert from XML to these simple setters using JavaBeans.  All Java primitive types are accepted, as are 1D arrays of these types.  The setter method must take exactly one argument.
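To make the setter convention concrete, here is a minimal sketch of how an element's text could be mapped onto a one-argument setter with the standard {{java.beans}} introspection API. It is not the JobManager's actual code: {{SetterSketch}}, {{HypotheticalDriver}} and the *thresholds* parameter are invented for illustration, only a few types are handled, and splitting array values on whitespace is an assumption about the text format.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Array;

// Hypothetical illustration; none of these classes belong to lcsim.
class SetterSketch {

    // Stand-in for a Driver with two JavaBeans-style properties.
    public static class HypotheticalDriver {
        private int eventInterval;
        private double[] thresholds = new double[0];

        public void setEventInterval(int eventInterval) { this.eventInterval = eventInterval; }
        public int getEventInterval() { return eventInterval; }
        public void setThresholds(double[] thresholds) { this.thresholds = thresholds; }
        public double[] getThresholds() { return thresholds; }
    }

    // Convert one token to a scalar of the requested type (only a few types handled).
    static Object convert(String text, Class<?> type) {
        if (type == int.class) return Integer.parseInt(text);
        if (type == double.class) return Double.parseDouble(text);
        if (type == boolean.class) return Boolean.parseBoolean(text);
        if (type == String.class) return text;
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    // Find the setter for the named property and call it with the converted text.
    static void setProperty(Object bean, String name, String text) throws Exception {
        BeanInfo info = Introspector.getBeanInfo(bean.getClass());
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            if (!pd.getName().equals(name) || pd.getWriteMethod() == null) {
                continue;
            }
            Class<?> type = pd.getPropertyType();
            Object value;
            if (type.isArray()) {
                // Assumption: array values are separated by whitespace.
                String[] tokens = text.trim().split("\\s+");
                value = Array.newInstance(type.getComponentType(), tokens.length);
                for (int i = 0; i < tokens.length; i++) {
                    Array.set(value, i, convert(tokens[i], type.getComponentType()));
                }
            } else {
                value = convert(text.trim(), type);
            }
            pd.getWriteMethod().invoke(bean, value);
            return;
        }
        throw new IllegalArgumentException("No setter for property: " + name);
    }

    public static void main(String[] args) throws Exception {
        HypotheticalDriver driver = new HypotheticalDriver();
        setProperty(driver, "eventInterval", "1");       // like <eventInterval>1</eventInterval>
        setProperty(driver, "thresholds", "0.5 1.5 2.5"); // a 1D array parameter
        System.out.println(driver.getEventInterval());
        System.out.println(driver.getThresholds().length);
    }
}
```

The practical consequence for driver authors is the naming rule: an element called *eventInterval* is matched to a public method named {{setEventInterval}} that takes a single argument.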

h2. Running a Specific LCSim Release

When an LCSim release is made, a zip file is created containing the LCSim jar and all its dependencies.  Running a specific version of LCSim from the command line is as simple as downloading this zip file, unzipping it, and using the java command to run the jar with your XML input.

Retrieve the dependencies zip file for the version you want to run.

{noformat}
wget http://www.lcsim.org/maven2/org/lcsim/lcsim/1.4/lcsim-1.4-deps.zip
{noformat}

You can also paste this URL into your browser, which should prompt you to download the file.  (Specifics depend on your browser.)

Now, unzip the dependencies zip file.  All the jars will show up in a directory called *lib/* in your current directory.

{noformat}
unzip lcsim-1.4-deps.zip
{noformat}

This uses the command-line unzip utility, but a zip program with a GUI such as WinZip or WinRAR would work fine, too.

We're ready to run this version of lcsim.  This step requires Java 1.5 or greater to be installed and accessible from your command terminal.

{noformat}
java -server -jar ./lib/lcsim-1.4.jar ./myJob.xml
{noformat}

Each release is also tagged in CVS, for example *lcsim-1_4*, so checking out the tagged source and rebuilding it yourself is another possibility.  (Not covered here.)