Online Reconstruction Tools

Overview

In previous runs, the Monitoring Application was used to run the reconstruction on events read from the ET ring and to produce plots in a custom Java GUI.

This tool has some serious limitations such as:

Requiring a large amount of configuration for each reconstruction station
Not being able to scale horizontally by utilizing multiple CPUs to run reconstruction in parallel
Taking a lot of system memory per process
Being monolithic and difficult to extend/improve
Using a no longer maintained plotting backend based on JFreeChart

With these limitations in mind, a new system was developed around components such as a server, a client, and reconstruction (ET) stations. The new system provides plot data using a remote AIDA tree, which can be connected to over an RMI connection and browsed or displayed as the data streams from the ET ring. The server manages the aggregation of plot and performance data from each reconstruction station, which runs within its own system process. A simple event bus was developed to replace the loop-based system in the old monitoring app. A client provides a command line and console interface for creating, starting, monitoring, stopping, and removing stations managed by the server. Commands are sent to the server as JSON over a TCP/IP socket connection, and the server may send back a JSON response or stream data back to the client (e.g. for streaming station log data).

Histograms are booked and filled within the station's Driver code. The plots are stored in an AIDA tree which is remotely accessible (read only). The server mounts these station trees into its remote tree and performs aggregation of histograms, clouds, and profiles into a combined tree. The combined histogram data can be saved to ROOT or AIDA files.

Display clients such as JAS3 or a Java webapp can connect to the server's remote tree and view both the remote (station) plots and the combined plots in real time.

Building the Java Online Reconstruction Package

Checkout and build hps-java using:

mkdir /scratch && cd /scratch
git clone hps-java && cd hps-java && git checkout online-recon-dev
mvn clean install -DskipTests
cd online-recon

# The -DassembleDirectory argument should be where you want the online recon scripts installed.
mvn install -DskipTests -DskipCheckstyle -DassembleDirectory=/scratch
export PATH=/scratch/bin:$PATH
cd /scratch/
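
Before running the build steps above, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch: check_tools is a hypothetical helper (not part of hps-java), and java is included on the assumption that Maven needs a JDK to build the project.

```shell
# Report any of the named commands that are not on the PATH.
# Returns 0 if all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Tools used by the build steps (java is an assumed requirement of Maven).
check_tools git mvn java || echo "install the missing tools before building" >&2
```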

Running the Server and Client

You need to have a running ET ring with events streaming to it before starting the server (see Running a Local ET Ring and Producer).

You can launch the server now using a command like:

hps-recon-server --host localhost -w $PWD/stations &

Instead of writing to your terminal, the server will create a log file at logs/server.log which you can tail to check the server's log messages.

Now you can connect to the running server using the online recon client:

hps-recon-client --host localhost

You can leave out the --host to have the server and client use the actual system name (usually equivalent to the result of the hostname or uname -n command on Linux).

An initial configuration can be provided to the client using the -c switch.

You can create a file called station.prop and paste this into it:

lcsim.detector=HPS-PhysicsRun2016-Pass2
lcsim.run=7798
lcsim.steering=/org/hps/steering/recon/PhysicsRun2016OnlineRecon.lcsim

These settings are usually going to be specific to the year of the HPS data being read from the ET ring. The above are settings based on the 2016 physics reconstruction.
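
As a sketch of the step above, the file can also be written from the shell with a here-document, followed by a quick check for repeated keys. The duplicate check is an illustrative addition, not something the client is known to require.

```shell
# Write the example 2016 settings to station.prop in the current directory.
cat > station.prop <<'EOF'
lcsim.detector=HPS-PhysicsRun2016-Pass2
lcsim.run=7798
lcsim.steering=/org/hps/steering/recon/PhysicsRun2016OnlineRecon.lcsim
EOF

# Print any property key that appears more than once.
# No output means every key is unique.
cut -d= -f1 station.prop | sort | uniq -d
```

The resulting file would then be passed to the client with the -c switch; the exact form of the argument (for example -c station.prop) is an assumption based on the description above.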

Launching the client opens the interactive online reconstruction console, which can be used to configure, create, start/stop, and remove stations that run the HPS physics event reconstruction on data from the ET ring.

Whenever you see online> it means the command is run in the online reconstruction console, not the system shell (like bash).

Type help into the console to show the documentation for the client command line interface:

online>help

This is the output of the above command:

GENERAL

    help - print general information
    help [cmd] - print information for specific command
    exit - quit the console

SETTINGS

    port [port] - set the server port
    host [host]- set the server hostname
    file [filename] - write server output to a file
    append [true|false] - true to append to output file or false to overwrite
    terminal - redirect server output back to the terminal

COMMANDS

    config - Set new server configuration properties
    create - Create a new station
    list - List station information in JSON format
    log - Tail log file of station (hit any key to stop tailing)
    remove - Remove a station that is inactive
    save - Save the current set of plots to a ROOT or AIDA file
    set - Set a configuration property
    shutdown - Shutdown the server
    start - Start a station that is inactive
    status - Show server and station status
    stop - Stop a station

You can set the connection settings to the server using the port and host commands, though the defaults should usually work fine.

Configuration of the online reconstruction stations can be provided using the set command, for instance:

# lcsim detector name for conditions
online>set lcsim.detector HPS-PhysicsRun2016-Pass2

# run number for conditions
online>set lcsim.run 7798

# lcsim steering resource
online>set lcsim.steering /org/hps/steering/recon/PhysicsRun2016OnlineRecon.lcsim

FIXME: This example is very specific to 2016 data.

The steering file should contain a Driver that extends the RemoteAidaDriver, in order to provide station plot data to the server.

A single station with the above configuration can be created using:

online>create 1

The argument is the number of stations you want the server to create. The number of stations can potentially be scaled up to around the number of cores on your machine, for example:

online>create 8

You will need to test how many stations you can run simultaneously given the specific configuration being used and the number of plots being created/updated.
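
As a starting point for such testing, the core count can be read from the system shell. This is a rough sketch: leaving one core free for the server is an assumption made here for illustration, not a project recommendation.

```shell
# Number of CPU cores: nproc on Linux, with a widely available getconf fallback.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Leave one core free for the server, but always suggest at least one station.
stations=$(( cores > 1 ? cores - 1 : 1 ))
echo "cores=$cores suggested stations=$stations"
```

The printed number could then be used as the argument to create in the console.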

The start command will start all inactive stations.

online>start

The start command can also take a list of station IDs to start:

online>start 1 2 3 ...

Many of the online recon commands are similar in that they take no arguments (usually meaning all stations or all stations in a certain state) or a space-delimited list of station IDs as in the above example.

The status command shows the server and station status:

online>status

When streaming log data from one of the stations you can hit any key on the keyboard to exit.

online>log 1

Stop all stations:

online>stop

Remove all inactive stations (active stations must be stopped before they can be removed):

online>remove

When you're all done, shut down the server using a command like:

online>shutdown 5

This will wait 5 seconds before stopping and destroying all stations and cleanly shutting down the server and its connection to the ET ring.

Example Scripts

x

Configuration Properties

x

Saving Plots

x

Creating and Accessing Plots using Remote AIDA

x

Using JAS3 to Access Remote AIDA Plots

x

Using a Java Webapp to Access Remote AIDA Plots

x

Running a Local ET Ring and Producer

Installing the ET Software

# download and untar sources
wget https://coda.jlab.org/drupal/system/files/et-16.1.tgz
tar -zxvf et-16.1.tgz

# fix up build files (for some reason this is needed even with Python 2?)
cd et-16.1.GIT
2to3-2.7 -w ./coda.py ./SConstruct
autopep8 -i SConstruct
autopep8 -i coda.py

# build it
export CODA=`pwd`
scons install

# set up the environment (needed every time you run)
export LD_LIBRARY_PATH=${CODA}/Linux-x86_64/lib
export PATH=${CODA}/Linux-x86_64/bin:${CODA}/Linux-x86_64/bin/examples:$PATH

The scons command may not be present in your environment (installation not covered here).

Start the ET ring using a command like:

et_start -rb 8000000 -sb 8000000 -nd -p 11111 -f /tmp/ETBuffer -s 20000 -n 1000 -v -d

This starts the ET ring on port 11111 and uses a "standard" location for the swap file (buffer).

The -s argument specifies the max event size.

The -n argument sets the number of events in the ET ring, i.e. how many can be present at once.
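
As a rough worked example of what these two values imply, the event size multiplied by the number of events gives the approximate amount of event data the ring can hold, about 20 MB for the command above. This assumes the -s value is in bytes and ignores any bookkeeping overhead in the buffer file.

```shell
event_size=20000   # value passed to -s (assumed to be bytes)
num_events=1000    # value passed to -n

# Approximate total event data held by the ring.
total_bytes=$(( event_size * num_events ))
echo "approximate event data: ${total_bytes} bytes ($(( total_bytes / 1000000 )) MB)"
```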

The other settings are mainly to try and improve the network performance of the tool by setting generous read/write buffers and using the no delay flag.

To do anything useful with this, we need to stream HPS event data onto the ET ring, which can be done using a command like:

hps-recon-producer -p 11111 -h localhost -e 1 -f /tmp/ETBuffer -l ./evio_files.txt -s 20000 -d 10

The text file contains a list of EVIO files that should be compatible with the settings on the client/server (e.g. 2016 data for the example settings from above).
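
One way to produce such a list is sketched below. It assumes the file holds one path per line and that the data files have names containing .evio (possibly followed by a suffix); both are assumptions, and make_evio_list is a hypothetical helper rather than part of the package.

```shell
# Write the paths of all EVIO files under a directory to a list file,
# one path per line, in sorted order.
make_evio_list() {
    dir=$1
    out=$2
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    find "$dir" -type f -name '*.evio*' | sort > "$out"
}

# Example call (the directory is a placeholder):
# make_evio_list /path/to/evio ./evio_files.txt
```

Sorting keeps the files in a predictable order, which may matter if the producer streams them in the order listed.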
