Useful links
Technical
- SDF guide and documentation, particularly on using Jupyter notebooks interactively or through web interface (runs on top of nodes managed by SLURM)
- Training dataset dumper (used for producing h5 files from FTAG derivations) documentation and git repository (we use Prajita's fork; bjet_regression is the main development branch)
- SALT documentation, SALT on SDF, puma git repo (used for plotting), and Umami docs (for postprocessing)
- SLAC GitLab group for plotting related code
- FTAG1 derivation definition (FTAG1.py)
Documents and notes
- GN1 June 2022 PUB note, nice slides from A. Duperrin
- Jannicke's thesis (chapter 4 on b-jets)
Presentations and meetings
- See all B-jet calibration meetings on Indico
- Framework experience (Prajita, July 6)
- Plans (Prajita, July 13)
- What needs to be added to JETM2 (August 17)
SDF preliminaries
An environment needs to be created so that all required packages are available. We have explored a few options for doing this.
Option 1: reuse the instance built for SSI 2023. This provides most of the useful packages but uses Python 3.6, which leads to issues with h5py.
Starting Jupyter sessions via SDF web interface
- SDF web interface > My Interactive Sessions > Services > Jupyter (starts a server via SLURM)
- Jupyter Instance > slac-ml/SSAI
Option 2: create your own conda environment. Follow the SDF docs to use the ATLAS group installation of conda.
There is also a slight hiccup with permissions in the folder /sdf/group/atlas/sw/conda/pkgs, which you can sidestep by specifying your own folder for saving packages (in GPFS data space).
The TLDR is:
```
export PATH="/sdf/group/atlas/sw/conda/bin:$PATH"
conda init  # the previous line will be added to your bashrc file
```

Add the following lines to the `~/.condarc` file (create a default file with `conda config`):

```
pkgs_dirs:
  - /gpfs/slac/atlas/fs1/d/<user>/conda_envs
```

Then create the environment and install Jupyter:

```
conda env create -f bjr_v01.yaml  # for example
conda install jupyter             # run inside the activated (bjr_v01) env
```
This env can be activated when starting a kernel in Jupyter by adding the following under Custom Conda Environment:
```
export CONDA_PREFIX=/sdf/group/atlas/sw/conda
export PATH=${CONDA_PREFIX}/bin/:$PATH
source ${CONDA_PREFIX}/etc/profile.d/conda.sh
conda env list
conda activate bjr_v01
```
Producing H5 samples
We are using a custom fork of the training-dataset-dumper, developed for producing h5 files for NN training from FTAG derivations.
The fork is modified to also store the truth jet pT via the AntiKt4TruthDressedWZJets container.
The main pieces of code that we are developing for the dumper are:
- Component accumulator: BTagTrainingPreprocessing/bin/ca-dump-jer
- Configuration file: configs/ftag_jer.json
After compilation (note that one must build against Athena rather than AnalysisBase, as described on the advanced usage page), you can run a local test with:

```
ca-dump-jer -c configs/ftag_jer.json <test_FTAG1_file>
```
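To sanity-check the dumper output, one quick option is to walk the structure of the resulting h5 file with h5py (a sketch; the actual dataset names in the output, e.g. `jets` or `tracks`, depend on the config):

```python
import h5py

def h5_structure(path):
    """Return a list of (name, shape) for every dataset in an h5 file."""
    entries = []
    with h5py.File(path, "r") as f:
        # visititems walks the full group hierarchy; keep only datasets
        f.visititems(
            lambda name, obj: entries.append((name, obj.shape))
            if isinstance(obj, h5py.Dataset) else None
        )
    return entries

# Example (file name hypothetical):
# for name, shape in h5_structure("output.h5"):
#     print(name, shape)
```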
Some additional truth information for the tracks will be useful. Adding fields under the `decorate` part of the config did not work with FTAG derivations; some predefined decorations are listed in the TDD docs.
- Add truthTypeLabel (NoTruth=0, Other=1, Pion=2, Kaon=3, Electron=4, Muon=5, Photon=6)
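When reading these labels back from an h5 file, a small helper that just restates the mapping above can make plots and selections more readable (the fallback name for out-of-range codes is our own choice):

```python
# truthTypeLabel integer codes, as listed above
TRUTH_TYPE_LABELS = {
    0: "NoTruth",
    1: "Other",
    2: "Pion",
    3: "Kaon",
    4: "Electron",
    5: "Muon",
    6: "Photon",
}

def decode_truth_type(code):
    """Map a truthTypeLabel integer to its particle name."""
    return TRUTH_TYPE_LABELS.get(int(code), "Unknown")
```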
A set of test mc21 FTAG1 files can be found on:
/gpfs/slac/atlas/fs1/d/pbhattar/BjetRegression/InputDAOD
The current set of ntuples is available at:
```
/gpfs/slac/atlas/fs1/d/pbhattar/BjetRegression/Input_Ftag_Ntuples
├── Rel22_ttbar_AllHadronic
├── Rel22_ttbar_DiLep
└── Rel22_ttbar_SingleLep
```
Analyzing H5 samples
Notebooks
- Chunking h5 files: /sdf/home/b/bbullard/bjes/analysis/ChunkH5.ipynb
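As a rough sketch of what a chunking step can look like (assumptions: all top-level datasets share axis 0 and the output naming scheme is illustrative, not necessarily what ChunkH5.ipynb does):

```python
import h5py

def chunk_h5(in_path, out_prefix, chunk_size):
    """Split every top-level dataset of an h5 file into output files
    holding at most chunk_size rows each (all datasets are assumed
    to be aligned along axis 0)."""
    with h5py.File(in_path, "r") as fin:
        names = list(fin.keys())
        n_rows = fin[names[0]].shape[0]
        for i, start in enumerate(range(0, n_rows, chunk_size)):
            with h5py.File(f"{out_prefix}_{i}.h5", "w") as fout:
                for name in names:
                    # slicing reads only this chunk into memory
                    fout.create_dataset(name, data=fin[name][start:start + chunk_size])
```

Reading slice-by-slice like this keeps memory usage bounded even for large input files.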
Miscellaneous tips
You can grant read access to GPFS data folder directories for ATLAS group members via the following (note that this does not work for the SDF home folder):
```
groups <username>                   # check a user's groups
cd <your_directory>
find . -type d | xargs chmod g+rx   # all subdirectories must be readable *and* executable by the group
```