# Tutorial
This tutorial assumes that you have a Fermilab computing account (see DUNE's at-work pages for more info) and have set up Kerberos authentication:

```
kinit username@FNAL.GOV
```
There are 16 GPVMs, named `dunegpvm01.fnal.gov` through `dunegpvm16.fnal.gov`, with no load balancing. Connect with:

```
ssh dunegpvmXY.fnal.gov
```

Users generally have a favourite one they connect to, but to avoid stepping on anybody's toes, please check the activity on the GPVM you log into. GPVMs should not be used for sustained heavy workloads; contact the DUNE computing team to learn how to submit jobs to the grid.
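One quick way to gauge activity before settling in on a GPVM is to look at the load averages (or just run `uptime` and `who` in the shell). A minimal sketch using only the Python standard library:

```python
import os

# 1-, 5-, and 15-minute load averages of the machine you're logged into
load1, load5, load15 = os.getloadavg()
print(f"load averages: {load1:.2f} (1 min) {load5:.2f} (5 min) {load15:.2f} (15 min)")

# A 1-minute load well above the core count suggests picking another GPVM
if load1 > os.cpu_count():
    print("this node looks busy; consider another dunegpvm")
```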
The code lives on GitHub at DUNE/dune-tms.

```
cd /dune/app/users/${USER}
mkdir some_project_name  # optional, but may help keep things tidy
cd some_project_name
git clone git@github.com:DUNE/dune-tms.git  # if this fails, run ssh-keygen to make an ssh key
cd dune-tms
source setup_FNAL.sh
```
The current directory now contains the source code for the detector simulation and reconstruction (located in `src/`), and various analysis scripts in `scripts/` for high- and low-level performance analysis, a 2D event display, a 3D geometry viewer, and much more!
All dependencies should be provided (via Spack) when you source `setup_FNAL.sh`, and the only step remaining is to run

```
make
```

which builds the code and installs it into the current directory. For running the code, see the later section.
Sourcing the setup script needs to be done every time you start a new terminal:

```
cd your_working_directory/dune-tms
source setup_FNAL.sh
```
Start an SL7 container using:

```
/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer shell --shell=/bin/bash \
  -B /cvmfs,/exp,/nashome,/pnfs/dune,/opt,/run/user,/etc/hostname,/etc/hosts,/etc/krb5.conf \
  --ipc --pid /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest
```
Then instead of `setup_FNAL.sh`, use `setup.sh`. Everything else is the same. Make sure to run `make clean` to remove any AL9 build artifacts before trying to compile under SL7.
For most studies, you can use existing tmsreco.root files provided by the production. However, that may not always be the case. The command to run both detector simulation and reconstruction is `ConvertToTMSTree.exe`. It is controlled by config files in `dune-tms/config`.
Warning: turn off the time slicer for the plotting scripts in `scripts/Reco` to run properly. Open `config/TMS_Default_Config.toml` and set:

```
RunTimeSlicer = false
```

Currently there is a bug where some truth information (in particular, the true muon information) doesn't get passed on if the time slicer is run.
Here is an example of running on a single edep file:

```
ConvertToTMSTree.exe /pnfs/dune/persistent/users/kleykamp/example_edep_file_with_single_event_per_spill.root
```
To run on more files, you want to run on the grid in parallel. The official way of doing this is through the newer production code, which I don't have experience with. Large samples would likely need to go through the ND production group, currently led by Alex Booth. However, for medium-sized samples, you can use the `ProcessND.py` script. This script is becoming outdated but can still be used; explanations of how to use it are in its README and via `--help`. This script can also be used to run the GENIE and edep-sim stages, but not the LAr stage nor the CAF writer.
Scripts are in `dune-tms/scripts`.

The .cpp scripts are out of date, but are being updated by Sushil and Xiaoyan. Assuming you have a target root file, running the cpp scripts works like this:

```
root -l -b -q -x 'muonke.cpp("/dune/data/users/kleykamp/2023-09-15_fix_muon_ke.tmsreco.root")'
```
If your file is in `/pnfs`, then you need to find the path using `pnfs2xrootd`; see the xrootd section below. If you have many individual files, see the "Combining root files" section below.
On a basic level, the plotting code takes the output of the simulation and makes plots from it. It does this by looping through the provided tmsreco file and, event by event, adding information to the histograms. The same tmsreco file can be reused, even when cuts are adjusted.
- Hit: One scintillator that's lit up in the simulation.
- Cluster: The reconstruction algorithm can group nearby hits into clusters. These are groups of hits that don't look track-like, and usually correspond to hits from neutrons or possibly electrons.
- Track: A reconstructed line of hits found by the reco algorithms. For the TMS, we're often trying to find the muons, which usually make long tracks.
- Occupancy: The ratio of total energy on a track to total energy in the event. A track with low occupancy usually indicates that most of the energy is somewhere else in the event: either the event has a lot of random energy, or another track contains more of the visible energy.
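The occupancy definition above can be written out directly. A minimal sketch in pure Python (the function and argument names are illustrative, not the actual branch names in the tmsreco trees):

```python
def occupancy(track_energy: float, event_energy: float) -> float:
    """Fraction of the event's visible energy carried by one track."""
    if event_energy <= 0:
        return 0.0
    return track_energy / event_energy

# A muon-like track carrying most of the event's energy has high occupancy:
print(occupancy(80.0, 100.0))  # 0.8
```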
Looking at a pseudocode breakdown of the code might help with understanding how it works:
```
open file and get Truth_Info and Line_Candidates TTrees
turn off all root branches, then turn only the useful ones back on
    # this sometimes speeds things up
make histograms
for i in range(n events):
    load event i in Line_Candidates
    load the same event in Truth_Info
    make sure the two trees are synchronized
    print status every N events

    # Cut section
    check that there's a muon; if not, skip this event
    we're only using true CC-mu events, so check that there's a true primary
        muon (as opposed to muons created after the initial neutrino
        interaction); if not, skip the event
    make sure there's at least one reconstructed line (aka track), which
        might be the muon; if not, skip the event

    # Adjustable cuts section
    check that n lines <= nLinesCut, otherwise skip event
    check that n clusters <= nClustersCut, otherwise skip event
    check that total cluster energy <= ClusterEnergyCut, otherwise skip event

    # Now find the best track
    the "find the best track" section finds the track with the highest
        occupancy, which is most likely the muon we're interested in

    # Also check the track with the longest track length
    this finds the longest track; muons make the longest tracks compared to
        other particles, so if there's a reconstructed muon it's most likely
        the longest track

    # And also check the longest track by distance
    this finds the longest track of the event using the x and z coordinates;
        the previous section found the longest track by density (lon_trklen),
        and this one counts how often the longest track by density is not the
        longest track by distance (longtrack); the reco track we use is the
        longest track by density, so longtrack is the index of the track used
        for our plots

    # Additional cuts
    check if the true muon died inside the detector based on y position;
        if not, skip event
    check that longtrack starts and stops inside the detector; it first looks
        at the z position, with two options: AllDet=true uses the whole
        detector (starting at z = 11362+55*2), while AllDet=false uses only
        up to z = 13600; the front of the detector has thinner steel, so the
        energy resolution is better when looking only at this region
    check that the last hit of the track is towards the end (above
        z = 18294-80*2); muons make long tracks, so tracks shorter than this
        are unlikely to be muons
    check the x positions: the first and last hits of the track must be at
        least 20 cm inside the TMS from the sides, to decrease the chance
        that the muon left the detector
    fill the occupancy hist
    apply the occupancy cut; if it fails, skip event
    fill the KE and KEest plots
do line fit to reco KE vs true KE plot
plot histograms in pdf
```
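The cut logic in the loop above can be sketched in pure Python. This is illustrative only: the event fields, helper names, and default cut values below are invented for the sketch, not the actual branch names or defaults in the code.

```python
# Illustrative sketch of the cut sequence; field names are assumptions.
def passes_cuts(event, n_lines_cut=1, n_clusters_cut=5,
                cluster_energy_cut=200.0, occupancy_cut=0.9):
    """Apply the fixed and adjustable cuts from the pseudocode breakdown."""
    # Fixed cuts: true CC-mu events with at least one reconstructed track
    if not event["has_true_primary_muon"]:
        return False
    if not event["tracks"]:
        return False
    # Adjustable cuts: cleaner events give better energy resolution
    if len(event["tracks"]) > n_lines_cut:
        return False
    if event["n_clusters"] > n_clusters_cut:
        return False
    if event["cluster_energy"] > cluster_energy_cut:
        return False
    # Best track = highest occupancy; it must also pass the occupancy cut
    best = max(event["tracks"], key=lambda t: t["occupancy"])
    return best["occupancy"] >= occupancy_cut

event = {
    "has_true_primary_muon": True,
    "n_clusters": 2,
    "cluster_energy": 50.0,
    "tracks": [{"occupancy": 0.95}],
}
print(passes_cuts(event))  # True
```

Tightening any of the cut parameters keeps fewer, but cleaner, muon candidates, which is the tradeoff described in the cut list below.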
- AtLeastOneLine: Require that we reconstructed at least one line; otherwise there's nothing to plot.
- CCmuOnly: Look at only true CC muons, as opposed to all possible muons.
- AllDet: Use the whole detector, or only the front region where the steel is thinner.
- nLinesCut: An event with a single reco muon is going to be cleaner than an event with many tracks, so the energy resolution for events that allow at most one reconstructed track might be better, at the cost of fewer muons plotted.
- nClustersCut: The maximum number of clusters. More clusters usually mean a dirtier event, so the energy resolution is likely worse.
- ClusterEnergyCut: The maximum energy in clusters. More energy in clusters usually means a dirtier event, so the energy resolution is likely worse.
- OccupancyCut: The minimum occupancy the reconstructed track needs in order to be plotted in the KE plots. Higher-occupancy tracks are more likely to be very pure muon events with little energy lost to other processes, so this usually gives you better true vs reco energy agreement, at the cost of plotting fewer muons. It's a tradeoff.
- There are also non-adjustable cuts within the for loop. Clearly Clarence thought they were needed, but their motivation should be understood.
There is also ongoing work on Python scripts:

```
python make_hists.py --help

optional arguments:
  -h, --help            show this help message and exit
  --outdir OUTDIR       The output dir. Will be made if it doesn't exist.
  --name NAME           The name of the output files.
  --indir INDIR         The location of the input tmsreco files
  --inlist INLIST       The input filelist
  --filename FILENAME, -f FILENAME
                        The input file, if you have a single file
  --nevents NEVENTS, -n NEVENTS
                        The maximum number of events to loop over
  --allow_overwrite, --no-allow_overwrite
                        Allow the output file to overwrite
  --preview, --no-preview
                        Save preview images of the histograms
```
And similarly `python make_plots.py --help`.

Example:

```
python make_hists.py --f /dune/data/users/kleykamp/tms_testing_files/2023-10-16_fixing_reco_everything_off_fixed_time_slicer_off_all_events.tmsreco.root --name my_file.root --allow_overwrite --preview
```

Default output will be in `/dune/data/users/$USER/dune-tms_hists`.
Another thing one might do is draw some event displays. One way to do that is with `Reco/draw_spill.py` (not `TimeSlicer/draw_spill.py`).

Example usage:

```
python draw_spill.py --input_filename /dune/data/users/kleykamp/tms_testing_files/2023-10-16_fixing_reco_everything_off_fixed_time_slicer_off.tmsreco.root --outdir 2023-10-16_fixing_reco/everything_off_only_true_tms_muons --only_true_tms_muons
```
We can combine root files to use more than one input file in muonke.cpp. Here's an example, assuming `2023-09-15_fix_muon_ke.tmsreco.root` is your output file and you're merging all files in `/pnfs/dune/persistent/users/kleykamp/nd_production_output/2023-09-15_fix_muon_ke/tmsreco/FHC/00m/00/`. The code uses the `hadd` utility to add the TTrees inside each root file. Hadding TTree files can speed things up because there's less overhead. The `setup` command sets up `pnfs2xrootd`, which isn't set up by default in the regular setup script. That's needed because you don't want to read the /pnfs files directly.

```
setup -j duneutil v09_78_03d01 -q e20:prof
hadd -f /exp/dune/data/users/kleykamp/2023-09-15_fix_muon_ke.tmsreco.root $(find /pnfs/dune/persistent/users/kleykamp/nd_production_output/2023-09-15_fix_muon_ke/tmsreco/FHC/00m/00/ -name "*root" -exec pnfs2xrootd {} \;)
```
For authentication, run:

```
kx509
voms-proxy-init --noregen -rfc -voms dune:/dune/Role=Analysis
```
Using xrootd is super important: it prevents overloading the servers. You can use `pnfs2xrootd <filename>`. First set up `pnfs2xrootd` by loading `duneutil`, which isn't set up by default.

```
setup -j duneutil v09_78_03d01 -q e20:prof
pnfs2xrootd /pnfs/dune/persistent/users/kleykamp/nd_production_output/2023-09-15_fix_muon_ke/tmsreco/FHC/00m/00/neutrino.0_1671124115.tmsreco.root
# output:
root://fndca1.fnal.gov:1094//pnfs/fnal.gov/usr/dune/persistent/users/kleykamp/nd_production_output/2023-09-15_fix_muon_ke/tmsreco/FHC/00m/00/neutrino.0_1671124115.tmsreco.root
```
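The rewrite `pnfs2xrootd` performs is visible in the example above: the `/pnfs/` prefix becomes `root://fndca1.fnal.gov:1094//pnfs/fnal.gov/usr/`. A pure-Python sketch of just that string mapping (the real `pnfs2xrootd` tool may handle more cases, so prefer it over this):

```python
def pnfs_to_xrootd(path: str) -> str:
    """Rewrite a /pnfs/ path into an xrootd URL, following the mapping above."""
    prefix = "/pnfs/"
    if not path.startswith(prefix):
        raise ValueError("not a /pnfs path: " + path)
    return "root://fndca1.fnal.gov:1094//pnfs/fnal.gov/usr/" + path[len(prefix):]

print(pnfs_to_xrootd("/pnfs/dune/persistent/users/kleykamp/file.root"))
# root://fndca1.fnal.gov:1094//pnfs/fnal.gov/usr/dune/persistent/users/kleykamp/file.root
```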
There's also `samweb2xrootd` if you have a file in SAM. See this issue for more information.
Simple example edep files, which are inputs to ConvertToTMSTree (these cannot be used for the plotting scripts above, which expect tmsreco.root files). The first has 1 event per spill; the second has an overlay simulation without rock samples.

```
/pnfs/dune/persistent/users/kleykamp/example_edep_file_with_single_event_per_spill.root
/pnfs/dune/persistent/users/kleykamp/example_edep_file_with_overlay.root
```
To run more, you should probably ask production. There is the script above to do it manually if it's a medium amount. Here are some files to run over:

`/pnfs/dune/persistent/users/kleykamp/nd_production_output`

This one has decent statistics with 1 event per spill:

`/pnfs/dune/persistent/users/kleykamp/nd_production_output/2022-12-15_simple_spill/edep`
Also:

`/pnfs/dune/persistent/users/marshalc/LArTMSProductionJun23withLArCV/edep/FHC/00m/00`