Calculate signal efficiency in training and prediction #58

aribrill · 2018-09-20T21:19:46Z

Signal efficiency (gamma efficiency / sqrt(proton efficiency)) is a useful metric for gamma/hadron classification. In run_model.py, add a custom metric to display this quantity for the validation set with Tensorboard for each of a configurable set of classification thresholds. Also add a script to plot signal efficiency vs. classification threshold given a prediction output file. Note that efficiency should take into account the initial cuts placed on the data, which will require the changes to the metadata in point 3(d) of #57.
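For concreteness, here is a minimal sketch of the quantity itself, assuming the prediction output gives one gamma-likeness score in [0, 1] per event and that an event passes when its score is at or above the threshold. The function names and the `n_gamma_total` / `n_proton_total` arguments are hypothetical. The latter two stand in for the pre-cut event counts that the metadata changes in #57 would provide.

```python
import math


def signal_efficiency(gamma_scores, proton_scores, threshold,
                      n_gamma_total=None, n_proton_total=None):
    """Return gamma efficiency / sqrt(proton efficiency) at one threshold.

    n_gamma_total and n_proton_total (hypothetical arguments) are the event
    counts before any initial cuts. When omitted, the number of scored
    events is used, which ignores the effect of those cuts.
    Returns None when no proton passes, since the ratio is undefined there.
    """
    if n_gamma_total is None:
        n_gamma_total = len(gamma_scores)
    if n_proton_total is None:
        n_proton_total = len(proton_scores)

    gamma_eff = sum(s >= threshold for s in gamma_scores) / n_gamma_total
    proton_eff = sum(s >= threshold for s in proton_scores) / n_proton_total
    if proton_eff == 0:
        return None
    return gamma_eff / math.sqrt(proton_eff)


def efficiency_vs_threshold(gamma_scores, proton_scores, thresholds, **totals):
    """Evaluate the signal efficiency for each threshold in a configurable set."""
    return {t: signal_efficiency(gamma_scores, proton_scores, t, **totals)
            for t in thresholds}
```

The dictionary returned by `efficiency_vs_threshold` is what a plotting script would draw against threshold. The same per-threshold values could be logged as validation metrics.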

nietootein · 2018-09-26T15:20:09Z

It would be useful to see how the signal efficiency evolves with energy, since that may help us optimize the classification threshold as a function of an event's estimated energy. Our estimated energy will suffer from resolution and bias effects. However, if the threshold that optimizes the signal efficiency at a given energy does not change substantially in the vicinity of the estimated energy (defined in terms of bias + resolution), dynamic classification thresholds should approximate an optimal signal efficiency.

Since our data contain the true (MC) energy of the events rather than a reconstructed energy, we can start by optimizing the signal efficiency within a given true-energy bin. To cover the entire energy range of the instrument, the plotting script (and perhaps run_model.py as well?) should have an option that takes a binning definition (e.g. e_min, e_max, num_bins, scale={normal, log, ...}) so that the signal efficiency can be computed for each bin defined there.
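A rough sketch of what such a binning option could look like, assuming each event is available as a (true MC energy, score, is_gamma) tuple. That tuple layout and both function names are hypothetical, and "normal" is read here as linear spacing. Efficiencies are computed relative to the events falling in each bin, so initial cuts are not accounted for.

```python
import math
from bisect import bisect_right


def make_bin_edges(e_min, e_max, num_bins, scale="log"):
    """Return num_bins + 1 bin edges between e_min and e_max."""
    if scale == "log":
        lo, hi = math.log10(e_min), math.log10(e_max)
        return [10 ** (lo + (hi - lo) * i / num_bins)
                for i in range(num_bins + 1)]
    if scale == "normal":
        return [e_min + (e_max - e_min) * i / num_bins
                for i in range(num_bins + 1)]
    raise ValueError(f"unsupported scale: {scale!r}")


def binned_signal_efficiency(events, edges, threshold):
    """Signal efficiency per true-energy bin at a fixed threshold.

    events: iterable of (mc_energy, score, is_gamma) tuples (assumed layout).
    Bins are half-open [edge_i, edge_i+1); events outside the range are dropped.
    A bin yields None when it has no gammas or no proton passes the cut.
    """
    n_bins = len(edges) - 1
    # Per bin: [gammas, gammas passing, protons, protons passing]
    counts = [[0, 0, 0, 0] for _ in range(n_bins)]
    for energy, score, is_gamma in events:
        idx = bisect_right(edges, energy) - 1
        if not 0 <= idx < n_bins:
            continue
        offset = 0 if is_gamma else 2
        counts[idx][offset] += 1
        counts[idx][offset + 1] += score >= threshold

    result = []
    for g_all, g_pass, p_all, p_pass in counts:
        if g_all == 0 or p_pass == 0:
            result.append(None)
        else:
            result.append((g_pass / g_all) / math.sqrt(p_pass / p_all))
    return result
```

Returning None for empty or proton-free bins keeps low-statistics bins visible as gaps rather than hiding them behind a misleading number.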

Visualizing a configurable set of classification thresholds may be useful, as Ari suggests, but we should also consider optimizing the signal efficiency as a function of the threshold, in case the optimal threshold is not close to any value in the predefined set.
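One simple way to do this without a predefined grid, sketched under the assumption that scores are gamma-likeness values and that a brute-force scan is fast enough, is to try every observed score as a candidate threshold. The function name and the `min_proton_pass` guard are hypothetical. The guard skips thresholds where too few protons survive for the ratio to be meaningful.

```python
import math


def best_threshold(gamma_scores, proton_scores, min_proton_pass=1):
    """Scan all observed scores as thresholds and return (threshold, efficiency).

    The efficiency is gamma efficiency / sqrt(proton efficiency), with an
    event passing when score >= threshold. Thresholds that leave fewer than
    min_proton_pass protons are skipped. Returns (None, None) if no
    threshold qualifies. Ties keep the lowest threshold.
    """
    best_t, best_q = None, None
    for t in sorted(set(gamma_scores) | set(proton_scores)):
        n_proton_pass = sum(s >= t for s in proton_scores)
        if n_proton_pass < min_proton_pass:
            continue
        gamma_eff = sum(s >= t for s in gamma_scores) / len(gamma_scores)
        proton_eff = n_proton_pass / len(proton_scores)
        q = gamma_eff / math.sqrt(proton_eff)
        if best_q is None or q > best_q:
            best_t, best_q = t, q
    return best_t, best_q
```

The scan is quadratic in the number of events, which is fine for a sketch. Sorting the scores once and accumulating counts would scale better for large prediction files.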

aribrill · 2018-09-27T14:06:33Z

Right now there's no easy way to get the MC energy of a given event. I think the natural way to do it is to include the MC energy in the auxiliary info returned by DataLoader.get_example() in array mode. What auxiliary info to include can be a config option.
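As a purely illustrative sketch of the config-option idea (this is not the existing DataLoader API; the helper name, the dictionary-based event record, and the field names are all hypothetical), the selection step could be as small as:

```python
def select_auxiliary_info(event_record, aux_fields):
    """Pick the configured auxiliary fields out of one event's record.

    event_record: dict of per-event quantities (hypothetical layout).
    aux_fields: list of field names taken from the config.
    Raises KeyError if the config asks for a field the record lacks.
    """
    missing = [name for name in aux_fields if name not in event_record]
    if missing:
        raise KeyError(f"unknown auxiliary fields: {missing}")
    return {name: event_record[name] for name in aux_fields}
```

Failing loudly on an unknown field name catches config typos early instead of silently returning incomplete auxiliary info.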

aribrill added this to the v0.2.1 milestone Sep 21, 2018

aribrill self-assigned this Sep 21, 2018

aribrill mentioned this issue Nov 1, 2018

Define API for DataLoader output #82

Closed
