-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add more flexibility to visualize.pulls #342
base: master
Are you sure you want to change the base?
Conversation
andrzejnovak
commented
Jun 9, 2022
- feat: show unconstrained directly in pulls, add exclude_by_type
- fix: leave redundant styling to style sheets
- fix: pull display order
- feat: add pull(exclude=... fnmatch, fix docs
ee35ef3
to
d91baf0
Compare
Codecov Report
@@ Coverage Diff @@
## master #342 +/- ##
===========================================
- Coverage 100.00% 99.52% -0.48%
===========================================
Files 23 23
Lines 1878 1911 +33
Branches 299 311 +12
===========================================
+ Hits 1878 1902 +24
- Misses 0 4 +4
- Partials 0 5 +5
Continue to review full report at Codecov.
|
@alexander-held ready for review |
labels = model.config.par_names() | ||
_mod_dict = dict(model.config.modifiers) | ||
_clean_labels = [re.sub(r"\[.*\]", "", label) for label in labels] | ||
types = [_mod_dict[n] if n in _mod_dict else None for n in _clean_labels] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering whether we should extract and save the information about the constraint term (none, Gaussian, Poisson) instead of the modifier types. The problem with the types is that a parameter may control multiple modifiers (e.g. a normsys
plus a shapesys
). For the purpose of plotting pulls, the constraint term type is all that is needed to decide the handling.
We could get the constraint term types like this:
constraint_terms = []
for parameter in model.config.par_order:
if not model.config.param_set(parameter).constrained:
# normfactor / shapefactor
constraint_terms += [None] * model.config.param_set(parameter).n_parameters
else:
# remaining modifiers with Gaussian or Poisson constraints
constraint_terms += [
model.config.param_set("staterror_Signal_region").pdf_type
] * model.config.param_set("staterror_Signal_region").n_parameters
Going one step further, we could also save the .width()
information to evaluate constraints for Poisson-constrained parameters (since the pre-fit uncertainty for those varies per parameter).
For other use cases (like #332) it might be useful to also know all the modifier types.
While it is currently possible to determine the constraint term type from knowing the modifier, that will change when constraint terms become configurable in the future with pyhf
. So perhaps it is best to store constraint term information directly, and optionally add another field in the future to also keep track of the modifier types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am remembering now where the idea of storing the modifier types come from: that allows to exclude by type in the plot, like excluding staterror
. The constraint term type does not help there. It seems more likely that users would want to exclude by modifier type than by constraint term type, so perhaps it is best to stick with the implemented approach. The only thing that would need to be generalized is matching multiple modifier types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like the following could help simplify things by using more of the pyhf
API:
modifier_types = []
for parameter in model.config.par_order:
modifier_types += [
[
mod_type
for par_name, mod_type in model.config.modifiers
if par_name == parameter
]
] * model.config.param_set(parameter).n_parameters
ax.fill_between([-2, 2], -0.5, len(bestfit) - 0.5, color="yellow") | ||
ax.fill_between([-1, 1], -0.5, len(bestfit) - 0.5, color="limegreen") | ||
ax.vlines(0, -0.5, len(bestfit) - 0.5, linestyles="dotted", color="black") | ||
fig, ax = plt.subplots() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the figsize=(6, 1 + num_pars / 4)
scaling unless there's a strong reason not to, as that allows the spacing in the plot to be fairly consistent across a wide range of numbers of parameters.
ax.xaxis.set_minor_locator(mpl.ticker.AutoMinorLocator()) # minor ticks | ||
ax.tick_params(axis="both", which="major", pad=8) | ||
ax.tick_params(direction="in", top=True, right=True, which="both") | ||
fig.set_tight_layout(True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this removal is related to making figure customization easier via style sheets? If so, could we split that out into a separate PR? I'd like things to be more easily configurable, but I also do think the minor ticks help with legibility of constraints. Is there a way to move this into a default style sheet that users could override?
src/cabinetry/model_utils.py
Outdated
mod_type | ||
for par_name, mod_type in model.config.modifiers | ||
if par_name == parameter | ||
][:1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so for some modifiers in the example.py
this returns ['histosys', 'normsys'] so then best_fit
and types
would have different length, it's unclear to me why some parameters have more types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A parameter can control multiple modifiers, and in some specific cases a parameter can also control modifiers of different types if they have the same constraint terms. When building models, cabinetry
implements systematic uncertainties that change both shape and normalization via two correlated modifiers (normsys
and histosys
) which are controlled by the same parameter. For pulls to +/- 1 sigma this results in an equivalent model prediction to what a single histosys
modifier can provide. The split of overall normalization changes across a channel into a correlated normsys
helps protect the model predictions from becoming negative due to the exponential extrapolation used with normsys
modifiers (while histosys
is linear).