Option to combine filters for Samples and Regions #307
Comments
Hi Nicole, thanks for bringing this up! The current behavior is indeed quite limiting, and it would be great to have something a bit more flexible. The general idea behind the implementation of the filter is to have something flexible that does not require a lot of different configuration options to work. At the moment it works like this:
The one convenient thing about this is that there's only a single option, and (I believe) this is in principle general enough to implement any use case, although it can get very clunky. If, for example, you have an extra filter you want to apply per sample, and also multiple regions with their own filter, you would have to split the sample up into many different samples (one per region) and implement the combined region + sample filter for each of the new per-region versions of the sample.
The behavior at the moment is to strictly override the higher-level filter, with no possibility to append. I think for practical use cases it would be useful for systematics to be able to both override filters and append to them, and similarly for every lower level in the chain (general -> region -> sample -> systematic) to be able to both override and append. Here are some thoughts of what could be useful to have:
I am not sure how to best go about providing something flexible that can still be reasonably easy to understand and does not require a ton of different settings to remember. Any ideas for a design here are much appreciated! For the specific case outlined above, we could think about splitting To also support appending to a filter (e.g. at the systematic level), I think a slightly different approach might be needed. With YAML configs there's probably some way with anchors to do that, and it could also be done when building the config with Python. Maybe there's a better way though?
One way to implement this that may be quite general and possibly also fairly readable is the following. We stick with a single option That would allow users to specify themselves filters that can be overridden (and overriding then happens by using the same key again later on), and to append to existing filters (using a new key). Here is a schematic example:

    General:
      Filter: {"general": "jet_pt > 50"}

    Region:
      - Name: "SR"
        Filter: {"SR_filter": "nLeptons == 4"}

    Sample:
      - Name: "ttbar"
        Filter: {"ttbar_filter": "DSID == 410000", "general": "jet_pt > 40"}
      - Name: "Z+jets"

The actual strings used here for the keys in the dictionaries don't matter, the only thing that matters is whether two keys are the same or not. For

Edit: the dictionaries can also be written differently, e.g. like this

    Filter:
      ttbar_filter: "DSID == 410000"
      general: "jet_pt > 40"

which may further enhance readability. This approach clashes in one aspect with the current config design: at the moment, keys are known in advance and specified by the JSON schema. This would allow the introduction of arbitrary keys for this specific option. To avoid this, the filter could be a list of dictionaries, with each dictionary in that list having a

    Filter:
      - Name: "ttbar_filter"
        Filter: "DSID == 410000"
      - Name: "general"
        Filter: "jet_pt > 40"

That results in a more consistent format at the cost of being more verbose.
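To make the override/append semantics of the dict-based idea concrete, here is a minimal Python sketch. It is not the implementation in cabinetry or in #310: the helper name `resolve_filters`, the choice to join the surviving expressions with `&`, and the level order (general, then region, then sample) are all assumptions made for illustration.

```python
from typing import Dict, Optional

FilterDict = Dict[str, str]


def resolve_filters(*levels: Optional[FilterDict]) -> str:
    """Merge filter dicts from the most general to the most specific level.

    Hypothetical helper: a key that re-appears at a later level overrides the
    earlier expression, a new key appends another expression. The remaining
    expressions are joined with "&" (an assumption, not confirmed behavior).
    """
    merged: FilterDict = {}
    for level in levels:
        if level:  # skip levels that do not define a filter
            merged.update(level)
    return " & ".join(f"({expr})" for expr in merged.values())


# values taken from the schematic config above
general = {"general": "jet_pt > 50"}
region_sr = {"SR_filter": "nLeptons == 4"}
ttbar = {"ttbar_filter": "DSID == 410000", "general": "jet_pt > 40"}

# ttbar re-uses the key "general", so it overrides the general cut
ttbar_in_sr = resolve_filters(general, region_sr, ttbar)
# Z+jets defines no filter, so only the general and region cuts apply
zjets_in_sr = resolve_filters(general, region_sr, None)

print(ttbar_in_sr)  # (jet_pt > 40) & (nLeptons == 4) & (DSID == 410000)
print(zjets_in_sr)  # (jet_pt > 50) & (nLeptons == 4)
```

Because `dict.update` keeps the position of an existing key, an overridden expression stays in the slot where it was first defined, which keeps the combined string predictable.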
@nhartman94 #310 implements a first attempt at a dict-based approach.
Hi @alexander-held,
I watched your PyHEP tutorial recently, and was super enthusiastic about how streamlined this package is!
I started playing with it for my analysis, and I wanted to try a study where I apply different filters for the individual samples as well as an analysis categorization to define the regions.
I can see where to modify the _filter class to implement an "&" of the two filters if both the Region and Sample Filters are provided, but I was wondering, would it maybe be useful to submit a pull request in case others are interested in this functionality in the future? Or am I not correctly understanding how I should be using these Filters?
Thanks so much!
Best,
Nicole
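
For illustration, the "&" combination described in this post could look roughly like the sketch below. The function name `combine_filters` and the example cut strings are hypothetical; this is not the existing `_filter` code, only a minimal version of the requested behavior, assuming filters are plain expression strings.

```python
from typing import Optional


def combine_filters(
    region_filter: Optional[str], sample_filter: Optional[str]
) -> Optional[str]:
    """Return the logical AND of the region and sample filters.

    Hypothetical helper: if only one filter is given it is returned unchanged,
    and None is returned when neither level defines a filter.
    """
    parts = [f for f in (region_filter, sample_filter) if f]
    if not parts:
        return None
    if len(parts) == 1:
        return parts[0]
    # parentheses protect each expression when the two are joined
    return " & ".join(f"({part})" for part in parts)


both = combine_filters("nLeptons == 4", "DSID == 410000")
region_only = combine_filters("nLeptons == 4", None)
neither = combine_filters(None, None)
```

Wrapping each expression in parentheses matters once a filter itself contains an operator with lower precedence than "&", such as "|".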