
Wrong n_dof for fixed unconstrained parameters in _goodness_of_fit #478

Open

lorenzennio opened this issue May 29, 2024 · 6 comments

@lorenzennio (Contributor)
Hey,

I think I found a small bug in the `_goodness_of_fit` function when working with fixed unconstrained parameters: when you fix an unconstrained parameter in the fit via the `fix_pars` argument, `n_dof` is still calculated from all (fixed or free) unconstrained parameters of the model.

Code to reproduce:

```python
import logging

import cabinetry
import pyhf

logging.basicConfig(level=logging.DEBUG)

spec = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "signal",
                    "data": [5.0, 10.0, 15.0],
                    "modifiers": [
                        {"name": "mu", "type": "normfactor", "data": None}
                    ],
                },
                {
                    "name": "background",
                    "data": [50.0, 60.0, 70.0],
                    "modifiers": [
                        {"name": "mu_bkg", "type": "normfactor", "data": None}
                    ],
                },
            ],
        }
    ]
}

model = pyhf.Model(spec)
data = [55.0, 70.0, 70.0]

# fix mu_bkg dynamically via the fix_pars argument
fixed = model.config.suggested_fixed()
fixed[model.config.par_map["mu_bkg"]["slice"]] = [True]
fit = cabinetry.fit.fit(model, data, fix_pars=fixed, goodness_of_fit=True)
print(fit.goodness_of_fit)

# for comparison: fix mu_bkg in the model itself, then fit again
model.config.par_map["mu_bkg"]["paramset"].suggested_fixed = [True]
fit = cabinetry.fit.fit(model, data, fix_pars=fixed, goodness_of_fit=True)
print(fit.goodness_of_fit)
```
@alexander-held (Member)
Thanks a lot for raising this! I see that `model_utils.unconstrained_parameter_count` already checks for parameters set to be fixed in the model itself, but that cannot catch parameters that are dynamically set to be fixed. I need to think a bit about how to best handle this. Perhaps we can calculate the difference between the number of `True` values in `model.config.suggested_fixed()` and in `fix_pars` within `fit.fit()`; that difference is what needs to be subtracted from `n_dof`.
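A minimal sketch of that idea (not cabinetry's actual API; `extra_fixed_count` is a hypothetical helper name used only for illustration):

```python
# Hypothetical helper: count parameters that fix_pars holds constant
# on top of what the model itself already suggests as fixed. This count
# would be subtracted from the naive n_dof.
def extra_fixed_count(suggested_fixed, fix_pars):
    """Parameters fixed via fix_pars beyond the model's own suggestion."""
    return sum(
        custom and not default
        for default, custom in zip(suggested_fixed, fix_pars)
    )

# example: model suggests no fixed parameters, the user fixes the second one
suggested = [False, False, False]
custom = [False, True, False]
print(extra_fixed_count(suggested, custom))  # 1
```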

@lorenzennio (Contributor, Author) commented May 29, 2024

How would we deal with fixed constrained parameters then?

Would you rather correct for this in the `fit()` function directly, or pass `fix_pars` through to `model_utils.unconstrained_parameter_count` as an optional argument? For the latter, a simple correction could be implemented there, since `fix_pars` would be available.

@alexander-held (Member)

Good point, sounds like we should pass the information through to handle constrained parameters correctly.

@lorenzennio (Contributor, Author)

What do you think of this fix?

```python
def unconstrained_parameter_count(
    model: pyhf.pdf.Model, fix_pars: Optional[List[bool]] = None
) -> int:
    """Returns the number of unconstrained, non-constant parameters in a model.

    The number is the sum of all independent parameters in a fit. A shapefactor
    that affects multiple bins enters the count once for each independent bin.
    Parameters that are set to constant are not included in the count.

    Args:
        model (pyhf.pdf.Model): model to count parameters for
        fix_pars (Optional[List[bool]], optional): list of booleans specifying
            which parameters are held constant, defaults to None (use ``pyhf``
            suggestion)

    Returns:
        int: number of unconstrained parameters
    """
    n_pars = 0
    # custom fixed parameters overrule suggested fixed parameters
    if fix_pars is None:
        fix_pars = model.config.suggested_fixed()

    for parname in model.config.par_order:
        # only consider unconstrained parameters
        if not model.config.param_set(parname).constrained:
            # count entries of this parameter set not held constant
            par_slice = model.config.par_slice(parname)
            n_pars += fix_pars[par_slice].count(False)

    return n_pars
```
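For illustration, the counting logic can be exercised without pyhf using a small stand-in for the model config (the parameter layout below mirrors the reproduction spec: two unconstrained normfactors, one bin each; everything here is a mock, not cabinetry or pyhf code):

```python
# Mock parameter table: slices into the parameter vector plus a
# constrained flag, mimicking what par_slice / param_set provide.
param_info = {
    "mu": {"slice": slice(0, 1), "constrained": False},
    "mu_bkg": {"slice": slice(1, 2), "constrained": False},
}

def count_free_unconstrained(fix_pars):
    n_pars = 0
    for info in param_info.values():
        if not info["constrained"]:
            # count entries of this parameter set not held constant
            n_pars += fix_pars[info["slice"]].count(False)
    return n_pars

print(count_free_unconstrained([False, False]))  # 2: both free
print(count_free_unconstrained([False, True]))   # 1: mu_bkg fixed
```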

I can open a PR for this, if you are happy with it.

@alexander-held (Member)
Thanks for opening the PR! Looking at the original implementation, I was trying to remember why I wrote it in that specific way, but I cannot think of any obvious issues with your version. I'll take a closer look at the PR.

@lorenzennio (Contributor, Author)
No worries, let me know if you want me to change anything.
