
Improving the API, how to implement constraints and extended likelihood #24

Open · maxnoe opened this issue Feb 3, 2015 · 5 comments

maxnoe (Collaborator) commented Feb 3, 2015

Constraints

These are nothing special, I suppose; we just need an operator that joins likelihoods over data sets of different sizes.
I think the logical end-user API for a constraint would look something like this:

# unconstrained model for the data
x = var("x")
mu = par("mu")
sigma = par("sigma")
model = Normal(x, mu, sigma)

# constraint on mu from another experiment:
mu_meas = var("mu_meas")
sigma_meas = var("sigma_meas")
constraint = Normal(mu_meas, mu, sigma_meas)

# join the models:
const_model = Join(model, constraint)

And then fit as usual.
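Numerically, joining the model with the constraint would presumably just add the constraint's log-likelihood to the data log-likelihood. A minimal sketch of that idea in plain scipy (the toy data, starting values, and bounds here are my own assumptions, since the `var`/`par`/`Normal`/`Join` API above doesn't exist yet):

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=500)  # toy data for x

# external measurement of mu (the constraint)
mu_meas, sigma_meas = 1.1, 0.05

def neg_log_likelihood(params):
    mu, sigma = params
    # data term plus the Gaussian constraint term: joining likelihoods
    # with different data sizes amounts to nothing more than this sum
    nll_data = -np.sum(stats.norm.logpdf(data, mu, sigma))
    nll_constraint = -stats.norm.logpdf(mu_meas, mu, sigma_meas)
    return nll_data + nll_constraint

result = minimize(neg_log_likelihood, x0=[0.0, 1.0],
                  bounds=[(None, None), (1e-6, None)])
```

The tight constraint (`sigma_meas = 0.05`) pulls the fitted `mu` toward 1.1 even though the toy data were generated with `mu = 1.0`.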

Extended Likelihood

It should be as simple as this from the user point of view:

x = var("x")
mu = par("mu")
sigma = par("sigma")
signal_model = Normal(x, mu, sigma)

tau = par("tau")
background_model = Exponential(tau)

N1 = par("N_sig")
N2 = par("N_bkg")

model = Join(N1, signal_model, N2, background_model)

Maybe we could implement the extended likelihood via a parameter attribute? Like
N1 = par("N1", num_events=True) to trigger an extended likelihood?
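Under the hood this would be the usual yield-extended NLL, `(N_sig + N_bkg) - sum_i log(N_sig * f_s(x_i) + N_bkg * f_b(x_i))`. A rough numerical sketch with plain scipy (the toy sample, starting values, and bounds are assumptions for illustration, not part of the proposal):

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

rng = np.random.default_rng(1)
# toy sample: Gaussian "signal" on top of an exponential "background"
x = np.concatenate([rng.normal(5.0, 0.5, size=300),
                    rng.exponential(scale=4.0, size=700)])

def extended_nll(params):
    mu, sigma, tau, n_sig, n_bkg = params
    density = (n_sig * stats.norm.pdf(x, mu, sigma)
               + n_bkg * stats.expon.pdf(x, scale=tau))
    # extended NLL: (N_sig + N_bkg) - sum_i log(N_sig f_s + N_bkg f_b)
    return (n_sig + n_bkg) - np.sum(np.log(density))

res = minimize(extended_nll, x0=[4.5, 1.0, 3.0, 500.0, 500.0],
               bounds=[(0, 10), (0.1, 5), (0.5, 20), (1, 5000), (1, 5000)])
```

At the optimum the total yield `n_sig + n_bkg` comes out equal to the observed number of events, which is exactly the Poisson constraint doing its job.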

ibab changed the title from “Improving the API, how to implement constraints and extenden likelihood” to “Improving the API, how to implement constraints and extended likelihood” on Feb 4, 2015
ibab (Owner) commented Feb 4, 2015

In my opinion, the most logical way to do it would be this: require users to apply Join to reduce a vectorized likelihood to a scalar one (this would just sum the log-likelihoods in Theano), and then use Join as well to combine two different models (by summing their log-likelihoods).
A constraint would work like this:

# unconstrained model for the data
x = var("x")
mu = par("mu")
sigma = par("sigma")
model = Normal(x, mu, sigma)

# constraint on mu from another experiment:
mu_meas = var("mu_meas")
sigma_meas = var("sigma_meas")
constraint = Normal(mu_meas, mu, sigma_meas)

# reduce the vectorized model to a scalar, then add the constraint:
const_model = Join(Join(model), constraint)

Alternatively, the two different uses of Join could have different names, like All and Join, or All and And. Using the same operator makes sense, though, since it does exactly the same thing (compute a sum) in both cases.
It would have to complain if passed a vector likelihood and a scalar likelihood at the same time.
Multiple scalars or multiple vectors would be fine, though.
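These Join semantics could be sketched in plain numpy like this (the `join` name and the mixing-rule check are just illustrations of the proposal, not an existing API):

```python
import numpy as np
from scipy import stats

def join(*log_likelihoods):
    """Sum log-likelihood terms; a vector term is reduced by summing
    over events.  Mixing a vector and a scalar term is an error."""
    terms = [np.asarray(t, dtype=float) for t in log_likelihoods]
    is_scalar = [t.ndim == 0 for t in terms]
    if any(is_scalar) and not all(is_scalar):
        raise ValueError("cannot mix vector and scalar likelihoods")
    return sum(t.sum() for t in terms)

data = np.array([0.1, -0.3, 0.2])
vector_ll = stats.norm.logpdf(data, 0.0, 1.0)    # per-event log-likelihoods
scalar_model = join(vector_ll)                   # Join(model): vector -> scalar
constraint = stats.norm.logpdf(0.05, 0.0, 0.1)   # scalar constraint term
total = join(scalar_model, constraint)           # Join(Join(model), constraint)
```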

I don't like the idea of implementing special mechanisms for the Extended Likelihood, as it can be interpreted as a regular MLE model.
What about implementing an operator Mix2E that can be used like this:

model = Mix2E(N1, model1, N2, model2)

This would probably have to return a scalar likelihood with the Poisson constraint applied.
Are there any cases where users might want to keep the model as a vector to modify it further? I can't think of any.

ibab (Owner) commented Feb 4, 2015

Oh, and when fit is called on a vector model, it would automatically be Joined, so you wouldn't have to take this extra step for simple models.

ibab (Owner) commented Feb 7, 2015

Another idea: It should be possible to auto-join the model.
The user would write something like this:

with Model() as model:
    # unconstrained model for the data
    x = var("x", observed=True)
    mu = var("mu")
    sigma = var("sigma")
    dist1 = Normal(x, mu, sigma)

    # constraint to mu from other experiment:
    mu_meas = var("mu_con", observed=True)
    sigma_meas = var("sigma_con", observed=True)
    constraint = Normal(mu_meas, mu, sigma_meas)

(No explicit combination of dist1 and constraint is performed.)
Then model.fit would be used to get the best estimates for mu and sigma.

This is very similar to what's done in pymc.
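For illustration, here is a minimal, self-contained sketch of the context-manager idea in pure Python. The `Model` and `normal_factor` names are placeholders I made up, not the proposed API; in the real library the distributions themselves would register with the active model:

```python
import math

_current_model = None  # the Model currently open in a `with` block

class Model:
    """Collects log-likelihood factors registered inside its context."""
    def __init__(self):
        self.factors = []

    def __enter__(self):
        global _current_model
        _current_model = self
        return self

    def __exit__(self, *exc):
        global _current_model
        _current_model = None

    def log_likelihood(self, params):
        # auto-join: sum all registered factors, no explicit Join needed
        return sum(f(params) for f in self.factors)

def normal_factor(data, mu, sigma):
    """Register a Normal log-likelihood term; mu/sigma may each be a
    parameter name (str) or a fixed numeric value."""
    def factor(params):
        m = params[mu] if isinstance(mu, str) else mu
        s = params[sigma] if isinstance(sigma, str) else sigma
        return sum(-0.5 * ((x - m) / s) ** 2
                   - math.log(s * math.sqrt(2 * math.pi)) for x in data)
    _current_model.factors.append(factor)

with Model() as model:
    normal_factor([0.9, 1.1, 1.0], "mu", "sigma")  # data term
    normal_factor([1.05], "mu", 0.1)               # constraint on mu

# no explicit combination: evaluation auto-joins both terms
ll = model.log_likelihood({"mu": 1.0, "sigma": 1.0})
```

A fit method would then simply minimize `-model.log_likelihood` over the free parameters, exactly as pymc builds its joint log-probability from the distributions declared inside the `with` block.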

maxnoe (Collaborator, Author) commented Feb 7, 2015

I also thought that one should maybe separate Model and Distribution. I'm not sure yet how to make this work, but it looks good to me.

KonstantinSchubert (Collaborator) commented

+1 on using an operator like Mix2E.

I like the use of Join() in both cases.

I agree that it should be possible to auto-join the model (passing a vector as the argument of a scalar PDF could trigger it, and fit could simply receive a list of models, which it then joins before fitting).

But I don't like magic, and I think a bit of verbosity on the user's side helps catch mistakes and makes error messages more helpful to novices.
