Some questions about the API #29
Comments
I've not much to do with the package, but these all sound like good generalisations
tbh I thought this one was already the case (it seems it might be for Line 42 in 7a62068)
Ah interesting. I wonder how that happened! Though that method is still slightly more restrictive than we would like. Concretely, we would be after something like the following being permissible:
`fit(::Template, x::AbstractVector, y::AbstractVector)`
`predict(::Model, x::AbstractVector)`
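To make the proposed signatures concrete, here is a minimal sketch in plain Julia. `MeanTemplate` and `MeanModel` are hypothetical stand-ins invented for illustration, not types from Models.jl, and the `fit`/`predict` functions are defined locally rather than extending the package's own. The sketch only shows that a feature matrix can be passed as an `AbstractVector` with one element per observation.

```julia
using Statistics

# Hypothetical stand-ins for the Template / Model types discussed above.
struct MeanTemplate end

struct MeanModel
    μ::Float64
end

# Inputs and outputs are AbstractVectors with one element per observation.
function fit(::MeanTemplate, x::AbstractVector, y::AbstractVector{<:Real})
    length(x) == length(y) ||
        throw(DimensionMismatch("x and y must have the same length"))
    return MeanModel(mean(y))
end

# Predict the training mean at every requested input.
predict(m::MeanModel, x::AbstractVector) = fill(m.μ, length(x))

X = [1.0 2.0 3.0;
     4.0 5.0 6.0]              # 2 features × 3 observations
x = collect(eachcol(X))        # vector of 3 feature vectors
y = [1.0, 2.0, 3.0]

model = fit(MeanTemplate(), x, y)
yhat = predict(model, x)
```

Under this convention the element type of `x` is left open, so scalar inputs such as `[0.1, 0.2, 0.3]` would go through the same methods.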
I like the move from
That happened in some early model development when the Models.jl API was not very well-defined/restrictive.
Sorry for the slow response @wytbella -- had to think a bit about the output stuff.
To be clear, I'm not suggesting dispensing with the current API -- presumably that would break lots of existing code unnecessarily. I would prefer non-breaking extensions where possible. I think that extending the API to officially allow for
num_observations indeed.
In this particular instance, probably. In general, the approach we take in JuliaGPs is to avoid stating what type each element of the inputs has to be. So
I've been thinking about this a bit, and I think it might just be easier for now to leave this aspect of the API.
Yes, this is also my understanding. We could certainly make the things that JuliaGPs provides implement this requirement for multi-output GPs, it's just a bit restricting.
This is certainly one option. I think it's probably a good one. IIRC the need for this has been discussed before...
I think this is the crux of the reason that I'm happy to not try and get the JuliaGPs way of doing multi-output things adopted here (but I would of course be happy to do so if it's something that people like the idea of doing!). The way that we handle this in JuliaGPs is explained here -- essentially we convert all multi-output GPs into single-output GPs by extending their inputs to also contain an integer saying which output a given input corresponds to. It seems to work remarkably well for internal stuff, because people building stuff on top of the API don't immediately have to care whether they're dealing with a single-output or multi-output GP, and it means that we don't have to do anything special (in the API) to handle situations when you only observe some of the outputs at each "input" (the term we use for this is heterotopic, because it seems to be used elsewhere). Whether it's what you want for user-facing stuff is less clear to me: it would certainly work, but whether it's the most intuitive thing if you're working with data that is essentially always vector-valued is less clear. I think people tend to think in terms of vector-valued outputs in that context, which is presumably the reason for the current API.
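The input-extension idea described above can be sketched in plain Julia. This is an illustration only, not the JuliaGPs implementation: `extend_inputs` is a hypothetical helper, and representing each extended input as an `(input, output_index)` tuple is an assumption made for the example.

```julia
# Hypothetical helper: pair every input with every output index.
# Ordering: all inputs for output 1, then all inputs for output 2, and so on.
extend_inputs(x::AbstractVector, num_outputs::Integer) =
    vec([(xn, p) for xn in x, p in 1:num_outputs])

x = [0.1, 0.2, 0.3]        # 3 input locations
Y = [1.0 10.0;             # 3 × 2 matrix: column p holds output p
     2.0 20.0;
     3.0 30.0]

x_ext = extend_inputs(x, size(Y, 2))   # 6 (input, output index) pairs
y_ext = vec(Y)                         # 6 scalar observations, same ordering

# Heterotopic case: only some outputs are observed at each input, so the
# pairs are listed explicitly. Nothing else about the interface changes.
x_het = [(0.1, 1), (0.2, 2), (0.3, 1), (0.3, 2)]
y_het = [1.0, 20.0, 3.0, 30.0]
```

In both cases the extended inputs and the flattened outputs are vectors of the same length, which is what lets code built on top treat the problem as single-output.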
Ahh, I see. Good to know the history.
Thanks for the explanation @willtebbutt! I think I'm on board with the For And just to double check that my understanding is correct: if we predict some multi-output variables independently for each hour (say), are the lengths of the inputs and outputs both num_hours? If we want to predict multiple hours jointly, we have to edit the inputs as well to ensure that the lengths of the inputs and outputs match, right?
Excellent -- I'll open a PR to update docs and test utils.
Not quite. The proposal would be that the length of the inputs is always equal to I think it would be best to hold off doing this for now though, I agree. It's one of those APIs that seems to work really well for internals, and lets you express all of the things you might want to express, but isn't the most immediately intuitive thing.
The JuliaGPs org is trying to figure out how best to provide a high-level front-end for our GPs -- currently they're useful for researchers and people who know a bit more about GPs, but we've not built functionality which lets people just call "fit" and expect something sensible to happen.
We're investigating all of the ML frameworks that we can find in the Julia ecosystem to figure out which ones are likely to work for us (see e.g. https://github.com/willtebbutt/MLJAbstractGPsGlue.jl/). We might pick one, or we might pick a couple if there's a good reason to do so.
To that end, I have a couple of API-related questions, to try and establish where there is / is not flexibility in the current Models.jl API:
- `AbstractVector`s of the same length as the number of outputs. For an explanation of this, see our API docs and design discussion docs. The Models.jl API presently requires that inputs are `AbstractMatrix`s. Would it be possible to generalise this to `AbstractVector`s?
- `predict` must be a vector of distributions. Often, we're interested in joint predictions in JuliaGPs, so it would be nice to return a single distribution object when someone calls `predict`, which represents the joint distribution over the predictions at all locations requested by the user.

None of these are show-stoppers for us, but it would be good to know how set-in-stone they are.
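As a rough illustration of why a joint prediction can matter, the sketch below compares the variance of a sum of three predictions computed from a full covariance matrix with the value obtained from the per-location variances alone. The mean vector and covariance matrix are made-up numbers, and no Models.jl or JuliaGPs functions are used.

```julia
using LinearAlgebra

# Hypothetical joint Gaussian prediction at three test locations.
μ = [1.0, 2.0, 3.0]
Σ = [1.0 0.8 0.5;
     0.8 1.0 0.8;
     0.5 0.8 1.0]

# A vector of per-location distributions keeps only μ and diag(Σ).
marginal_vars = diag(Σ)

# Variance of the sum of the three predictions.
w = ones(3)
var_joint = dot(w, Σ * w)                # uses the off-diagonal covariances
var_marginals_only = sum(marginal_vars)  # implicitly treats locations as independent
```

Here the two answers differ because the off-diagonal terms are non-zero, and those terms are exactly what a vector of marginal distributions cannot carry.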