[RFC, WIP]: Remove evaluate kernel #2342
base: main
Conversation
Aren't we going to take at least some performance hit here on kernels we don't lazily evaluate when making predictions, because we don't get to do indexing before calling the kernel?
gpytorch/kernels/kernel.py (outdated)

```python
if len(named_parameters):
    param_names, params = zip(*named_parameters)
    param_batch_shapes = [self.batch_shape] * len(params)
```
So being able to use `self.batch_shape` here relies on the fact that all parameters have a fully explicit shape (rather than broadcasting over them)?
Yes. I think this is generally true for all kernels in GPyTorch. We can add something to the documentation to be more explicit about this.
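For context, a minimal sketch (not part of the diff) of the assumption being discussed: GPyTorch kernels register their hyperparameters with the batch dimensions written out explicitly rather than relying on broadcasting.

```python
import torch
import gpytorch

# A batched RBF kernel: its lengthscale carries the batch dimensions explicitly.
kernel = gpytorch.kernels.RBFKernel(batch_shape=torch.Size([3]))
print(kernel.batch_shape)        # torch.Size([3])
print(kernel.lengthscale.shape)  # torch.Size([3, 1, 1]) -- batch dims stored explicitly

# This explicitness is what allows the code above to assume
# `param_batch_shapes = [self.batch_shape] * len(params)` for every parameter.
```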
```python
else:
    res = KernelLinearOperator(
        x1_, x2_, covar_func=self.forward, num_outputs_per_input=num_outputs_per_input, **kwargs
    )
```
We're pretty deep in the indentation here; it might make sense to pull some of the above out into helper functions to make it clearer what's going on / easier to read the code.
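For illustration only, one possible shape of such a refactor (the helper name `_lazy_kernel` is hypothetical and not something in this PR):

```python
# Hypothetical helper pulled out of the nested branch above; the name and the
# exact split are illustrative only.
def _lazy_kernel(self, x1_, x2_, num_outputs_per_input, **kwargs):
    return KernelLinearOperator(
        x1_, x2_, covar_func=self.forward, num_outputs_per_input=num_outputs_per_input, **kwargs
    )

# ... so that, at the call site, the deeply nested branch reduces to:
#     res = self._lazy_kernel(x1_, x2_, num_outputs_per_input, **kwargs)
```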
gpytorch/kernels/scale_kernel.py (outdated)

```diff
@@ -75,7 +68,7 @@ def __init__(
     outputscale_constraint = Positive()

     self.base_kernel = base_kernel
-    outputscale = torch.zeros(*self.batch_shape) if len(self.batch_shape) else torch.tensor(0.0)
+    outputscale = torch.zeros(*self.batch_shape, 1, 1) if len(self.batch_shape) else torch.zeros(1, 1)
```
Changing the parameter shapes for widely used kernels like this will likely result in a bunch of downstream backward-compatibility issues. I'm not necessarily suggesting not to do this, but those changes will need to be properly documented / announced.
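To make the backward-compatibility concern concrete, here is a minimal, hedged sketch (plain `nn.Module`s standing in for `ScaleKernel`, not an actual failure observed in this PR) of what happens when a checkpoint saved with the old scalar `raw_outputscale` is loaded into a module that registers it with trailing singleton dimensions:

```python
import torch
import torch.nn as nn

# Old-style module: outputscale stored as a scalar parameter.
class OldScale(nn.Module):
    def __init__(self):
        super().__init__()
        self.raw_outputscale = nn.Parameter(torch.tensor(0.0))

# New-style module: same parameter, now shaped (1, 1).
class NewScale(nn.Module):
    def __init__(self):
        super().__init__()
        self.raw_outputscale = nn.Parameter(torch.zeros(1, 1))

state = OldScale().state_dict()
NewScale().load_state_dict(state)  # RuntimeError: size mismatch for raw_outputscale
```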
Btw, if we're doing this, we should probably combine it with updating the shapes of the priors (cf. #1317) to make all of that consistent.
I'm going back and forth about changing parameter shapes. On one hand, it could be quite the breaking change. On the other hand, it would create more consistency, and we should also be able to dramatically simplify many kernels as well.
@Balandat thoughts? I imagine this would have the biggest impact on BoTorch.
Actually, what probably makes the most sense is leaving kernel parameter shapes the way they are currently, and potentially creating consistent parameter shapes and addressing #1317 in a separate PR.
Makes sense. I think fixing the parameter shape inconsistencies and making things consistent across the board would be good, and hopefully this would also get rid of some long-standing bugs/issues. This would impact botorch, but if we coordinate on the releases we can make sure the effect of this is minimized. It would likely generate some short-term pain for some power users (of both gpytorch and botorch) with custom setups, but I think that short-term pain is probably worth the long-term gains from consistency.
@Balandat That sounds like a good plan. Here's what I'm thinking as a course of action:
- Make a PR that makes kernel parameter sizes consistent (and adds appropriate warnings/deprecations/etc; see the sketch after this list)
- Address [Bug] Sampling from priors doesn't match shape of hyperparameters #1317
- Merge this PR
- (+ other breaking changes that we've wanted to address for a while)
After all of that, we make a major release, timed with a BoTorch release.
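One possible shape for the "warnings/deprecations" item above, sketched with a plain `nn.Module` (the class, key names, and warning text are entirely hypothetical and not part of this PR): reshape legacy-shaped parameters when loading an old state dict, and warn instead of erroring.

```python
import warnings
import torch
import torch.nn as nn

class NewScale(nn.Module):
    """Hypothetical sketch: registers raw_outputscale with the new
    (*batch_shape, 1, 1) shape, but reshapes legacy checkpoints on load."""

    def __init__(self):
        super().__init__()
        self.raw_outputscale = nn.Parameter(torch.zeros(1, 1))

    def _load_from_state_dict(self, state_dict, prefix, *args, **kwargs):
        key = prefix + "raw_outputscale"
        if key in state_dict and state_dict[key].dim() < self.raw_outputscale.dim():
            warnings.warn(
                "raw_outputscale was saved with the legacy shape; reshaping to the "
                "new (*batch_shape, 1, 1) convention. Please re-save your model.",
                DeprecationWarning,
            )
            state_dict[key] = state_dict[key].reshape(*state_dict[key].shape, 1, 1)
        super()._load_from_state_dict(state_dict, prefix, *args, **kwargs)

# Loading a legacy (scalar-shaped) checkpoint now warns instead of erroring:
legacy_state = {"raw_outputscale": torch.tensor(0.0)}
NewScale().load_state_dict(legacy_state)
```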
@jacobrgardner I don't think it'll lead to a loss in performance. This PR mostly merges LazyEvaluatedKernelTensor and KeOpsLinearOperator, since they share most of the same functionality.
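A hedged sketch of why lazy indexing is preserved. The covariance function below is a stand-in (the real code passes `self.forward`, as in the diff above), and the import path assumes `KernelLinearOperator` from the linear_operator package.

```python
import torch
from linear_operator.operators import KernelLinearOperator

def rbf_covar(x1, x2, **kwargs):
    # Stand-in covariance function; in the PR this role is played by Kernel.forward.
    return torch.exp(-0.5 * torch.cdist(x1, x2).pow(2))

x1 = torch.randn(1000, 3)
x2 = torch.randn(1000, 3)

K = KernelLinearOperator(x1, x2, covar_func=rbf_covar)

# Indexing the operator slices the stored inputs rather than an evaluated
# matrix (to my understanding of KernelLinearOperator), so densifying a block
# only calls the covariance function on the rows that are actually needed.
K_block = K[:10, :10].to_dense()  # shape (10, 10)
```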
Force-pushed from 5fba78d to 6d58b66
Force-pushed from 6d58b66 to 963c84c
**Note**: This is not a breaking change; "legacy" grids were deprecated pre v1.0.
…Kernel
- The functionality of both kernels has not disappeared, but both kernels cannot work without the last_dim_is_batch option.
- The examples/00_Basic_Usage/kernels_with_additive_or_product_structure.ipynb notebook describes how to replicate the functionality of both kernels without last_dim_is_batch.
- The functionality of this kernel has not disappeared, but this kernel cannot work without the last_dim_is_batch option.
- The examples/00_Basic_Usage/kernels_with_additive_or_product_structure.ipynb notebook describes how to replicate the functionality of this kernel using the gpytorch.utils.sum_interaction_terms utility.
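For readers who don't want to open the notebook, here is a rough, hedged sketch of the batch-and-sum pattern it describes for first-order additive structure without last_dim_is_batch (the notebook itself is the authoritative reference):

```python
import torch
import gpytorch

d = 4
x = torch.randn(20, d)

# One batch member of the base kernel per input dimension. Note that, unlike
# a shared base kernel, each dimension gets its own lengthscale this way.
base_kernel = gpytorch.kernels.RBFKernel(batch_shape=torch.Size([d]))

x_per_dim = x.mT.unsqueeze(-1)                              # (d, n, 1)
K_additive = base_kernel(x_per_dim).to_dense().sum(dim=-3)  # (n, n): sum_i k(x[..., i], x'[..., i])
```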
Force-pushed from 963c84c to daf43a3
With `KernelLinearOperator` defined in the linear operator package, `LazyEvaluatedKernelTensor` becomes kind of redundant. This PR attempts to improve the lazy evaluation of kernels in GPyTorch by doing the following:

- Removing `LazyEvaluatedKernelTensor`, instead using `KernelLinearOperator` (which is also used by KeOps kernels)
- Removing `evaluate_kernel` from GPyTorch (which will be redundant with `KernelLinearOperator`), and also removing it from the LinearOperator package.

cc/ @Balandat
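To make the user-facing effect concrete, a small hedged sketch of the before/after (the "after" part reflects my reading of the PR description, not its actual behavior):

```python
import torch
import gpytorch

kernel = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
x = torch.randn(10, 2)

lazy_K = kernel(x)

# Today: lazy_K is a LazyEvaluatedKernelTensor, and evaluate_kernel() is the
# step that materializes the underlying operator (without densifying it):
#     K_op = lazy_K.evaluate_kernel()

# After this PR (as sketched here): kernel(x) would already return a
# KernelLinearOperator, so the evaluate_kernel() call goes away and the
# operator is used directly, e.g.:
K_dense = lazy_K.to_dense()  # (10, 10)
```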