
[RFC, WIP]: Remove evaluate kernel #2342

Draft · wants to merge 6 commits into main
Conversation

gpleiss
Member

@gpleiss gpleiss commented May 13, 2023

With KernelLinearOperator defined in the linear operator package, LazyEvaluatedKernelTensor becomes kind of redundant.

This PR attempts to improve the lazy evaluation of kernels in GPyTorch by doing the following:

  • Remove LazyEvaluatedKernelTensor, instead using KernelLinearOperator (which is also used by the KeOps kernels)
  • Remove evaluate_kernel from GPyTorch (it becomes redundant with KernelLinearOperator) and also remove it from the LinearOperator package
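
For concreteness, here is a minimal sketch of what kernel evaluation looks like when it goes through KernelLinearOperator instead of LazyEvaluatedKernelTensor. The constructor arguments mirror the call that appears in the review snippets below (covar_func, num_outputs_per_input); treat this as an illustration rather than the final implementation:

```python
import torch
from linear_operator.operators import KernelLinearOperator
from gpytorch.kernels import RBFKernel

kernel = RBFKernel()
x1 = torch.randn(100, 3)
x2 = torch.randn(80, 3)

# Wrap the kernel's `forward` so the covariance matrix is only materialized
# (in full or in blocks) when a downstream operation actually needs it.
lazy_covar = KernelLinearOperator(
    x1, x2, covar_func=kernel.forward, num_outputs_per_input=(1, 1)
)
print(lazy_covar.shape)  # torch.Size([100, 80])
```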

cc/ @Balandat

@jacobrgardner
Member

Aren't we going to take at least some amount of performance hit here on kernels you don't lazily evaluate when making predictions, because we don't get to do indexing before calling the kernel?

gpytorch/kernels/kernel.py

if len(named_parameters):
    param_names, params = zip(*named_parameters)
    # Assumes every registered parameter carries the kernel's full batch shape explicitly.
    param_batch_shapes = [self.batch_shape] * len(params)
Collaborator

So being able to use self.batch_shape here relies on the fact that all parameters have a fully explicit shape (rather than broadcasting over them)?

Member Author

Yes. I think this is generally true for all kernels in GPyTorch. We can add something to the documentation to be more explicit about this.
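
For example, the stock kernels already register their parameters with the full batch shape spelled out (quick illustrative check, not part of this diff):

```python
import torch
from gpytorch.kernels import RBFKernel

# The lengthscale is registered with shape (*batch_shape, 1, 1) rather than
# relying on broadcasting up from a smaller shape.
kernel = RBFKernel(batch_shape=torch.Size([3]))
print(kernel.lengthscale.shape)  # torch.Size([3, 1, 1])
```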

else:
    res = KernelLinearOperator(
        x1_, x2_, covar_func=self.forward, num_outputs_per_input=num_outputs_per_input, **kwargs
    )
Collaborator

We're pretty deep in the indentation here; it might make sense to pull some of the above out into helper functions to make it clearer what's going on / easier to read the code.
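
Something along these lines, say (the helper name is hypothetical, just to illustrate the suggestion):

```python
def _kernel_linear_operator_params(self):
    # Hypothetical helper: gather the kernel's named parameters and their
    # (fully explicit) batch shapes, so __call__ stays flat and readable.
    named_parameters = list(self.named_parameters())
    if not named_parameters:
        return (), (), ()
    param_names, params = zip(*named_parameters)
    param_batch_shapes = [self.batch_shape] * len(params)
    return param_names, params, param_batch_shapes
```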

@@ -75,7 +68,7 @@ def __init__(
         outputscale_constraint = Positive()

         self.base_kernel = base_kernel
-        outputscale = torch.zeros(*self.batch_shape) if len(self.batch_shape) else torch.tensor(0.0)
+        outputscale = torch.zeros(*self.batch_shape, 1, 1) if len(self.batch_shape) else torch.zeros(1, 1)
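
(For context, the trailing singleton dimensions presumably exist so that the outputscale broadcasts directly against the (..., n, n) covariance without an unsqueeze in forward; a quick illustration of that broadcasting:)

```python
import torch

batch_shape, n = torch.Size([3]), 5
outputscale = torch.zeros(*batch_shape, 1, 1)  # new shape: (3, 1, 1)
covar = torch.randn(*batch_shape, n, n)        # (3, 5, 5)
scaled = outputscale.exp() * covar             # broadcasts to (3, 5, 5) with no reshaping
print(scaled.shape)  # torch.Size([3, 5, 5])
```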
Collaborator

Changing the parameter shapes for widely used kernels like this will likely result in a bunch of downstream backward compatibility issues. I'm not necessarily suggesting not to do this, but those changes will need to be properly documented / announced.

Collaborator

btw, if we're doing this, we should probably combine it with updating the shapes of the priors (cf. #1317) to make all of that consistent

Member Author

I'm going back and forth about changing parameter shapes. On one hand, it could be quite the breaking change. On the other hand, it would create more consistency, and we should be able to dramatically simplify many kernels as well.

@Balandat thoughts? I imagine this would have the biggest impact on BoTorch.

Member Author

Actually, what probably makes the most sense is leaving kernel parameter shapes the way they are currently, and potentially creating consistent parameter shapes and addressing #1317 in a separate PR.

Collaborator

Makes sense. I think fixing the parameter shape inconsistencies and making things consistent across the board would be good, and hopefully this would also get rid of some long-standing bugs/issues. This would impact botorch, but if we coordinate on the releases we can make sure that the effect of this is minimized. It would likely generate some short-term pain for some power users (of both gpytorch and botorch) with custom setups, but I think that short-term pain is probably worth the long-term gains from consistency.

Member Author

@Balandat That sounds like a good plan. Here's what I'm thinking as a course of action:

After all of that, we make a major release, timed with a BoTorch release.

@gpleiss
Member Author

gpleiss commented May 25, 2023

> Aren't we going to take at least some amount of performance hit here on kernels you don't lazily evaluate when making predictions, because we don't get to do indexing before calling the kernel?

@jacobrgardner I don't think it'll lead to a loss in performance. This PR mostly merges LazyEvaluatedKernelTensor and KeOpsLinearOperator, since they share most of the same functionality.

**Note**: This is not a breaking change; "legacy" grids were deprecated
pre v1.0.
…Kernel

- The functionality of both kernels has not disappeared, but both
  kernels cannot work without the last_dim_is_batch option.
- The examples/00_Basic_Usage/kernels_with_additive_or_product_structure.ipynb
  notebook describes how to replicate the functionality of both kernels
  without last_dim_is_batch (see the sketch below).
- The functionality of this kernel has not disappeared, but this
  kernel cannot work without the last_dim_is_batch option.
- The examples/00_Basic_Usage/kernels_with_additive_or_product_structure.ipynb
  notebook describes how to replicate the functionality of this kernel
  using the gpytorch.utils.sum_interaction_terms utility.
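
The notebook is the authoritative recipe, but roughly, the same additive/product structure can be recovered with plain kernel composition over active_dims instead of last_dim_is_batch (minimal sketch under that assumption):

```python
import operator
from functools import reduce

import torch
from gpytorch.kernels import RBFKernel

x = torch.randn(100, 3)

# One base kernel per input dimension, each restricted via active_dims.
per_dim = [RBFKernel(active_dims=[i]) for i in range(x.shape[-1])]

# Summing kernels builds an AdditiveKernel; multiplying builds a ProductKernel.
additive_kernel = reduce(operator.add, per_dim)
product_kernel = reduce(operator.mul, per_dim)

print(additive_kernel(x, x).shape)  # torch.Size([100, 100])
print(product_kernel(x, x).shape)   # torch.Size([100, 100])
```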