compatibility with Flux #396
Would it be compatible with …
Typically no, …
Just having completed the upsampling code: I would now rather use KernelAbstractions.jl for such work, as it saves you from writing almost the same code twice. Plus, I think it's a much lighter dependency than CUDA. In the end I imagine having super-portable (CPU, GPU (Nvidia & AMD!)) code with gradients, which could be recycled in JuliaMath, JuliaImages, and NNlib. It could even be used to write specialized methods for different dimension orderings and dispatch via NamedDims.
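For a picture of what that could look like, here is a minimal sketch of a single nearest-neighbour upsampling kernel that runs on CPU and GPU backends alike (assuming KernelAbstractions ≥ 0.9 for `get_backend`; `nearest_kernel!` and `upsample_nearest_1d!` are illustrative names, not any package's API):

```julia
using KernelAbstractions

# One work item per output element; runs on CPU(), CUDABackend(), ROCBackend(), ...
@kernel function nearest_kernel!(y, @Const(x), scale)
    i = @index(Global)
    @inbounds y[i] = x[(i - 1) ÷ scale + 1]   # map output index back to input index
end

function upsample_nearest_1d!(y, x, scale::Integer)
    backend = get_backend(y)                  # pick the backend from the array type
    nearest_kernel!(backend)(y, x, scale; ndrange = length(y))
    KernelAbstractions.synchronize(backend)
    return y
end

# usage on the CPU:
# upsample_nearest_1d!(zeros(8), [1.0, 2.0, 3.0, 4.0], 2)
```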
Related to the gradient computation, it would be helpful to have a …
Could you point to the gradient function?
I don't know too much about the code, but simple debugging tells me there are 6 gradient functions. These have special definitions: …
These call the general gradient functions: …
I suspect the indexing.jl one is the one which needs to be used here.
Found it! Interpolations.jl/src/Interpolations.jl, line 413 in ec35013.
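For context, the user-facing way to evaluate that gradient is `Interpolations.gradient` on an interpolation object; a small usage sketch:

```julia
using Interpolations

y = sin.(1:10)
itp = interpolate(y, BSpline(Cubic(Reflect(OnCell()))))
Interpolations.gradient(itp, 2.5)   # 1-element SVector: d itp/dx at x = 2.5
```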
Now that #414 is finished, how do we rewrite the upsampling functions? Going through the NNlib code, it seems like, now that we have ChainRules integration, we can just rewrite stuff like upsample_nearest and its gradient. If it's just that, I will open a PR there.
Have you taken a look at http://juliamath.github.io/Interpolations.jl/latest/devdocs/ yet?
I have read that page and I think I get it, a bit. But I was wondering if we should rather write them as simple functions using interpolation objects. Like, upsample_nearest is something like

```julia
itp = interpolate(x, (NoInterp(), NoInterp()))
```

We already have parent axes, which can be used to determine size.
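To make that concrete, here is one way a nearest-neighbour upsample could be phrased with an interpolation object; a rough sketch using `BSpline(Constant())` rather than `NoInterp` (since `Constant` gives nearest-neighbour semantics at fractional points), with `Flat()` extrapolation for sample points near the edges. `upsample_nearest_itp` is an illustrative name:

```julia
using Interpolations

function upsample_nearest_itp(x::AbstractVector, scale::Integer)
    # Constant() rounds to the nearest knot; Flat() handles sample points
    # that fall just outside [1, length(x)] at the boundaries.
    itp = extrapolate(interpolate(x, BSpline(Constant())), Flat())
    [itp((i - 0.5) / scale + 0.5) for i in 1:scale * length(x)]
end

upsample_nearest_itp([1.0, 2.0], 2)   # [1.0, 1.0, 2.0, 2.0]
```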
Would it work similarly to the kernels we have now in NNlib? Do we get the correct gradients out, etc.? Worth doing a comparison imo.
I don't know what the kernel system is in NNlib (not familiar with that codebase at all). But I was thinking I can replace the code for upsample_nearest and its gradient function with the Zygote call. They have tests for these functions, so rewriting should be easier.
Hi! @DhairyaLGandhi this serves as a reply to FluxML/NNlibCUDA.jl#5 (comment), but I think it might be better to keep the discussion in one place :) Yes, I would find it cool if it works and we can get rid of duplicate code (of which I really wrote a lot, sorry for that). But I don't know if the performance can match the handwritten kernels I ported from PyTorch (at least for now). For example, I benchmarked the CPU implementation of bilinear upsampling against …
I don't know either 😄 I just ported PyTorch code brute-force and it turned out to be quite fast, but also a bit unwieldy, which can probably be improved by properly leveraging the dispatch. The nearest-neighbour part was written by @mcabott. Lastly, maybe someone else can also benchmark the interpolations to check my results?
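In case someone wants to reproduce the comparison, a minimal harness along those lines might look like this (array sizes and the per-slice Interpolations call are assumptions for illustration, not the benchmark actually run above):

```julia
using BenchmarkTools, Interpolations, NNlib

x = rand(Float32, 64, 64, 4, 8)          # W × H × C × N layout, as NNlib expects

@btime upsample_bilinear($x, (2, 2))     # NNlib's hand-written kernel

# Interpolations.jl on a single W × H slice, sampled on a 2× finer grid
slice = x[:, :, 1, 1]
itp = interpolate(slice, BSpline(Linear()))
pts = range(1, 64, length = 128)
@btime [$itp(i, j) for i in $pts, j in $pts]
```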
Pretty confused as to the current state of this. I'm trying to run the same example from #414, but it errors:

```julia
using Interpolations
using Zygote

y = sin.(1:10)
itp = interpolate(y, BSpline(Cubic(Reflect(OnCell()))))
Zygote.gradient(itp, 2)
```

Did something change in this package or Zygote?
Ah, I see, that merge hasn't made it to a release yet. Disregard me.
Sorry to revive this, but the above example does not work when I run it despite the previous merge. Further, when I run the package tests in a fresh Julia installation,

```julia
(@v1.7) pkg> add Interpolations
(@v1.7) pkg> test Interpolations
```

on the latest version (0.13.4), I find an error for the test related to the previously discussed Zygote gradients issue. It seems like the problem is ChainRulesCore? (I also find a many-times-repeated error when running the tests, in test/gradient.jl:216, but the place for discussing this is probably another issue.)
Have you tried this on 1.6? There are known inference issues on 1.7.
The example from above reproduces this error.

```julia
julia> using Interpolations, Zygote

julia> y = sin.(1:10);

julia> itp = interpolate(y, BSpline(Cubic(Reflect(OnCell()))));

julia> z, back = Zygote.pullback(itp, 2.0);

julia> z
0.769963450028415

julia> back(1.0)
([-0.35017548837401463],)

julia> Zygote.gradient(itp, 2.0);
ERROR: MethodError: no method matching (::ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}})(::StaticArrays.SVector{1, Float64})

(jl_hY83L3) pkg> st
Status `/private/var/folders/yq/4p2zwd614y59gszh7y9ypyhh0000gn/T/jl_hY83L3/Project.toml`
  [a98d9a8b] Interpolations v0.13.4
  [e88e6eb3] Zygote v0.6.32
```

The error is coming from ChainRulesCore, but only because it's performing a sanity check on the gradient returned by the rule. Above, `pullback` works while `gradient` fails. Where the bug is, or whether this used to work, I don't know.
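The failing projection can be reproduced in isolation, which makes the mismatch easier to see: `ProjectTo` of a scalar has no method for an `SVector` cotangent, so the rule needs to return a plain scalar. A small illustration (not Interpolations.jl's actual rule code):

```julia
using ChainRulesCore, StaticArrays

p = ProjectTo(2.0)       # a ProjectTo{Float64}, as in the stack trace
# p(SVector(-0.35))      # MethodError with the versions above: no method for an SVector
p(only(SVector(-0.35)))  # unwrapped to a plain Float64, projection succeeds: -0.35
```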
Oh, the problem is just that there has not been a release since #465. On master this works.
Just switching to master worked for me. Thanks for pointing that out, and for the explanation!
I just released a new version, 0.13.5.
I'd love to have the gradient with respect to the original array as well, and at that point I believe we could close this. Is there some interest in putting together a PR to that end?
@DhairyaLGandhi, me too, but implementing that seems quite a bit harder, since it has to be done separately for each type of interpolation. One simple case that might be a good starting point would be to support linear gridded interpolation.
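For the linear case the output is linear in the data, so the gradient with respect to the data at a query point is just the pair of interpolation weights; a minimal sketch on a unit grid (`data_gradient` is a hypothetical helper, not package API):

```julia
# Gradient of 1-D linear interpolation on the grid 1:length(y)
# with respect to the data vector y, evaluated at query point x.
function data_gradient(y::AbstractVector, x::Real)
    g = zero(y)
    i = clamp(floor(Int, x), 1, length(y) - 1)
    t = x - i                 # fractional offset within the cell, in [0, 1]
    g[i]   = 1 - t            # weight on the left knot
    g[i+1] = t                # weight on the right knot
    return g                  # g[k] == ∂ itp(x) / ∂ y[k]
end

data_gradient([0.0, 1.0, 4.0], 2.25)   # [0.0, 0.75, 0.25]
```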
@DhairyaLGandhi @stevengj I would also love to have the gradient with respect to the data being interpolated. Especially for cubic splines, because I need smoothness... The problem is that I have no clue where to start implementing it. Does the interpolation involve solving equations, or are there some constant weights pre-computed? Also, there is very little online about how cubic spline interpolation works in 2D or 3D. Any pointer to resources would be helpful. Is there any other package that could do this?
@DhairyaLGandhi @stevengj: Internally, the gradient is calculated by the … I would be happy if `weightedindexes` were exported as an API; then I could use it as the gradient. Could someone make that happen?
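On the "solving equations vs. pre-computed weights" question: cubic B-spline interpolation prefilters the data by solving a banded linear system once, and evaluation then uses fixed-support weights, so the whole data-to-value map stays linear. That means, until something like `weightedindexes` is exported, one dependable (if O(n²)) way to get the gradient with respect to the data is to interpolate basis vectors; a sketch with the hypothetical name `data_gradient_dense`:

```julia
using Interpolations, LinearAlgebra

# ∂ itp(x) / ∂ y[i], obtained by interpolating each basis vector e_i.
# Valid because interpolate(y, mode)(x) is linear in y (prefiltering included).
function data_gradient_dense(y::AbstractVector, x::Real,
                             mode = BSpline(Cubic(Line(OnGrid()))))
    n = length(y)
    [interpolate(Matrix{Float64}(I, n, n)[:, i], mode)(x) for i in 1:n]
end

# sanity check: the interpolant value equals dot(gradient, data)
y = sin.(1:10)
itp = interpolate(y, BSpline(Cubic(Line(OnGrid()))))
itp(2.7) ≈ dot(data_gradient_dense(y, 2.7), y)   # true
```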
Hi,
this is to enquire about the possibility of using Interpolations.jl to build image upsampling or downsampling layers in Flux.jl.
We recently added a bilinear upsampling function (FluxML/NNlib.jl#262), but I was wondering if we could instead leverage some of the code here. The requisites would be: …
I'm not familiar with the codebase here; maybe this is a long shot, but it's worth making an attempt and creating awareness about the possibility of this kind of interaction.
Moreover, GPU- and automatic-differentiation-friendly interpolations would generally benefit the ML ecosystem.
Best,
CL
cc @johnnychen94
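For concreteness, the kind of layer this issue is after could be as small as the following sketch wrapping NNlib's bilinear upsampling (`BilinearUpsample` is an illustrative name; Flux's own API may differ):

```julia
using Flux, NNlib

# A parameter-free layer: no trainable state, so it drops into a Chain as-is.
struct BilinearUpsample
    scale::NTuple{2,Int}
end

(u::BilinearUpsample)(x::AbstractArray{<:Any,4}) = upsample_bilinear(x, u.scale)

# usage:
# m = Chain(Conv((3, 3), 3 => 8, pad = 1), BilinearUpsample((2, 2)))
# m(rand(Float32, 32, 32, 3, 1)) |> size   # (64, 64, 8, 1)
```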