
compatibility with Flux #396

Open
CarloLucibello opened this issue Jan 8, 2021 · 25 comments

@CarloLucibello

Hi,
this is to enquire about the possibility of using Interpolations.jl to build image upsampling or downsampling layers in Flux.jl. We recently added a bilinear upsampling function in FluxML/NNlib.jl#262, but I was wondering if we could instead leverage some of the code here. The requirements would be:

  • handle the batch dimension
  • gradient computation
  • compatibility with CuArrays

I'm not familiar with the codebase here, so maybe this is a long shot, but it seems worth making the attempt and raising awareness of this kind of interaction. Moreover, GPU- and automatic-differentiation-friendly interpolations would benefit the ML ecosystem generally.
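
For concreteness, here is a rough sketch (hypothetical, not the NNlib implementation) of what an Interpolations.jl-based bilinear upsample over the two spatial dimensions of a WHCN array might look like; the scalar inner loop is exactly the part that would need a GPU-friendly formulation:

using Interpolations

function upsample_bilinear_itp(x::AbstractArray{T,4}, k::NTuple{2,Int}) where T
    W, H, C, N = size(x)
    Wo, Ho = W * k[1], H * k[2]
    y = similar(x, Wo, Ho, C, N)
    for n in 1:N, c in 1:C
        itp = interpolate(view(x, :, :, c, n), BSpline(Linear()))
        for j in 1:Ho, i in 1:Wo
            # map output pixel (i, j) to fractional input coordinates
            # (align_corners-style mapping; assumes Wo, Ho > 1)
            y[i, j, c, n] = itp(1 + (i - 1) * (W - 1) / (Wo - 1),
                                1 + (j - 1) * (H - 1) / (Ho - 1))
        end
    end
    return y
end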

Best,
CL

cc @johnnychen94

@mkitti
Collaborator

mkitti commented Jan 8, 2021

Would “compatible with CuArrays” create a dependency on CUDA?

@CarloLucibello
Author

Typically no: CuArrays just follow the abstract array interface, although scalar indexing leads to horrible performance.
To address such cases a CUDA kernel is needed (and with it the CUDA.jl dependency).

@maxfreu

maxfreu commented Feb 5, 2021

Having just completed the upsampling code: I would now rather use KernelAbstractions.jl for such work, as it saves you from writing almost the same code twice. Plus, I think it's a much lighter dependency than CUDA. In the end I imagine having super-portable (CPU, GPU (Nvidia & AMD!)) code with gradients, which can be recycled in JuliaMath, JuliaImages, and NNlib. It could even be used to write specialized methods for different dimension orderings and dispatch via NamedDims.
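
For illustration, a minimal sketch of that idea with the current KernelAbstractions.jl API (names are hypothetical, and older releases used a device/event-based launch API instead of get_backend/synchronize):

using KernelAbstractions

# one kernel definition that runs on CPU and GPU backends alike
@kernel function upsample_nearest_kernel!(y, @Const(x), k)
    i, j = @index(Global, NTuple)
    @inbounds y[i, j] = x[(i - 1) ÷ k + 1, (j - 1) ÷ k + 1]
end

function upsample_nearest_ka(x::AbstractMatrix, k::Int)
    y = similar(x, size(x, 1) * k, size(x, 2) * k)
    backend = get_backend(x)  # CPU() for Array, CUDABackend() for CuArray, ...
    upsample_nearest_kernel!(backend)(y, x, k; ndrange = size(y))
    KernelAbstractions.synchronize(backend)
    return y
end

The same kernel definition then runs on the CPU, CUDA, and ROCm backends, which is exactly the duplicate-code saving described above.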

@moesphere

Related to the gradient computation: it would be helpful to have rrule and frule functions defined (from the ChainRulesCore package) so that Interpolations can be used with AD, e.g. Zygote. A gradient function is already defined, so if I understand correctly, the only missing piece is a gradient with respect to the fields of Interpolations objects; see here
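
A hedged sketch of what such an rrule could look like, reusing the existing gradient function (this is roughly the shape of what #414 later added to the package; NoTangent is the ChainRulesCore 1.x name, and the first pullback slot is where a tangent for the interpolant's data fields would eventually go):

using ChainRulesCore, Interpolations

function ChainRulesCore.rrule(itp::Interpolations.AbstractInterpolation, x::Number)
    y = itp(x)
    function itp_pullback(ȳ)
        # Interpolations.gradient returns a 1-element SVector here;
        # unwrap it so the tangent of a scalar input is a scalar
        ∂x = only(Interpolations.gradient(itp, x)) * ȳ
        return NoTangent(), ∂x
    end
    return y, itp_pullback
end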

@DhairyaLGandhi

Could you point to the gradient function?

@rick2047
Contributor

rick2047 commented Apr 8, 2021

I don't know too much about the code, but simple debugging tells me there are 6 gradient functions

These have special definitions

  • indexing.jl (×2)
  • monotonic.jl

These call the general gradient functions

  • Interpolations.jl
  • extrapolation.jl
  • scaling.jl

I suspect the indexing.jl one is the one which needs to be used here.

@DhairyaLGandhi

Found it!

function gradient(itp::AbstractInterpolation, x::Vararg{UnexpandedIndexTypes})

@rick2047
Contributor

Now that #414 is finished, how do we rewrite the upsampling functions? Going through the NNlib code, it seems like now that we have ChainRules integration we can just rewrite stuff like upsample_nearest and its gradient. If it's just that, I will open a PR there.

@mkitti
Collaborator

mkitti commented Apr 17, 2021

Have you taken a look at http://juliamath.github.io/Interpolations.jl/latest/devdocs/ yet?

@rick2047
Contributor

I have read that page and I think I mostly get it. But I was wondering whether we should instead write them as simple functions using interpolation objects, so that upsample_nearest becomes something like

itp = interpolate(x, (NoInterp(), NoInterp()))

We already have parent axes, which can be used to determine the size.
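
For nearest-neighbour specifically, BSpline(Constant()) may be closer to what's needed than NoInterp; a rough sketch (hypothetical name, and the border semantics differ slightly from NNlib's pixel replication):

using Interpolations

function upsample_nearest_itp(x::AbstractMatrix, k::Int)
    itp = interpolate(x, BSpline(Constant()))  # Constant() == nearest neighbour
    xs = range(1, size(x, 1), length = size(x, 1) * k)
    ys = range(1, size(x, 2), length = size(x, 2) * k)
    return [itp(i, j) for i in xs, j in ys]
end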

@DhairyaLGandhi

Would it work similarly to the kernels we have now in NNlib? Do we get the correct gradients out, etc.?

Worth doing a comparison imo

@rick2047
Contributor

I don't know what the kernel system is in NNlib (I'm not familiar with that codebase at all). But I was thinking I could replace the code for upsample_nearest and its gradient function with the Zygote call. They have tests for these functions, so rewriting should be easier.

@maxfreu

maxfreu commented Apr 28, 2021

Hi! @DhairyaLGandhi, this serves as a reply to FluxML/NNlibCUDA.jl#5 (comment), but I think it might be better to keep the discussion in one place :)

Yes, I would find it cool if it works and we can get rid of duplicate code (of which I really wrote a lot, sorry for that). But I don't know if the performance can match the handwritten kernels I ported from PyTorch (at least for now). For example, I benchmarked the CPU implementation of bilinear upsampling against imresize! in ImageTransformations.jl (which uses Interpolations.jl under the hood), and NNlib was 2.2x faster, even single-threaded and with sub-optimal memory layout. Furthermore, the ImageTransformations.jl code doesn't work on GPUs when scalar getindex is disallowed.

So I don't really know what the best way forward is. Maybe to enhance Interpolations.jl and then use it? But how should all the specialized GPU code be handled, like the setup for the kernel calls (setting thread and block sizes)? Create CUDAinterpolations.jl so that Interpolations.jl doesn't have to depend on CUDA? Or simply proceed with collecting specialized code for neural networks / deep learning in NNlib, and just pick the stuff that really makes things faster and simpler from other packages? Note that, at least for upsampling / interpolations, things can be simplified a lot by smarter dispatch, which I can maybe hand in two PRs down the line or so.

@rick2047

I don't know what the kernel system is in NNlib

I don't know either 😄 I just ported PyTorch code by brute force and it turned out to be quite fast, but also a bit unwieldy, which can probably be improved by properly leveraging dispatch. The nearest-neighbour part was written by @mcabbott.

Lastly, maybe someone else can also benchmark the interpolations to check my results?
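
For anyone who wants to reproduce it, the comparison was roughly along these lines (a sketch; exact sizes, scales, and layouts may differ from what I ran):

using BenchmarkTools, NNlib, ImageTransformations

x = rand(Float32, 128, 128, 1, 1)     # WHCN layout expected by NNlib
@btime upsample_bilinear($x, (2, 2))

img = rand(Float32, 128, 128)
out = similar(img, 256, 256)
@btime imresize!($out, $img)          # uses Interpolations.jl internally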

@kiranshila

kiranshila commented May 26, 2021

Pretty confused as to the current state of this. I'm trying to run the same example from #414 but get the following error:

using Interpolations
using Zygote
y = sin.(1:10)
itp = interpolate(y,BSpline(Cubic(Reflect(OnCell()))))
Zygote.gradient(itp, 2)
ERROR: ArgumentError: unable to check bounds for indices of type Interpolations.WeightedAdjIndex{4, Ratios.SimpleRatio{Int64}}

Did something change in this package or Zygote?

@kiranshila

Ah I see, that merge hasn't made it to a release yet. Disregard me.

@jmsull

jmsull commented Dec 14, 2021

Sorry to revive this, but the above example does not work when I run it, despite the previous merge. Further, when I run the package tests in a fresh Julia installation

(@v1.7) pkg> add Interpolations
(@v1.7) pkg> test Interpolations

on the latest version (0.13.4) I find the error:

ChainRulesCore: Error During Test at /Users/jsull/.julia/packages/Interpolations/3gTQB/test/chainrules.jl:12
  Test threw exception
  Expression: (Zygote.gradient(itp, 1))[1] == Interpolations.gradient(itp, 1)
  MethodError: no method matching (::ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}})(::SVector{1, Float64})
  Closest candidates are:
    (::ChainRulesCore.ProjectTo{T})(::ChainRulesCore.AbstractZero) where T at ~/.julia/packages/ChainRulesCore/sHMAp/src/projection.jl:120
    (::ChainRulesCore.ProjectTo{<:Number})(::ChainRulesCore.Tangent{<:Number}) at ~/.julia/packages/ChainRulesCore/sHMAp/src/projection.jl:192
    (::ChainRulesCore.ProjectTo{T})(::ChainRulesCore.Tangent{<:T}) where T at ~/.julia/packages/ChainRulesCore/sHMAp/src/projection.jl:142
    ...

for the test related to the previously discussed Zygote gradients issue. It seems like the problem is in ChainRulesCore?

(I also see a many-times-repeated error when running the tests at test/gradient.jl:216, but that is probably a matter for another issue.)

@mkitti
Collaborator

mkitti commented Dec 14, 2021

Have you tried this on 1.6? There are known inference issues on 1.7.

@mcabbott
Contributor

The example from above reproduces this error.

julia> using Interpolations, Zygote
julia> y = sin.(1:10);
julia> itp = interpolate(y,BSpline(Cubic(Reflect(OnCell()))));

julia> z, back = Zygote.pullback(itp, 2.0);

julia> z
0.769963450028415

julia> back(1.0)
([-0.35017548837401463],)

julia> Zygote.gradient(itp, 2.0);
ERROR: MethodError: no method matching (::ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}})(::StaticArrays.SVector{1, Float64})

(jl_hY83L3) pkg> st
Status `/private/var/folders/yq/4p2zwd614y59gszh7y9ypyhh0000gn/T/jl_hY83L3/Project.toml`
  [a98d9a8b] Interpolations v0.13.4
  [e88e6eb3] Zygote v0.6.32

The error is coming from ChainRulesCore, but only because it's performing a sanity check on the gradient returned by the rule. Above, back(1.0) avoids this check, and shows that the rule produces a 1-element vector as the gradient of a number, which doesn't make sense.

Where the bug is, or whether this used to work, I don't know.

@mcabbott
Contributor

Oh, the problem is just that there has not been a release since #465. On master this works.

@jmsull

jmsull commented Dec 14, 2021

Just switching to master worked for me - thanks for pointing that out and for the explanation!

@mkitti
Collaborator

mkitti commented Dec 14, 2021

I just released a new version, 0.13.5

@DhairyaLGandhi

I'd love to have the gradient with respect to the original array as well, and at that point I believe we could close this. Is there any interest in putting together a PR to that end?

@stevengj
Member

stevengj commented Nov 1, 2022

@DhairyaLGandhi, me too, but implementing that seems quite a bit harder, since it has to be done separately for each type of interpolation. One simple case that might be a good starting point would be to support linear gridded interpolation.
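
To see why the linear gridded case is simple, here is a hand-written sketch (a hypothetical helper, not part of the package) of a 1-d linear interpolant together with its pullback with respect to the data: the value is (1 - t) * ys[i] + t * ys[i+1], so the cotangent scatters into just two entries.

function linear_itp_withgrad(xs::AbstractVector, ys::AbstractVector, x::Real)
    # locate the knot interval containing x (clamped for extrapolation)
    i = clamp(searchsortedlast(xs, x), 1, length(xs) - 1)
    t = (x - xs[i]) / (xs[i+1] - xs[i])
    val = (1 - t) * ys[i] + t * ys[i+1]
    function pullback_data(ȳ)   # gradient of val w.r.t. ys
        ∂ys = zero(ys)
        ∂ys[i]   = (1 - t) * ȳ
        ∂ys[i+1] = t * ȳ
        return ∂ys
    end
    return val, pullback_data
end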

@rs1909

rs1909 commented Nov 16, 2022

@DhairyaLGandhi @stevengj I would also love to have the gradient with respect to the data being interpolated, especially for cubic splines, because I need smoothness... The problem is that I have no clue where to start implementing it. Does the interpolation involve solving equations, or are there some constant weights pre-computed? Also, there is very little online about how cubic spline interpolation works in 2D or 3D. Any pointers to resources would be helpful. Is there any other package that could do this?

@rs1909

rs1909 commented Nov 17, 2022

@DhairyaLGandhi @stevengj :
Gradient with respect to the image is, after all, not that complicated. The interpolation depends linearly on the image, so this gradient is in fact independent of the image.

Internally, the gradient is calculated by the weightedindexes function and then multiplied with the image data. This has a sparse format, because the interpolation result depends only locally on the image. This is the same internal API throughout the various interpolations. The only task is to convert the tuple of 'WeightedAdjIndex' to a digestible format, which I can do.

I would be happy if 'weightedindexes' were exported as an API; then I could use it as the gradient. Could someone make that happen?
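
To illustrate the linearity argument (inefficient, O(N) per evaluation point, but it shows the principle): since itp(x) is linear in the data y, interpolating unit vectors yields the gradient components directly; the sparse weights from weightedindexes would give the same numbers without the loop. Hypothetical helper name:

using Interpolations

function grad_wrt_data(y::AbstractVector, x::Real)
    g = zeros(float(eltype(y)), length(y))
    for i in eachindex(y)
        e = zeros(float(eltype(y)), length(y))
        e[i] = 1
        itp = interpolate(e, BSpline(Cubic(Reflect(OnCell()))))
        g[i] = itp(x)   # = ∂ itp_y(x) / ∂ y[i], by linearity
    end
    return g
end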
