UnitRange calculations change types when using Zygote #557

Open
marekdedic opened this issue Mar 25, 2020 · 4 comments
Labels
bug Something isn't working

Comments

@marekdedic

Hi,
I've stumbled upon this really weird bug where I was getting invalid types in my code, but only if the piece of code in question was executed with Zygote...

I've simplified the example to this:

using Zygote;

function f(x)
    println(x);
    println(x.start);
    println(x .- x.start);
    return x .- x.start;
end

This produces the following outputs:

julia> f(5:5);
5:5
5
0:0

julia> f'(5:5);
5:5
5
[0]

As you can see, the first run gives the range 0:0, whereas the second gives the array [0]. I believe this is what's been breaking the rest of my code, which assumes x .- x.start would be a UnitRange.
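
A minimal sketch of how that assumption could be made explicit, using only Base Julia. The helper name `shift_to_zero` is hypothetical and not part of the original code; it checks the result type so that a type change fails where it happens rather than further downstream.

```julia
# Hypothetical helper (not from the original report): shift a range so that
# it starts at zero, and fail early if the result is no longer a unit range.
function shift_to_zero(x::AbstractUnitRange)
    y = x .- first(x)
    y isa AbstractUnitRange || error("expected a unit range, got $(typeof(y))")
    return y
end

shift_to_zero(5:7)  # 0:2 when run outside of Zygote
```

Calling such a helper from the differentiated code would turn a silent change from range to array into an immediate error at the offending line.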

@oxinabox oxinabox added the bug Something isn't working label Mar 25, 2020
@marekdedic
Author

Sorry, cut off the error message, my bad:

julia> f'(5:5);
5:5
5
[0]
ERROR: Output should be scalar; gradients are not defined for output [0]
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] sensitivity(::Array{Int64,1}) at /home/user/.julia/packages/Zygote/S1EoU/src/compiler/interface.jl:41
 [3] gradient(::Function, ::UnitRange{Int64}) at /home/user/.julia/packages/Zygote/S1EoU/src/compiler/interface.jl:45
 [4] (::Zygote.var"#40#41"{typeof(f)})(::UnitRange{Int64}) at /home/user/.julia/packages/Zygote/S1EoU/src/compiler/interface.jl:48
 [5] top-level scope at REPL[6]:1
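
For context, f' takes a gradient (the stack trace above goes through gradient), and a gradient is only defined for scalar outputs. A minimal sketch of two ways around that, assuming a recent Zygote and using a Float64 vector instead of a range to stay clear of the range-specific behaviour reported here. The function `g` is a hypothetical stand-in for `f`.

```julia
using Zygote

# Hypothetical stand-in for `f`: shift a vector by its first element.
g(x) = x .- x[1]

x = [5.0, 6.0]

# Option 1: reduce the output to a scalar before differentiating.
grad_sum, = gradient(x -> sum(g(x)), x)

# Option 2: keep the array output and pass a cotangent of matching shape
# to the pullback explicitly.
y, back = Zygote.pullback(g, x)
grad_pb, = back(ones(length(y)))
```

Both options compute the same quantity here, since summing the output corresponds to a cotangent of all ones.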

@mcabbott
Member

mcabbott commented Mar 28, 2020

A few examples from playing with this. All of the functions return a scalar, which gradient accepts. Is number 3 your complaint?

julia> showtype(x) = (println("+ ", typeof(x)); x);

julia> Zygote.@adjoint showtype(x) = showtype(x), dx -> (println("- ", typeof(dx)); (dx,))

julia> gradient(x -> sqrt(showtype(showtype(x)[1])), 5:5) # 1. makes an Array as gradient, but how could it not?
+ UnitRange{Int64}
+ Int64
- Float64
- Array{Float64,1}
([0.22360679774997896],)

julia> gradient(x -> sum(showtype(showtype(x) .- x.start)), 5:5) # 2. makes an Array on forward pass
+ UnitRange{Int64}
+ Array{Int64,1}
- FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}
- Array{Int64,1}
ERROR: MethodError: no method matching +(::NamedTuple{(:start, :stop),Tuple{Int64,Nothing}}, ::Array{Int64,1})

julia> gradient(x -> exp(sum(showtype(showtype(x) .- 10))), 5:5) # 3. ditto, but no error
+ UnitRange{Int64}
+ Array{Int64,1}
- FillArrays.Fill{Float64,1,Tuple{Base.OneTo{Int64}}}
- Array{Float64,1}
([0.006737946999085467],)

julia> gradient(x -> exp(sum(showtype(showtype(x) .+ 10))), 5:5) # 4. with + it stays a range
+ UnitRange{Int64}
+ UnitRange{Int64}
- FillArrays.Fill{Float64,1,Tuple{Base.OneTo{Int64}}}
- FillArrays.Fill{Float64,1,Tuple{Base.OneTo{Int64}}}
([3.2690173724721107e6],)

julia> showtype((5:5) .- 10) # like 3 but no gradient
+ UnitRange{Int64}
-5:-5

The difference between + and - points to https://github.com/FluxML/Zygote.jl/blob/master/src/lib/broadcast.jl#L68 and suggests the following awful hack:

julia> using Base.Broadcast: Broadcasted, AbstractArrayStyle, broadcasted, materialize
julia> using Zygote: Numeric, unbroadcast
julia> Zygote.@adjoint broadcasted(::typeof(-), x::Numeric, y::Numeric) = 
              broadcast(-, x, y), dz -> (nothing, unbroadcast(x, dz), -unbroadcast(y, dz))

julia> gradient(x -> exp(sum(showtype(showtype(x) .- 10))), 5:5) # 3 again
+ UnitRange{Int64}
+ UnitRange{Int64}
- FillArrays.Fill{Float64,1,Tuple{Base.OneTo{Int64}}}
- FillArrays.Fill{Float64,1,Tuple{Base.OneTo{Int64}}}
([0.006737946999085467],)

@marekdedic
Author

julia> showtype(x) = (println("+ ", typeof(x)); x);

julia> Zygote.@adjoint showtype(x) = showtype(x), dx -> (println("- ", typeof(dx)); (dx,))
julia> function f(x)
           showtype(x);
           showtype(x.start);
           showtype(x .- x.start);
           return x .- x.start;
       end
f (generic function with 1 method)

julia> f(5:5);
+ UnitRange{Int64}
+ Int64
+ UnitRange{Int64}

julia> f'(5:5);
+ UnitRange{Int64}
+ Int64
+ UnitRange{Int64}
ERROR: Output is an array, so the gradient is not defined. Perhaps you wanted jacobian.
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] sensitivity(y::UnitRange{Int64})
   @ Zygote ~/.julia/packages/Zygote/nyzjS/src/compiler/interface.jl:113
 [3] (::Zygote.var"#84#85"{typeof(f)})(x::UnitRange{Int64})
   @ Zygote ~/.julia/packages/Zygote/nyzjS/src/compiler/interface.jl:155
 [4] top-level scope
   @ REPL[9]:1

However, the output is not an array...

@ToucheSir
Member

This is what the error is saying:

julia> 0:4 isa AbstractArray
true

julia> UnitRange{Int} <: AbstractVector
true

That said, trying to pull back a range doesn't work either:

# back = pullback(f, 0:4)[2]

julia> back(0:4)
ERROR: MethodError: no method matching +(::Vector{Float64}, ::@NamedTuple{start::Int64, stop::Nothing})
The function `+` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...)
   @ Base operators.jl:596
  +(::ChainRulesCore.NoTangent, ::Any)
   @ ChainRulesCore ~/.julia/packages/ChainRulesCore/6Pucz/src/tangent_arithmetic.jl:59
  +(::Any, ::ChainRulesCore.NotImplemented)
   @ ChainRulesCore ~/.julia/packages/ChainRulesCore/6Pucz/src/tangent_arithmetic.jl:25
  ...

Stacktrace:
  [1] accum(x::Vector{Float64}, y::@NamedTuple{start::Int64, stop::Nothing})
    @ Zygote ~/.julia/packages/Zygote/nyzjS/src/lib/lib.jl:17
  [2] accum(::Vector{Float64}, ::@NamedTuple{start::Int64, stop::Nothing}, ::Nothing, ::Vararg{Nothing})
    @ Zygote ~/.julia/packages/Zygote/nyzjS/src/lib/lib.jl:22
  [3] NamedTuple
    @ ./boot.jl:732 [inlined]
  [4] map
    @ ~/.julia/packages/ChainRulesCore/6Pucz/src/tangent_types/structural_tangent.jl:134 [inlined]
  [5] wrap_chainrules_output
    @ ~/.julia/packages/Zygote/nyzjS/src/compiler/chainrules.jl:121 [inlined]
  [6] _project
    @ ~/.julia/packages/Zygote/nyzjS/src/compiler/chainrules.jl:190 [inlined]
  [7] back
    @ ~/.julia/packages/Zygote/nyzjS/src/lib/lib.jl:234 [inlined]
  [8] #2180#back
    @ ~/.julia/packages/ZygoteRules/M4xmc/src/adjoint.jl:72 [inlined]
  [9] f
    @ ./REPL[3]:5 [inlined]
 [10] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::UnitRange{Int64})
    @ Zygote ~/.julia/packages/Zygote/nyzjS/src/compiler/interface2.jl:0
 [11] (::Zygote.var"#78#79"{Zygote.Pullback{Tuple{…}, Tuple{…}}})(Δ::UnitRange{Int64})
    @ Zygote ~/.julia/packages/Zygote/nyzjS/src/compiler/interface.jl:91
 [12] top-level scope
    @ REPL[10]:1
Some type information was truncated. Use `show(err)` to see complete types.

julia> back((start=0, stop=4))
ERROR: MethodError: no method matching ndims(::@NamedTuple{start::Int64, stop::Int64})
The function `ndims` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  ndims(::Type{Union{}}, Any...)
   @ Base abstractarray.jl:276
  ndims(::Type{<:Ref})
   @ Base refpointer.jl:102
  ndims(::Type{<:AbstractChar})
   @ Base char.jl:197
  ...

Stacktrace:
 [1] unbroadcast(x::UnitRange{Int64}, x̄::@NamedTuple{start::Int64, stop::Int64})
   @ Zygote ~/.julia/packages/Zygote/nyzjS/src/lib/broadcast.jl:57
 [2] (::Zygote.var"#1207#1210"{UnitRange{Int64}, Int64})(Δ::@NamedTuple{start::Int64, stop::Int64})
   @ Zygote ~/.julia/packages/Zygote/nyzjS/src/lib/broadcast.jl:86
 [3] (::Zygote.var"#3804#back#1211"{Zygote.var"#1207#1210"{UnitRange{Int64}, Int64}})(Δ::@NamedTuple{start::Int64, stop::Int64})
   @ Zygote ~/.julia/packages/ZygoteRules/M4xmc/src/adjoint.jl:72
 [4] f
   @ ./REPL[3]:5 [inlined]
 [5] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::@NamedTuple{start::Int64, stop::Int64})
   @ Zygote ~/.julia/packages/Zygote/nyzjS/src/compiler/interface2.jl:0
 [6] (::Zygote.var"#78#79"{Zygote.Pullback{Tuple{…}, Tuple{…}}})(Δ::@NamedTuple{start::Int64, stop::Int64})
   @ Zygote ~/.julia/packages/Zygote/nyzjS/src/compiler/interface.jl:91
 [7] top-level scope
   @ REPL[11]:1
Some type information was truncated. Use `show(err)` to see complete types.

Also, I can no longer replicate the original type change:

julia> f'(5:5);
5:5
5
0:0
ERROR: Output is an array, so the gradient is not defined. Perhaps you wanted jacobian.
...

As such, I would recommend closing this issue and doing one of the following:

  • Surround your range manipulation code with one of the helpers documented at https://juliadiff.org/ChainRulesCore.jl/stable/api.html#Ignoring-gradients. Technically, Zygote should treat integers and arrays of integers as non-differentiable; it's just not smart enough to do that.
  • If you really want to differentiate with respect to a range and are OK with it being treated like an array of Ints, open a separate issue for any bugs you run into while doing that.
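
A minimal sketch of the first option, assuming ChainRulesCore is installed alongside Zygote and using its `ignore_derivatives` function in do-block form. The `loss` function and its arguments are hypothetical and only illustrate the pattern of keeping range arithmetic out of the differentiated path.

```julia
using Zygote
using ChainRulesCore: ignore_derivatives

# Hypothetical loss: `w` is the quantity being differentiated, while the
# range `r` only decides which entries of `w` are used.
function loss(w, r)
    # All range arithmetic happens inside the ignored block, so AD treats
    # `idx` as a constant and never differentiates through the range.
    idx = ignore_derivatives() do
        (r .- first(r)) .+ 1
    end
    return sum(w[idx])
end

w = [1.0, 2.0, 3.0, 4.0]
grad_w, = gradient(w -> loss(w, 5:7), w)
```

Only `w` receives a gradient in this sketch; the range is used purely for indexing.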

4 participants