perf: avoid double function call in ReverseDiff value_and_gradient (#729)
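Context for the title: with a ReverseDiff backend, the gradient comes from a recorded tape, and the naive way to also return the primal value is to call `f` a second time. Below is a hedged sketch of the before/after pattern (hypothetical helper names, not DI's actual source); the primal can instead be read from the `DiffResult` that `ReverseDiff.gradient!` fills during the tape's forward pass.

```julia
using ReverseDiff, DiffResults

f(x) = sum(abs2, x)
x = ones(2)
tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, x))

# Before: the gradient comes from the tape, but the primal value
# costs an extra call to f.
function value_and_gradient_double(f, grad, tape, x)
    ReverseDiff.gradient!(grad, tape, x)
    return f(x), grad
end

# After: recover the primal from the DiffResult that gradient! fills,
# so f is only evaluated during the tape's forward pass.
function value_and_gradient_single(f, grad, tape, x)
    result = DiffResults.MutableDiffResult(zero(eltype(x)), (grad,))
    ReverseDiff.gradient!(result, tape, x)
    return DiffResults.value(result), grad
end
```

The thread below shows a case where extracting the value this way goes wrong with compiled tapes.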
Conversation
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #729      +/-   ##
==========================================
- Coverage   97.93%   97.92%   -0.01%
==========================================
  Files         122      122
  Lines        6386     6372      -14
==========================================
- Hits         6254     6240      -14
  Misses        132      132
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
So, in the current state of the PR (with `f(x) = sum(abs2, x)`, as defined below), the returned value is wrong:

```julia
julia> DifferentiationInterface.value_gradient_and_hessian!(
           f,
           zeros(2),
           zeros(2, 2),
           SecondOrder(AutoFiniteDiff(), AutoReverseDiff(; compile=true)),
           ones(2),
       )
(0.0, [2.0, 2.0], [2.0 0.0; 0.0 2.0])
```
Full session, before and after adding a log statement:

```julia
julia> using DifferentiationInterface, FiniteDiff, ReverseDiff

julia> f(x) = sum(abs2, x)  # test function, as in the comment below
f (generic function with 1 method)

julia> DifferentiationInterface.value_and_gradient!(
           f,
           zeros(2),
           AutoReverseDiff(; compile=true),
           ones(2),
       )
(2.0, [2.0, 2.0])

julia> DifferentiationInterface.value_gradient_and_hessian!(
           f,
           zeros(2),
           zeros(2, 2),
           SecondOrder(AutoFiniteDiff(), AutoReverseDiff(; compile=true)),
           ones(2),
       )
(0.0, [2.0, 2.0], [2.0 0.0; 0.0 2.0])

julia> # now we just add a log

julia> DifferentiationInterface.value_and_gradient!(
           f,
           zeros(2),
           AutoReverseDiff(; compile=true),
           ones(2),
       )
[ Info: I'm here
(2.0, [2.0, 2.0])

julia> DifferentiationInterface.value_gradient_and_hessian!(
           f,
           zeros(2),
           zeros(2, 2),
           SecondOrder(AutoFiniteDiff(), AutoReverseDiff(; compile=true)),
           ones(2),
       )
[ Info: I'm here
(2.0, [2.0, 2.0], [2.0 0.0; 0.0 2.0])
```
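As a control (my addition, not in the original transcript): per the comment further down, the wrong primal only appears with the compiled tape mode, so the same call with `compile=false` is a natural sanity check:

```julia
# Hypothetical control run: identical call, but with an uncompiled tape,
# which the comment below says does not exhibit the bug.
julia> DifferentiationInterface.value_gradient_and_hessian!(
           f,
           zeros(2),
           zeros(2, 2),
           SecondOrder(AutoFiniteDiff(), AutoReverseDiff(; compile=false)),
           ones(2),
       )
```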
I further boiled it down to whether I call `value_and_gradient!` directly or inside another function.

EDIT: a pure-ReverseDiff example is available at JuliaDiff/ReverseDiff.jl#269.

```julia
using DifferentiationInterface  # version from this branch
import DifferentiationInterface as DI
using ReverseDiff: ReverseDiff

backend = AutoReverseDiff(; compile=true)
f(x) = sum(abs2, x)
x = ones(2)
prep = prepare_gradient(f, backend, zero(x))

function value_and_gradient_nested!(f, grad, prep, backend, x)
    y, _ = value_and_gradient!(f, grad, prep, backend, x)
    return y, grad
end
```

```julia
julia> value_and_gradient!(f, zeros(2), prep, backend, x)
(2.0, [2.0, 2.0])

julia> value_and_gradient_nested!(f, zeros(2), prep, backend, x) # wrong
(0.0, [2.0, 2.0])
```

This behavior only happens with the compiled tape mode of ReverseDiff. I think it might be because the compiler struggles to figure out that ReverseDiff mutates the value inside the `DiffResult`.
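For reference, here is what a pure-ReverseDiff version of the nested pattern might look like (my sketch, assuming the same `f`; the actual reproducer is in JuliaDiff/ReverseDiff.jl#269):

```julia
using ReverseDiff, DiffResults

f(x) = sum(abs2, x)
x = ones(2)
ctape = ReverseDiff.compile(ReverseDiff.GradientTape(f, x))

# gradient! writes both the primal value and the gradient into the DiffResult.
# The question from the linked issue is whether the primal stays correct when
# this call is wrapped inside another function and the tape is compiled.
function nested_value_and_gradient(ctape, x)
    result = DiffResults.GradientResult(x)
    result = ReverseDiff.gradient!(result, ctape, x)
    return DiffResults.value(result), DiffResults.gradient(result)
end
```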
Personal musings: The behavior of ReverseDiff is very confusing, even on `ImmutableDiffResult`. I encountered a Heisenbug which seems to depend on the compilation path (it disappears when I add a `print` statement), where `value_and_gradient!` suddenly becomes incorrect when called inside `value_gradient_and_hessian!`. But only for a compiled tape. And the `:gradient` tests for `value_and_gradient!` still pass. What a mess.
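One concrete source of confusion with `ImmutableDiffResult` (my illustration, not from the thread): its update functions return a new result instead of mutating in place, so the caller must rebind:

```julia
using DiffResults
using StaticArrays: SVector

x = SVector(1.0, 1.0)
res = DiffResults.GradientResult(x)  # static storage yields an ImmutableDiffResult
res = DiffResults.value!(res, 2.0)   # returns a NEW result; rebinding is required
DiffResults.value(res)               # 2.0
```

Code that forgets the rebinding silently keeps the stale value, which is exactly the failure mode a compiled tape could mask or expose depending on how calls get inlined.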