Regularisation looks to slow down gradient function by factor 500 #2253

AndyAndDevid · 2023-05-04T13:32:33Z

I'm running a gradient with and without regularization on an AMD Ryzen 7 with an RTX 3060 GPU, Flux v0.13.15, CUDA v4.2.0, Julia v1.9.0-rc3 (April 26, 2023). Regardless of whether I use the GPU, adding regularization increases the time taken by a factor of between 400 (GPU) and 20000 (CPU).
In a more complex model, the gradient's processing time also seems to increase after some iterations.

Here is the example. Both Flux.gradient and Flux.withgradient show similar performance under the same conditions.

```
using Flux
using CUDA

function regGrad()
  ni=20
  no=4
  model = Chain(Dense(ni, 50), Dense(50, 8), Dense(8, no))
  model = model #|> gpu

  input = rand(ni) #|> gpu
  label = rand(no) #|> gpu

  pen_l2(x::AbstractArray) = sum(abs2, x) / 2

  for i in 1:10
    startTime = time_ns()
    grads = Flux.gradient(model) do m
      result = m(input)
      #penalty = sum(pen_l2, Flux.params(m))
      Flux.Losses.mse(result, label) #+ 0.42 * penalty
    end
    Dtime_grad = time_ns() - startTime
    println("without regularization: ", Dtime_grad/1000000)
  end

  for i in 1:10
    startTime = time_ns()
    loss, grads = Flux.withgradient(model) do m
      result = m(input)
      penalty = sum(pen_l2, Flux.params(m))
      Flux.Losses.mse(result, label) + 0.42 * penalty
    end
    Dtime_wgrad = time_ns() - startTime
    println("with regularization: ", Dtime_wgrad/1000000)
  end
end
```


Results in ms:
without regularization: 0.0762
without regularization: 0.0275
without regularization: 0.0287
without regularization: 0.0325
without regularization: 0.0252
without regularization: 0.0215
without regularization: 0.0254
without regularization: 0.029
without regularization: 0.0215
without regularization: 0.024
with regularization: 503.5599
with regularization: 513.9299
with regularization: 512.6677
with regularization: 517.6622
with regularization: 519.182
with regularization: 527.7295
with regularization: 513.0565
with regularization: 529.4732
with regularization: 541.2436
with regularization: 549.6674

Am I doing something wrong?

christiangnrd · 2023-05-04T13:47:34Z

Take a look at #2211 and #2040 to see if it's the same issue. #2211 has some troubleshooting steps you might want to follow.

Also, you should surround your code in "```" so that it gets displayed properly.

Like this:
```
println("Hello, World!")
```
will show up as:

println("Hello, World!")

AndyAndDevid · 2023-05-04T18:55:33Z

Yes, it definitely looks to be the same issue as [https://github.com//issues/2211] and [https://github.com//issues/2040].
Thanks! Hopefully it will be fixed soon!

darsnack · 2023-05-04T23:50:52Z

I have posted a workaround until the fix: #2040 (comment)
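
For reference, one way to compute the penalty without calling `Flux.params` inside the loss is to flatten the model with `Flux.destructure`. This is only a minimal sketch and not necessarily the workaround posted in #2040. `penalty_flat` is a hypothetical helper name, the model sizes are copied from the example above, and whether this is actually faster on a given Flux/Zygote version would need to be measured.

```
using Flux

# Hypothetical helper: L2 penalty over all trainable parameters,
# obtained as one flat vector instead of iterating over Flux.params(m).
function penalty_flat(m)
  flat, _ = Flux.destructure(m)   # flat vector of parameters, plus a rebuild function (unused here)
  return sum(abs2, flat) / 2
end

model = Chain(Dense(20, 50), Dense(50, 8), Dense(8, 4))
input = rand(Float32, 20)
label = rand(Float32, 4)

# Same loss as in the original post, with the penalty term swapped out.
loss, grads = Flux.withgradient(model) do m
  Flux.Losses.mse(m(input), label) + 0.42f0 * penalty_flat(m)
end
```

Note that the first call includes compilation time, so it is worth timing only the later iterations when comparing against the numbers above.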

AndyAndDevid closed this as completed May 4, 2023
