Add activation tests for GPU layers #1472
Conversation
Why are so many RNN tests marked broken @CarloLucibello? Was that introduced in #1367?
Well, this PR does surface some disagreements between the CPU and GPU implementations, which shouldn't exist. #1367 was also a major overhaul and, to the best of my knowledge, isn't accurate in its claim of beating cuDNN. With @denizyuret's cuDNN PR in CUDA we would be best served by benchmarking the two and using the cuDNN API properly.
Test: @ModelZookeeper commands
Here are the commands: build feed commands
Does the bot only respond to the person who asked? I don't see any comments from the bot at all.
The one I emoji'd on (ref #1472 (comment)) was made by the bot. I have to generate some new tokens so it's not using my account for any of this.
Ah cool!!
Test 2: @ModelZookeeper commands
Here are the commands: build feed commands
@DhairyaLGandhi I wanted to review this; could you ask for reviews before merging PRs?
batch_norm = [BatchNorm]
gpu_gradtest("BatchNorm 1 with $act", batch_norm, rand(Float32, 28,28,3,4), 3, act, test_cpu = false) #TODO fix errors
gpu_gradtest("BatchNorm 2 with $act", batch_norm, rand(Float32, 5,4), 5, act, test_cpu = false)
test_cpu used to be `true`; why are they all `false` now? I think that checking that the results are the same on CPU and GPU is very important.
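For reference, the agreement check that `test_cpu = true` is meant to guard looks roughly like this. This is only a sketch with illustrative names (`check_cpu_gpu_agreement` is hypothetical), not the actual `gpu_gradtest` implementation:

```julia
using Flux, CUDA, Test

# Hypothetical helper: compare a layer's forward output and parameter
# gradients between the CPU and GPU code paths.
function check_cpu_gpu_agreement(m_cpu, x_cpu; atol = 1f-4)
    m_gpu = m_cpu |> gpu
    x_gpu = x_cpu |> gpu

    # Forward passes should agree.
    @test isapprox(m_cpu(x_cpu), collect(m_gpu(x_gpu)); atol = atol)

    # Parameter gradients should agree too.
    g_cpu = gradient(() -> sum(m_cpu(x_cpu)), Flux.params(m_cpu))
    g_gpu = gradient(() -> sum(m_gpu(x_gpu)), Flux.params(m_gpu))
    for (p_cpu, p_gpu) in zip(Flux.params(m_cpu), Flux.params(m_gpu))
        @test isapprox(g_cpu[p_cpu], collect(g_gpu[p_gpu]); atol = atol)
    end
end

check_cpu_gpu_agreement(BatchNorm(3), rand(Float32, 28, 28, 3, 4))
```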
This PR doesn't change the src.
There's a comment explaining this.
How does setting `test_cpu=false` when it used to be `true` not change things? I don't know which comment you are referring to. Also, there are some `#TODO fix errors` notes, why?
# m_cpu(x_cpu)
# gradient(() -> sum(m_cpu(x_cpu)), Flux.params(m_cpu))
# m_gpu(x_gpu)
# gradient(() -> sum(m_gpu(x_gpu)), Flux.params(m_gpu))
Why have these been commented out?
I have been waiting for any objections on this for a while actually. Pretty sure I wanted this in either way.
gpu_gradtest("BatchNorm 2 with $act", batch_norm, rand(Float32, 5,4), 5, act, test_cpu = false) | ||
|
||
instancenorm = [InstanceNorm] | ||
gpu_gradtest("InstanceNorm with $act", instancenorm, r, 1, act, test_cpu = false) |
Some of these layers used to be checked in both testmode and trainmode; that's no longer the case.
But those are tested in the layer-specific tests in normalisation.jl. If you want those added back, I can do that in a follow-on. It would be easier to review those separately anyway.
This should be tested here as well, since we currently have bugs like #1542 that are GPU-only.
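To illustrate what checking both modes on GPU could look like, here is a minimal sketch. The layer, shapes, and assertions are illustrative assumptions, not the existing normalisation.jl tests:

```julia
using Flux, CUDA, Test

# Run a normalisation layer on GPU in both trainmode and testmode
# and make sure both code paths produce sane output.
m = InstanceNorm(3) |> gpu
x = rand(Float32, 5, 5, 3, 2) |> gpu

Flux.trainmode!(m)   # training path: normalise with batch statistics
y_train = m(x)

Flux.testmode!(m)    # inference path
y_test = m(x)

@test size(y_train) == size(x)
@test size(y_test) == size(x)
@test !any(isnan, collect(y_train))
@test !any(isnan, collect(y_test))
```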
There is a button in the top-right corner where you can request a review, or you can just ping someone, as everyone does all the time. I'm pretty sure we don't want to remove as much test coverage as this PR does.
Since you have a lot of stuff lying around incomplete, it's hard to tell what is ready for review. So please do ask explicitly for feedback.
This is in response to #1350: we should be routinely testing our layers with multiple activation functions to catch these errors early.
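As a sketch of the idea (the real suite goes through the gpu_gradtest helper; the layer, shapes, and list of activations here are just illustrative):

```julia
using Flux, CUDA, Test

# Exercise a layer with several activation functions on GPU and make sure
# the forward pass has the right shape and the parameter gradients stay finite.
for act in (identity, relu, tanh, sigmoid, elu)
    m = Dense(10, 5, act) |> gpu
    x = rand(Float32, 10, 8) |> gpu

    y = m(x)
    @test size(y) == (5, 8)

    gs = gradient(() -> sum(m(x)), Flux.params(m))
    for p in Flux.params(m)
        @test all(isfinite, collect(gs[p]))
    end
end
```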