Add activation tests for GPU layers #1472

Merged · 22 commits into master · Mar 24, 2021
Conversation

DhairyaLGandhi
Member

This is in response to #1350: we should be routinely testing our layers with multiple activation functions to catch these errors early.
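For illustration, here is a minimal sketch of what such a test loop could look like; the layer, array sizes, and activation list are assumptions for the example, and the PR's actual helper is gpu_gradtest rather than this code.

using Flux, CUDA, Test

# Activation list is illustrative; a handful of common activations.
acts = [identity, relu, tanh, sigmoid, elu, selu]

@testset "ConvTranspose on GPU with $act" for act in acts
    m = ConvTranspose((3, 3), 3 => 4, act) |> gpu
    x = CUDA.rand(Float32, 28, 28, 3, 1)
    @test m(x) isa CuArray                             # forward pass runs on the GPU
    gs = gradient(() -> sum(m(x)), Flux.params(m))
    @test all(p -> gs[p] !== nothing, Flux.params(m))  # a gradient exists for every parameter
end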

@DhairyaLGandhi
Member Author

Why are so many RNN tests marked broken, @CarloLucibello? Was that introduced in #1367?

@CarloLucibello
Member

CarloLucibello commented Jan 22, 2021

Was that introduced in #1367?

Is this a question? If so, I don't think it was; #1367 fixed many issues and actually unbroke one test.

I don't remember when the tests were marked as broken; you could try git blame.

@DhairyaLGandhi
Member Author

Well, this PR does surface some disagreements between the CPU and GPU implementations, which shouldn't be there.

#1367 was also a major overhaul, and to the best of my knowledge it isn't accurate in its claim of beating cuDNN. With @denizyuret's cuDNN PR in CUDA, we would be best served by benchmarking the two and using the cuDNN API properly.
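As a rough sketch of what such a benchmark could look like (sizes are arbitrary, and only the current Flux path is shown since the cuDNN-backed API is not settled here):

using Flux, CUDA, BenchmarkTools

rnn = RNN(128, 128) |> gpu          # current Flux implementation
x   = CUDA.rand(Float32, 128, 16)   # features × batch

@btime CUDA.@sync $rnn($x)          # sync so the GPU kernel time is actually measured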

@DhairyaLGandhi
Member Author

Test:

@ModelZookeeper commands

@DhairyaLGandhi
Member Author

Here are the commands: build feed commands

@darsnack
Member

darsnack commented Feb 1, 2021

Does the bot only respond to the person who asked? I don't see any comments from the bot at all.

@DhairyaLGandhi
Member Author

DhairyaLGandhi commented Feb 1, 2021

The one I emoji'd on (ref #1472 (comment)) was made by the bot. I have to generate some new tokens so it's not using my account for any of this.

@darsnack
Member

darsnack commented Feb 1, 2021

Ah cool!!

@DhairyaLGandhi
Member Author

Test 2:

@ModelZookeeper commands

@ModelZookeeper

Here are the commands: build feed commands

@DhairyaLGandhi linked an issue Mar 23, 2021 that may be closed by this pull request
@DhairyaLGandhi merged commit 5c40716 into master Mar 24, 2021
@CarloLucibello
Member

@DhairyaLGandhi I wanted to review this; could you ask for reviews before merging PRs?


batch_norm = [BatchNorm]
gpu_gradtest("BatchNorm 1 with $act", batch_norm, rand(Float32, 28,28,3,4), 3, act, test_cpu = false) #TODO fix errors
gpu_gradtest("BatchNorm 2 with $act", batch_norm, rand(Float32, 5,4), 5, act, test_cpu = false)
Member

test_cpu used to be true; why are they all false now? I think that checking that the results are the same on CPU and GPU is very important.

Member Author

This PR doesn't change the src.

There's a comment explaining this.

Member

How does setting test_cpu=false, when it used to be true, not change things?
I don't know which comment you're referring to.
Also, why are there some #TODO fix errors comments?
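For context, a test_cpu = true run presumably amounts to a check along these lines; this is a hypothetical sketch with assumed names and tolerance, not the actual gpu_gradtest internals.

using Flux, CUDA, Test

m_cpu = BatchNorm(3)
x_cpu = rand(Float32, 28, 28, 3, 4)
m_gpu, x_gpu = m_cpu |> gpu, x_cpu |> gpu

@test m_cpu(x_cpu) ≈ cpu(m_gpu(x_gpu)) atol=1f-3   # forward pass agreement

g_cpu = gradient(() -> sum(m_cpu(x_cpu)), Flux.params(m_cpu))
g_gpu = gradient(() -> sum(m_gpu(x_gpu)), Flux.params(m_gpu))
for (p_c, p_g) in zip(Flux.params(m_cpu), Flux.params(m_gpu))
    @test g_cpu[p_c] ≈ cpu(g_gpu[p_g]) atol=1f-3   # gradient agreement
end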

# m_cpu(x_cpu)
# gradient(() -> sum(m_cpu(x_cpu)), Flux.params(m_cpu))
# m_gpu(x_gpu)
# gradient(() -> sum(m_gpu(x_gpu)), Flux.params(m_gpu))
@CarloLucibello
Member

CarloLucibello commented Mar 24, 2021

Why have these been commented out?

@DhairyaLGandhi
Member Author

I have been waiting for any objections on this for a while, actually. I'm pretty sure I wanted this in either way.

gpu_gradtest("BatchNorm 2 with $act", batch_norm, rand(Float32, 5,4), 5, act, test_cpu = false)

instancenorm = [InstanceNorm]
gpu_gradtest("InstanceNorm with $act", instancenorm, r, 1, act, test_cpu = false)
Member

Some of these layers used to be checked in both testmode and trainmode; that's no longer true.

Member Author

But those are covered by the layer-specific tests in normalisation.jl. If you want to add them back here, I can do a follow-on; it would be easier to review that separately anyway.

Member

This should be tested here as well, since we currently have bugs like #1542 that are GPU-only.
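A hedged sketch of what exercising a layer in both modes on the GPU could look like (layer and sizes are illustrative, not this PR's code):

using Flux, CUDA, Test

m = InstanceNorm(3) |> gpu
x = CUDA.rand(Float32, 28, 28, 3, 4)

Flux.trainmode!(m)
y_train = m(x)   # training-mode behaviour

Flux.testmode!(m)
y_test = m(x)    # inference-mode behaviour

@test y_train isa CuArray && y_test isa CuArray
@test size(y_train) == size(y_test) == size(x)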

@CarloLucibello
Member

CarloLucibello commented Mar 24, 2021

I have been waiting for any objections on this for a while, actually. I'm pretty sure I wanted this in either way.

There is a button in the top right corner where you can request a review, or you can just ping someone, as everyone does all the time.

I'm pretty sure we don't want to remove a lot of test coverage as this PR does.

@CarloLucibello
Member

Since you have a lot of stuff lying around incomplete, it's hard to tell what is ready for review, so please do ask explicitly for feedback.

@DhairyaLGandhi deleted the dg/acttests branch May 13, 2021 11:08
Successfully merging this pull request may close these issues.

ConvTranspose on GPU fails with certain activation functions