Dropout layer #194

Merged

milancurcic merged 34 commits into modern-fortran:main from dropout on Feb 21, 2025

Conversation

@milancurcic (Member) commented on Jan 23, 2025

  • Dropout layer type and methods
  • Enable forward pass
  • Enable backward pass
  • Test
  • Example (in examples/dense_mnist.f90)

Closes #170.
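For context, a minimal sketch of how the new layer might appear in the MNIST example mentioned in the checklist above. The `network`, `input`, `dense`, `sgd`, `load_mnist`, and `label_digits` names follow the existing neural-fortran API; the `dropout(0.2)` constructor, its placement, and the rate are illustrative assumptions, not necessarily the interface merged here.

```fortran
! Illustrative sketch only; not the actual examples/dense_mnist.f90.
program dense_mnist_dropout_sketch
  use nf, only: network, input, dense, dropout, sgd, load_mnist, label_digits
  implicit none

  type(network) :: net
  real, allocatable :: training_images(:,:), training_labels(:)
  real, allocatable :: validation_images(:,:), validation_labels(:)

  call load_mnist(training_images, training_labels, &
                  validation_images, validation_labels)

  ! A dense classifier with dropout between the hidden and output layers;
  ! the 0.2 rate and the placement are arbitrary choices for illustration.
  net = network([ &
    input(784), &
    dense(64), &
    dropout(0.2), &
    dense(10) &
  ])

  call net % train( &
    training_images, label_digits(training_labels), &
    batch_size=128, epochs=5, optimizer=sgd(learning_rate=3.) &
  )
end program dense_mnist_dropout_sketch
```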

@milancurcic added the enhancement (New feature or request) label on Jan 23, 2025
@ricor07 (Collaborator) commented on Jan 24, 2025

Hello Milan. About one out of 25 times when running test_dropout_layer, this occurs:
[screenshot of the failing test output]

If you can't fix it today, I'll work on it tomorrow. Have a good afternoon.

Edit: I adjusted it in #195.

There is still a case in which all the units turn off; this occurs with a probability of dropout_rate^n_units.
In that case the output value is NaN. In my opinion, we should add a check that sets the output to 0 in this case. Let me know what you think.
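For illustration, a minimal standalone sketch of the kind of guard being proposed, assuming an inverted-dropout style mask and rescaling; the names (`mask`, `scale`, `output`) are hypothetical and don't refer to the layer's actual components. When every unit is dropped, the rescaling divides by zero, so the output is forced to zero instead.

```fortran
program dropout_nan_guard_sketch
  ! Hypothetical standalone sketch; mask, scale, and output are
  ! illustrative names, not the layer's actual components.
  implicit none
  integer, parameter :: n = 5
  real, parameter :: dropout_rate = 0.5
  real :: input(n), r(n), mask(n), output(n), scale

  input = 1.
  call random_number(r)
  mask = merge(0., 1., r < dropout_rate)   ! 0 = dropped, 1 = kept

  if (count(mask > 0) == 0) then
    ! All units dropped: the rescaling below would divide by zero
    ! (NaN/Inf), so set the output to zero instead, as suggested.
    output = 0
  else
    scale = real(n) / count(mask > 0)      ! inverted-dropout rescaling
    output = input * mask * scale
  end if

  print *, output
end program dropout_nan_guard_sketch
```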

@milancurcic mentioned this pull request on Jan 27, 2025
@milancurcic (Member, Author)

You're right, currently the effective dropout rate is not exactly the specified dropout_rate, because in this (naive) implementation we simply initialize the mask with a uniform random distribution and set it to 0 if it falls below dropout_rate and to 1 otherwise. One way to handle it is to rescale the sample from random_number such that dropout_rate becomes the percentile of the sample.
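A hedged standalone sketch of the two strategies described above (not the code in this PR): naive thresholding only matches dropout_rate on average, while dropping exactly the lowest floor(dropout_rate * n) sample values makes the realized drop fraction match the requested rate on every call.

```fortran
program dropout_mask_sketch
  ! Standalone sketch of the two mask strategies, not the code in this PR.
  implicit none
  integer, parameter :: n = 10
  real, parameter :: dropout_rate = 0.3
  real :: r(n), mask_naive(n), mask_percentile(n)
  integer :: k, i

  call random_number(r)

  ! Naive: threshold the uniform sample; the realized drop fraction
  ! only equals dropout_rate on average.
  mask_naive = merge(0., 1., r < dropout_rate)

  ! Percentile-based: drop exactly the k smallest sample values, so the
  ! realized drop fraction equals the requested rate on every call.
  k = int(dropout_rate * n)
  mask_percentile = 1.
  do i = 1, k
    mask_percentile(minloc(r, dim=1, mask=mask_percentile > 0)) = 0.
  end do

  print '(a, f5.2)', 'realized drop fraction (naive):      ', &
    count(mask_naive == 0.) / real(n)
  print '(a, f5.2)', 'realized drop fraction (percentile): ', &
    count(mask_percentile == 0.) / real(n)
end program dropout_mask_sketch
```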

Regarding the test failing about 1 in 25 times: notice that each test run makes 10000 dropout runs, so the effective failure rate is about 10^-5 per dropout run. I think it's a precision issue (i.e., the tolerance is too small in the test) rather than the scaling of the outputs. However, you may still be correct that there's an issue with scaling the outputs, as you suggested in #195; I just don't think it's related to the test failure.

@ricor07 (Collaborator) commented on Jan 27, 2025

I think we should handle the calculation of the output values and even redefine the dropout, if needed. If you want to have a meeting to discuss this, I'm here; just answer me within 3 hours, please.

@milancurcic (Member, Author)

Hi, I just looked at my code (I hadn't looked earlier when I wrote here), and I see a very serious and obvious bug in calculating the scale, which your approach fixes. No proof needed :). I'll write in the other PR about the process of opening a PR against this one, rather than a new one.
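For context only (this is not the fix applied here or in #195): the property the scale is meant to preserve is that the expected activation is unchanged, which is why inverted dropout rescales the kept units by 1/(1 - dropout_rate). A quick Monte Carlo check of that property:

```fortran
program inverted_dropout_scale_check
  ! Illustrative only; not the fix applied in this PR.
  implicit none
  real, parameter :: dropout_rate = 0.4
  integer, parameter :: n_trials = 100000
  real :: r, kept, mean_output
  integer :: i

  mean_output = 0.
  do i = 1, n_trials
    call random_number(r)
    kept = merge(0., 1., r < dropout_rate)
    ! For a unit input, scale the kept activation by 1/(1 - dropout_rate).
    mean_output = mean_output + kept / (1. - dropout_rate)
  end do
  mean_output = mean_output / n_trials

  ! Should print a value close to 1, i.e. the expected output equals the input.
  print '(a, f6.3)', 'mean output for unit input: ', mean_output
end program inverted_dropout_scale_check
```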

@ricor07 (Collaborator) commented on Feb 7, 2025

Hello, haven't you merged this yet? In the next few days I'll be working on the issue I told you about the other time (sorry, looking up the number is tedious). Which neural-fortran version should I use?

@milancurcic (Member, Author)

Hi Ricardo, I applied your fix in 59cc7e1.

This PR is getting closer to complete but is still WIP. I need to figure out how to best enable/disable dropout in training vs. inference mode without requiring the user to change state on the layer under the hood, which would not be very user-friendly. Stay tuned.
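One possible shape of such a mechanism, as a hypothetical sketch (the type, component, and procedure names below are illustrative, not the merged API): the layer carries a training flag that the network, rather than the user, toggles under the hood, so the forward pass only applies the mask during training.

```fortran
! Hypothetical sketch; names are illustrative and not the API merged here.
module dropout_mode_sketch
  implicit none

  type :: dropout_layer_sketch
    real :: rate = 0.5
    ! Toggled by the network during train vs. predict, not by the user.
    logical :: training = .true.
  contains
    procedure :: forward
  end type dropout_layer_sketch

contains

  subroutine forward(self, input, output)
    class(dropout_layer_sketch), intent(in) :: self
    real, intent(in) :: input(:)
    real, intent(out) :: output(size(input))
    real :: r(size(input))

    if (self % training) then
      call random_number(r)
      ! Inverted dropout: drop a fraction of inputs and rescale the rest.
      output = merge(0., input / (1. - self % rate), r < self % rate)
    else
      ! Inference: the layer is a no-op.
      output = input
    end if
  end subroutine forward

end module dropout_mode_sketch
```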

@OneAdder mentioned this pull request on Feb 9, 2025
@milancurcic (Member, Author)

There is an outstanding bug: dropout(0.) should have no effect; however, it does.
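The expected behavior, as a standalone illustration (not the project's test_dropout_layer): with a rate of 0, no element of the random sample falls below the threshold and the 1/(1 - rate) rescaling is exactly 1, so the forward pass should be the identity.

```fortran
program dropout_rate_zero_check
  ! Standalone sketch of the expected dropout(0.) behavior;
  ! not the project's test_dropout_layer.
  implicit none
  integer, parameter :: n = 100
  real, parameter :: rate = 0.
  real :: x(n), r(n), y(n)

  call random_number(x)
  call random_number(r)

  ! With rate = 0 no element satisfies r < rate, so nothing is dropped
  ! and the 1/(1 - rate) rescaling is exactly 1.
  y = merge(0., x / (1. - rate), r < rate)

  if (all(y == x)) then
    print *, 'dropout(0.) is a no-op, as expected'
  else
    print *, 'dropout(0.) changed the output'
  end if
end program dropout_rate_zero_check
```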

@milancurcic requested a review from jvdp1 on February 20, 2025 at 20:21
@milancurcic (Member, Author)

@jvdp1 I believe this is now done; we can review together tomorrow.

@jvdp1 (Collaborator) left a comment

LGTM. Here are a few comments. I also opened a PR here.

@milancurcic (Member, Author)

Thank you all! Will merge.

@milancurcic merged commit 039638d into modern-fortran:main on Feb 21, 2025
4 checks passed
@milancurcic deleted the dropout branch on February 21, 2025 at 17:38
Labels: enhancement (New feature or request)
Projects: None yet
Development: Successfully merging this pull request may close these issues: Implement dropout layer
3 participants