
Connect flatten, conv2d, and maxpool2d layers in backward pass #142

Merged — 3 commits, Jun 22, 2023

Conversation

milancurcic
Member

cnn_mnist was previously converging to a low ~93% accuracy solution because the convolutional layers were disconnected in the backward pass, so only the output dense layer was being trained. This PR is a WIP that connects these layers. Now that they're connected, cnn_mnist doesn't converge, due to not-yet-uncovered issues in the backward passes of the conv2d and possibly maxpool2d layers as well.
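To illustrate why a disconnected backward pass trains only the last layer: each layer's backward method must return the gradient with respect to its input, and that gradient must be fed to the preceding layer. Here is a minimal Python/numpy sketch of that chaining (the class and method names are hypothetical, not neural-fortran's actual Fortran API):

```python
import numpy as np

class Flatten:
    """Flattens (batch, h, w) to (batch, h*w); backward is the inverse reshape."""
    def forward(self, x):
        self.shape = x.shape
        return x.reshape(x.shape[0], -1)
    def backward(self, grad):
        return grad.reshape(self.shape)

class Dense:
    """A bias-free dense layer: y = x @ w."""
    def __init__(self, n_in, n_out):
        rng = np.random.default_rng(0)
        self.w = rng.standard_normal((n_in, n_out)) * 0.1
    def forward(self, x):
        self.x = x
        return x @ self.w
    def backward(self, grad):
        self.dw = self.x.T @ grad  # parameter gradient (used by the optimizer)
        return grad @ self.w.T     # gradient passed upstream to the previous layer

layers = [Flatten(), Dense(4, 2)]
x = np.ones((1, 2, 2))
out = x
for layer in layers:
    out = layer.forward(out)

# The fix amounts to this chaining: each layer's backward receives the
# gradient returned by the layer after it. If the chain is broken before
# the convolutional layers, their parameters never receive gradients.
grad = np.ones_like(out)
for layer in reversed(layers):
    grad = layer.backward(grad)

# The gradient arriving at the input has the input's shape, confirming
# that every layer in between participated in the backward pass.
```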

@milancurcic milancurcic added the bug Something isn't working label Jun 15, 2023
@milancurcic milancurcic self-assigned this Jun 15, 2023
@milancurcic milancurcic mentioned this pull request Jun 21, 2023
@milancurcic milancurcic merged commit 6bbc28d into modern-fortran:main Jun 22, 2023
@milancurcic milancurcic deleted the fix-conv2d-backprop branch June 22, 2023 15:27
@certik
Contributor

certik commented Mar 21, 2024

Git bisect revealed that this PR dropped the training accuracy from >90% to ~10% (#145 (comment)). However, based on this PR's description, it seems the PR fixed some issues while others remain to be fixed.

Was there ever a time when the training fully worked?

@milancurcic
Member Author

Good question. There was a time when I thought it was converging (although not at the expected level of ~96% accuracy, but rather in the mid-80s, IIRC) because I wrote a poor test. However, I think 2-d CNN training never worked correctly; it's implemented and likely a bug or two away from working, but I haven't made fixing it a priority yet. We know that inference works because we can load a pre-trained Keras CNN and infer with high accuracy. So the bug(s) are somewhere in the backward pass of one or more of the conv2d, maxpool2d, flatten, and/or reshape layers.
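Of the suspect layers, maxpool2d is a common source of backward-pass bugs because its gradient must be routed only to the position that held the maximum in the forward pass. A hedged numpy sketch of that rule for a 2x2 pool (an illustration of the standard technique, not the Fortran implementation under suspicion here):

```python
import numpy as np

def maxpool2d_forward(x, k=2):
    """Non-overlapping k-by-k max pooling; also records argmax locations."""
    h, w = x.shape
    out = np.zeros((h // k, w // k))
    argmax = np.zeros((h // k, w // k, 2), dtype=int)
    for i in range(h // k):
        for j in range(w // k):
            window = x[i*k:(i+1)*k, j*k:(j+1)*k]
            idx = np.unravel_index(np.argmax(window), window.shape)
            argmax[i, j] = (i*k + idx[0], j*k + idx[1])
            out[i, j] = window[idx]
    return out, argmax

def maxpool2d_backward(grad, argmax, shape):
    """Routes each output gradient to the input cell that was the max."""
    dx = np.zeros(shape)
    for i in range(grad.shape[0]):
        for j in range(grad.shape[1]):
            r, c = argmax[i, j]
            dx[r, c] = grad[i, j]
    return dx

x = np.array([[1., 2.],
              [4., 3.]])
out, argmax = maxpool2d_forward(x)
dx = maxpool2d_backward(np.ones_like(out), argmax, x.shape)
# Only the max location (the 4 at row 1, column 0) receives gradient;
# the other three cells get zero.
```

A backward pass that smears the gradient over the whole window, or records the wrong argmax, would still produce correctly shaped gradients and so would not fail a shape check, which is one reason such bugs survive until a convergence test catches them.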
