Miscellaneous fixes for MobileNet #179
Conversation
This is another step towards #176, as all the MobileNet models now expose …
@@ -9,7 +9,7 @@ Creates a ConvMixer model.

   - `planes`: number of planes in the output of each block
   - `depth`: number of layers
-  - `inchannels`: number of channels in the input
+  - `inchannels`: The number of channels in the input. The default value is 3.
Is it necessary to specify this? It's already in the docstring above, and it seems inconsistent with the other bullets (either re-specify all + capitalize all or leave it as is).
I'm working on making this happen for all the docstrings in the new PRs (that is, having the default values in there). I need to push a larger documentation change sometime in the next few weeks, and that should probably cover everything.
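As a rough illustration of the convention under discussion (the signature, defaults, and bullet wording here are assumed for the sketch, not copied from Metalhead.jl), a docstring that re-states defaults in its argument bullets might look like:

```julia
"""
    ConvMixer(planes, depth; inchannels = 3, nclasses = 1000)

Creates a ConvMixer model.

# Arguments

  - `planes`: number of planes in the output of each block
  - `depth`: number of layers
  - `inchannels`: number of channels in the input (default: 3)
  - `nclasses`: number of output classes (default: 1000)
"""
function ConvMixer(planes, depth; inchannels = 3, nclasses = 1000)
    # construction of the actual model is elided in this sketch
end
```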
src/convnets/mobilenet.jl (outdated)
-            Dense(inchannels, fcsize, activation),
-            Dense(fcsize, nclasses)))
+            Dense(inchannels, nclasses)))
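For context, a minimal sketch of what this diff changes in the classifier head (the layer widths and the surrounding `Chain` are assumed for illustration, not taken verbatim from Metalhead.jl):

```julia
using Flux

inchannels, fcsize, nclasses = 1024, 1024, 1000  # assumed sizes, for illustration only

# Before: an extra fully connected layer of width `fcsize` sat between the
# pooled features and the classification layer.
old_head = Chain(GlobalMeanPool(), Flux.flatten,
                 Dense(inchannels, fcsize, relu),
                 Dense(fcsize, nclasses))

# After: a single fully connected layer, matching the head described in the
# MobileNetv1 paper.
new_head = Chain(GlobalMeanPool(), Flux.flatten,
                 Dense(inchannels, nclasses))

x = rand(Float32, 7, 7, inchannels, 1)  # dummy feature map
size(new_head(x))                        # (1000, 1)
```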
This is technically a bug fix but also slightly breaking. Maybe we can make a breaking release just to err on the side of caution? Deciding what to do for structural changes to the model here is tricky. cc @ToucheSir for opinions
I feel like we are going to be making a huge breaking release anyway sometime after my GSoC work lands, so this just gets added to the list.
Do/should we provide any guarantees about model structure across releases? I believe the only people affected would be those who've already trained models, and thus far I've not heard of any for MobileNet.
Or (user) code that relies on the structure but doesn't use Functors.jl. I think previously we decided to consider structural changes as breaking. But the more nuanced question here is whether a structural change that fixes an error is considered a bug fix.
I think probably here no one would complain given how recent the model definitions are.
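To make that point concrete, here is a toy sketch (not Metalhead.jl's actual layer layout) of user code that hard-codes the head's structure and would therefore be affected by removing a layer:

```julia
using Flux

# Toy stand-in for a two-Dense-layer classifier head like the old MobileNetv1 one.
head = Chain(Dense(1024, 1024, relu), Dense(1024, 1000))

# Downstream code that assumes exactly two Dense layers, e.g. for transfer learning:
features = head[end - 1]                        # "the penultimate layer"
finetune_head = Chain(head[1], Dense(1024, 10)) # keep the first Dense, swap the classifier

# If the head becomes a single Dense layer, `head[end - 1]` no longer refers to the
# same thing (or errors), so this code breaks without any change on the user's side.
```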
If the slightly breaking changes hold this up too much, it would be cool to factor the padding change out into a separate commit so that I can use it.
Yes, this is the main reason I brought it up. I want to make a patch release with the recent changes before merging the breaking overhaul.
Force-pushed from 272a6f1 to a2e2a3d.
The breaking change has been undone for now. The rest, I think, should fit into the patch release without a problem.
Can you rebase with the test changes from #171?
Co-authored-by: Kyle Daruwalla <[email protected]>
Force-pushed from 4bb065d to 5f22ee5.
...wait. How did the CI go from one check passing to all checks passing?
Something to do with #171 disabling tests for some of the largest models?
Oh, possibly 🤦🏽 I feel like this is just 50-50 at this point as to whether the tests pass or not (the funny thing is, one test in #171 actually failed 😂). Although I do hope they keep passing...
Looks like the failure was a timeout instead of an OOM, so hopefully it was an anomaly!
Should be ready for merge and release now 😄
Thanks!
Closes #178. Also removes an extra fully connected layer in MobileNetv1, making it conform to the configuration described in the paper. Some minor docs and formatting changes I caught have also made it through.