Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ImageNet #146
Add ImageNet #146
Changes from 20 commits
7852e48
20be9a6
59afe92
1f4dfaf
dfdeaa5
d2ded7e
4809296
cac14d2
1e850ba
2aa0170
06ad214
3097302
02e966d
9fb811c
4e0e8d4
0daca90
09feb3d
df14fea
c92ae00
8637ebe
944bd83
fe38d43
6af86c6
95b13d9
09d5be4
ae1929d
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would drop the
pytorch
prefix/suffix and use something else. The comments can stay. If PyTorch isn't the only library that does this preprocessing, then it makes sense to represent that with more general names. If different libraries are providing different preprocessing functionality for ImageNet (or not providing any), then I'd argue there is no canonical default set of ImageNet transformations and this code (aside from maybe the descriptive stats) shouldn't be in MLDatasets.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Since this is just an internal function used by
default_preprocess
, I would suggest either_normalize
ordefault_normalize
. The appeal of using these coefficients as defaults is that they should work out of the box with pre-trained vision models from Metalhead.jl.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, so do other libraries provide this functionality in their ImageNet dataset APIs? I checked https://www.tensorflow.org/datasets/catalog/imagenet2012 and it has no mention of preprocessing, so is PyTorch the only library that does this? If so, I would vote to remove the preprocessing functions as mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am not wrong, these normalization values depend on the model you are using. Also, none of the existing vision datasets have preprocessing functions. These functions are ideally handled by data preprocessing libraries/modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The norm values should not be model-specific. They're derived directly from the data before any model is involved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the pytorch case, notice however that although the transformations are stored in the "model weights", the mean and std is the same across models (see e.g. the mobilenet model).
In a similar spirit, I would definitely defend the decision of shipping the set of transformations (cropping, interpolation, linear transformation, etc) as part of the dataset. However I agree with the very first point that the name
transformation_pytorch
isn't really precise, although I think it is fair to link to the corresponding transformations for tensorflow, pytorch, and/or the timm library in a related comment.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyTorch also lumps code for pretrained models, data augmentations and datasets into one library, I don't think we need to follow their every example :)
This is precisely why I asked about what other libraries are doing. If nobody else is shipping the same set of transformations, then they can hardly be considered canonical for ImageNet. That doesn't mean we should never ship helpers to create common augmentation pipelines, but that it is better served by packages which have access to efficient augmentation libraries (e.g. Augmentor, DataAugmentation) and not by some unoptimized implementation which is simultaneously more general (because it's applicable to other datasets) and less general (because many papers using ImageNet do not use these augmentations) than the dataset it's been attached to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just apply the
channelview
and permute transformation by default here,and make the (permuted) mean and std values be part of the type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have also taken a look at Keras' ImageNet utilities. While these normalization constants are used in many places throughout torchvision and PyTorch, it looks like TensorFlow and Keras do indeed use their own constants.
I agree with @ToucheSir's sentiment
However, this point can be drawn even further, as nothing about ImageNet is truly canonical.
To give some examples (some of which have previously been discussed):
jpeg_decode
with apreferred_size
.Getting this merged
So make this dataloader as "unopinionated" as possible, we could just make it a very thin wrapper around
FileDataset
which just loads metadata. This would require the user to pass aloadfn
which handles the transformation from file path to array. Class ordering could be handled using asort_by_wnid=True
keyword argument and all new dependencies introduced in this PR could be removed (ImageCore, JpegTurbo and StackViews).Future work
However, I do strongly feel like some package in the wider Julia ML / Deep Learning ecosystem should export
loadfn
s that are usable with Metalhead's PyTorch models out of the box. @lorenzoh previously proposed adding such functionality to DataAugmentation.jl in FluxML/Metalhead.jl#117.Once this functionality is available somewhere,
ImageNet
's docstring in MLDataset should be updated to showcase this common use-case.Until this functionality exists, I would suggest adding a
"Home" => "Tutorials" => "ImageNet"
page to the MLDatasets docs which implements the current load function.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find. I was not expecting that
mode == "torch"
conditional.The difference here is that all three of those points can have a decent fallback without depending on external packages. Another argument is that more people will rely on these defaults than won't. I'm not sure augmentations pass that threshold.
I'm not saying users shouldn't be able to pass in a transformation function, but
identity
or some such seems a more defensible default. Indeed, the torchvisionImageNet
class does not do any additional transforms by default, so we'd be deviating from every other library if we stuck with this default centre crop.