NVIDIA datasets broken/unavailable #16

Open
blue-orc opened this issue Apr 9, 2020 · 3 comments

Comments

blue-orc commented Apr 9, 2020

I'm trying to run the benchmarks under the NVIDIA folder, but I'm running into a lot of issues acquiring and setting up the datasets.

The COCO dataset links here are all broken: https://github.com/mlperf/training_results_v0.6/blob/master/NVIDIA/benchmarks/maskrcnn/implementations/download_dataset.sh

I was able to download the dataset from http://cocodataset.org/ but I'm not sure where to get the weights file.

Also, the ImageNet dataset for the resnet benchmark is unavailable for direct download. I was able to acquire the dataset, but ran into issues when running the actual training test. The error happened at line 163 of this file: https://github.com/mlperf/training_results_v0.6/blob/master/NVIDIA/benchmarks/resnet/implementations/mxnet/train_imagenet.py#L163

I didn't copy the error, but it reported a problem with file mapping. My guess is that my dataset isn't laid out exactly as the script expects, because I had to piece it together myself.

Is there any updated way to acquire the exact datasets and ensure they are consistent with the published run results?

blue-orc commented Apr 13, 2020

Also, the directory structure of the minigo bucket has changed: https://console.cloud.google.com/storage/browser/minigo-pub/ml_perf/?pli=1

The provided configuration expects that the checkpoint can be found at ml_perf/checkpoint/9, whereas /ml_perf/0.6/checkpoint seems to be the correct location.
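
As a workaround, here is a minimal sketch that tolerates either layout. It assumes the bucket contents have been mirrored to a local directory. The function name `find_checkpoint_dir` and the `bucket_root` parameter are hypothetical and not part of the MLPerf code. It tries the path the configuration expects first, then falls back to the versioned location.

```python
from pathlib import Path
from typing import Optional

# Candidate checkpoint locations, relative to a local mirror of the bucket.
# The first is what the provided configuration expects; the second is where
# the checkpoint seems to live after the bucket was reorganized.
CANDIDATES = ("ml_perf/checkpoint/9", "ml_perf/0.6/checkpoint")


def find_checkpoint_dir(bucket_root: str) -> Optional[Path]:
    """Return the first candidate directory that exists, or None."""
    root = Path(bucket_root)
    for rel in CANDIDATES:
        path = root / rel
        if path.is_dir():
            return path
    return None
```

Calling `find_checkpoint_dir("/data/minigo-pub")` (an example path) returns whichever directory is present, so the configuration can be pointed at the result instead of a hard-coded location.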

dumaaan commented Jul 9, 2020

I second this issue. I tried to run the maskrcnn implementation, but I couldn't get the weights file from anywhere.

dumaaan commented Jul 9, 2020

I solved this by updating the link in the download_weights.sh file.

Try using https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl instead of the link in the original script.
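
If editing the shell script is inconvenient, the same download can be sketched in Python. This is only an illustration: the helper names and the `dest_dir` parameter are assumptions, and the directory the maskrcnn benchmark actually expects the weights in is not stated in this thread, so adjust it to match your setup.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# Replacement URL suggested in this thread.
WEIGHTS_URL = "https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl"


def weights_filename(url: str) -> str:
    """Extract the file name from the last path component of the URL."""
    return Path(urlparse(url).path).name


def download_weights(url: str = WEIGHTS_URL, dest_dir: str = ".") -> Path:
    """Download the weights into dest_dir (an assumed location) and return the path.

    An existing file is reused as-is. The download goes to a temporary
    ".part" file first, so an interrupted transfer is not mistaken for a
    complete one on the next run.
    """
    dest = Path(dest_dir) / weights_filename(url)
    if dest.exists():
        return dest
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as resp, open(tmp, "wb") as out:
        shutil.copyfileobj(resp, out)
    tmp.replace(dest)
    return dest
```

Running `download_weights(dest_dir="weights")` would fetch `R-50.pkl` into a `weights` directory (an example name) on the first call and return the existing file on later calls.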
