-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactoring tests to log io behavior #31
base: main
Are you sure you want to change the base?
Conversation
I think this PR is ready to be reviewed, still work in progress. h5coro needs upstream changes to work with s3 links that don't require a login. We probably need to create a pip installable package called The main notebooks are the ones with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@betolink I received an error in the 3rd codeblock:
results = xarray_original.run(io_params)
ZeroDivisionError: division by zero
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran it on a fresh clone and didn't get that error, perhaps we do some screen sharing to see what's going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see, there is a bug, fixing it now....
notebooks/fsspec-logs.ipynb
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@betolink I received an error in the last plotting block:
for name, group in df.groupby(['tool', 'dataset', 'format']):
tool, dataset, formated = name
x = f'{tool}, {dataset}, {formated}'
y = group['time'].mean()
ax.bar(f'{tool}, {dataset}, {formated}', group['time'].mean(), label=f'{tool}, {dataset}, {formated}', align='center')
ax.text(x, y + 0.05, f'{group["time"].mean():.2f}', ha='center', va='bottom', color='black', fontsize=8)
KeyError: 'tool'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this notebook was just prototyping some stuff, the ones that we want to plot the results are the ones named "portable-"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@betolink We should add a warning to this notebook to explain that it's not functioning due to the anonymous access issue, and/or comment all the code blocks that produce an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, or in the meantime we can use the same granules just from the CryoCloud bucket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- One small suggestion is to add a short title + description to the top of this notebook to explain its usage.
- I also got a divide by zero error here in the 3rd code block:
results = h5py_original.run(io_params)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the h5py test notebook:
- One small suggestion is to add a short title + description to the top of this notebook to explain its usage.
- I also got a divide by zero error here in the 3rd code block:
results = h5py_original.run(io_params)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't able to run many of the notebooks end to end (tested in CryoCloud). Once those errors are resolved plus suggested updates to include brief title and descriptions for the new notebooks, then it looks good on my end.
Also for our collaborators, do we need any additional documentation for them to utilize the notebooks (i.e. any breaking changes for the other CO formats that others would need to be aware of)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these file names/paths still being used, or could we remove this file since the links were updated in links.json so it's effectively in the git history?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a second storage location (different hub) for the same set of files? Or are the files different?
Saves results of experiments to results
Copied plotting from portable-full-comparison
Add plotting for end to end running
so that canonical plotting is in plot_benchmark_results.ipynb
This PR is to integrate code that logs fsspec network operations and tweak the I/O libraries (hypy, fsspec).
The fsspec-logs notebook should be a stand-alone test when ready (90% there)