Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop mocking dataverse contentprovider test #1389

Closed
wants to merge 3 commits into from

Conversation

yuvipanda
Copy link
Collaborator

@yuvipanda yuvipanda commented Dec 16, 2024

As I was debugging #1388, I realized that PR actually broke the dataverse provider, but the existing test was mocking so much that we didn't actually catch it!

IMO, since the point of contentproviders is to integrate with external content providers, they should be integration tests so we can catch issues with them more easily. Integration tests would have caught https://jupyter.zulipchat.com/#narrow/channel/103349-ask-anything/topic/Binder.20Dataverse.20error more cleanly than how it happened for example.

It also immediately showed me that some of the results we were using to test were not accurate, and I've updated the test fixtures to match what we actually expect them to return.

I will try to see if we can use a demo or test instance for the fetch test though, so we don't screw up download stats even more for the existing test doi we use.

As I was debugging jupyterhub#1388,
I realized that PR actually broke the dataverse provider, but the
existing test was mocking so much that we didn't actually catch it!

IMO, since the point of contentproviders is to integrate with
external content providers, they should be integration tests so we
can catch issues with them more easily. Integration tests would have
caught https://jupyter.zulipchat.com/#narrow/channel/103349-ask-anything/topic/Binder.20Dataverse.20error
more cleanly than how it happened for example.

This PR removes all mocks from the dataverse test, and we immediately
benefit - it shows us that the dataverse provider *only* actually
handles DOIs, and not direct URLs! So even though we technically
had tests earlier that showed our dataverse provider supporting direct
dataverse URLs, it simply was not true. So we actually catch the
failure.

I will try to see if we can use a demo or test instance for the
fetch test though, so we don't screw up download stats even more
for the existing test doi we use.
@yuvipanda
Copy link
Collaborator Author

@pdurbin is there a different DOI we can use for the 'fetch' test that doesn't screw up your stats even more? :)

from repo2docker.contentproviders import Dataverse

test_dv = Dataverse()
harvard_dv = next(_ for _ in test_dv.hosts if _["name"] == "Harvard Dataverse")
Copy link
Member

@manics manics Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining what this iterates through, and what we end up comparing in the test? Are the names Harvard Dataverse/CIMMYT Research Data considered immutable properties of the repository?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manics I'll do so, although I didn't add these - they were already there in the existing file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manics I've incorporated this into #1390, and simply removed using "host" this way completely.

@yuvipanda
Copy link
Collaborator Author

Closing in favor of #1390

@yuvipanda yuvipanda closed this Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants