Simplify Dataverse Content Provider to only operate on datasets #7

Closed

pdurbin commented Dec 16, 2024

When the Dataverse content provider was added in jupyterhub#739 it had the flexibility to operate directly on Dataverse files like this:

repo2docker https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/6ZXAGT/3YRRYJ

However, being able to operate only on datasets (files are stored in datasets in Dataverse) is enough. That is, this will still work:

repo2docker doi:10.7910/DVN/TJCLKP

And that's all we need.

This simplification builds upon the work in jupyterhub#1388, which stopped retrieving the content of the dataset landing page when resolving the dataset's DOI. Instead, only the redirect location is fetched, which is all the Dataverse content provider needs to determine which of the 100+ installations of Dataverse hosts the DOI.
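To illustrate the idea of fetching only the redirect location, here is a minimal sketch. It is not the code from jupyterhub#1388. The function names, the use of `doi.org` as the resolver, and the use of a `HEAD` request are all assumptions made for illustration.

```python
import http.client
from urllib.parse import quote, urlsplit


def doi_path(doi: str) -> str:
    """Turn 'doi:10.7910/DVN/TJCLKP' into the path to request from a DOI resolver."""
    bare = doi[len("doi:"):] if doi.lower().startswith("doi:") else doi
    return "/" + quote(bare, safe="/")


def redirect_location(doi: str, resolver: str = "doi.org", timeout: float = 10.0):
    """Ask the resolver where the DOI points, without following the redirect.

    http.client never follows redirects on its own, so the landing page
    behind the Location header is not requested.
    Assumption: the resolver answers HEAD requests with a 3xx response.
    """
    conn = http.client.HTTPSConnection(resolver, timeout=timeout)
    try:
        conn.request("HEAD", doi_path(doi))
        resp = conn.getresponse()
        if 300 <= resp.status < 400:
            return resp.getheader("Location")
        return None
    finally:
        conn.close()


def host_of(location: str):
    """Extract the host name of the installation from a redirect location."""
    return urlsplit(location).hostname
```

Under these assumptions, `host_of(redirect_location("doi:10.7910/DVN/TJCLKP"))` would identify the installation without ever loading the landing page.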

This change should be a no-op for any installation of Dataverse with Binder integration enabled.

Harvard Dataverse (one of the 100+ installations) specifically is not working with Binder because a firewall is blocking https://dataverse.harvard.edu/citation.
With the simplification in this commit, the Dataverse content provider no longer needs to follow /citation to determine what is on the other side (dataset.xhtml, file.xhtml, etc.). It assumes that the DOI is always for a dataset (not a file), which is the expectation we have always set for the Binder tool.
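As a rough sketch of what "not following /citation" could look like, the redirect location can be parsed as a plain string and the DOI treated as a dataset identifier. This is not the code in this PR. The function name is hypothetical, and the sketch assumes the redirect location carries the DOI in a `persistentId` query parameter.

```python
from urllib.parse import parse_qs, urlsplit


def dataset_from_location(location: str):
    """Return (installation base URL, persistent ID) from a redirect location.

    No request is made to the location itself, so a blocked /citation
    endpoint does not matter here.
    Assumption: the persistent ID always refers to a dataset, not a file.
    """
    parts = urlsplit(location)
    ids = parse_qs(parts.query).get("persistentId", [])
    if not parts.scheme or not parts.netloc or not ids:
        return None
    return f"{parts.scheme}://{parts.netloc}", ids[0]
```

A caller would pass in the `Location` header it already obtained from the DOI resolver. A result of `None` means the location does not look like a Dataverse citation URL.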

We are tracking Binder not working with Harvard Dataverse here: IQSS/dataverse.harvard.edu#328

@yuvipanda
Owner

Hmm, jupyterhub#739, the original PR introducing this, explicitly talks about using files. So I'd like us to not remove this functionality, as other installations may be relying on it :(

@pdurbin
Author

pdurbin commented Dec 16, 2024

@yuvipanda I sort of doubt it. The way we advertise Binder in the Dataverse documentation is as a dataset-level tool (scope = dataset): https://guides.dataverse.org/en/6.5/admin/external-tools.html#inventory-of-external-tools

In theory, someone could navigate to https://mybinder.org directly and enter a file-level DOI. But I suspect most people will reach Binder by way of a "Binder" button in Dataverse, that is, via a Binder "external tool" as described in the docs above.

@yuvipanda
Owner

Handled differently in jupyterhub#1390

yuvipanda closed this Dec 17, 2024