MyData and non-harvested datasets in harvested dataverses #11083

Open
plecor opened this issue Dec 11, 2024 · 8 comments
Labels
Type: Bug a defect

Comments

@plecor
Contributor

plecor commented Dec 11, 2024

The MyData API implicitly assumes that all datasets in a dataverse linked to a harvesting client are harvested datasets. This produces incomplete results.

Steps to reproduce

  • create a dataverse and publish it
  • make this dataverse a harvesting client and harvest some datasets
  • create a user with 'Dataset creator' rights in this dataverse
  • create a dataset with this user in this dataverse
  • go to 'My Data'
  • select only the role 'Dataset Creator'

Tested on 5.13, 6.0 and develop (6.4)

Results
The request to load the data fails with a NullPointerException and results are never shown:

api/v1/mydata/retrieve?selected_page=1&dvobject_types=Dataverse&dvobject_types=Dataset&published_states=Published&published_states=Unpublished&published_states=Draft&published_states=In%20Review&published_states=Deaccessioned&role_ids=5&&mydata_search_term=
java.lang.NullPointerException: Cannot invoke "java.util.List.isEmpty()" because "results" is null
at edu.harvard.iq.dataverse.mydata.MyDataFinder.runStep2DirectAssignments(MyDataFinder.java:500)
at edu.harvard.iq.dataverse.mydata.MyDataFinder.runFindDataSteps(MyDataFinder.java:177)
at edu.harvard.iq.dataverse.mydata.DataRetrieverAPI.retrieveMyDataAsJsonString(DataRetrieverAPI.java:347)
...
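
For anyone who wants to replay the failing request outside the UI, here is a minimal Java sketch that rebuilds the URL shown above. The query parameters are copied from that request; the class name, the method names, and the `http://localhost:8080` server URL are placeholders I made up, and authentication is left out entirely.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of Dataverse: it only assembles the request URL.
public class MyDataRequestSketch {

    // serverUrl is a placeholder supplied by the caller; roleId 5 is the value
    // seen in the failing request (the id may differ between installations).
    static URI buildRetrieveUri(String serverUrl, int roleId) {
        StringJoiner query = new StringJoiner("&");
        query.add("selected_page=1");
        for (String type : List.of("Dataverse", "Dataset")) {
            query.add("dvobject_types=" + encode(type));
        }
        for (String state : List.of("Published", "Unpublished", "Draft",
                                    "In Review", "Deaccessioned")) {
            query.add("published_states=" + encode(state));
        }
        query.add("role_ids=" + roleId);
        query.add("mydata_search_term=");
        return URI.create(serverUrl + "/api/v1/mydata/retrieve?" + query);
    }

    // URLEncoder produces '+' for spaces; the original request uses %20.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(buildRetrieveUri("http://localhost:8080", 5));
    }
}
```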

The underlying cause of the NPE is that all checks regarding the absence of results are made before filtering against harvested dataverses.
If the user has other datasets in another dataverse, the request doesn't fail but the dataset in the harvesting dataverse is not listed.
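
To illustrate the ordering problem, here is a self-contained Java sketch. It is not the `MyDataFinder` code: every class and method name is hypothetical, and `lookup` only imitates a query that returns null when it has nothing to search. It shows why checking for "no results" before removing harvested dataverses leaves a null unguarded, and how checking after the filter avoids the NPE. It does not address the second problem (the missing dataset).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of the failure mode; none of this is Dataverse code.
public class HarvestFilterSketch {

    // Assumption: stands in for a query that yields null for an empty input.
    static List<Long> lookup(List<Long> dataverseIds) {
        if (dataverseIds.isEmpty()) {
            return null;
        }
        List<Long> datasetIds = new ArrayList<>();
        for (Long id : dataverseIds) {
            datasetIds.add(id * 100); // fake dataset ids, for illustration only
        }
        return datasetIds;
    }

    static List<Long> removeHarvested(List<Long> ids, Set<Long> harvestedIds) {
        List<Long> kept = new ArrayList<>();
        for (Long id : ids) {
            if (!harvestedIds.contains(id)) {
                kept.add(id);
            }
        }
        return kept;
    }

    // Fragile ordering: the emptiness check runs before the harvest filter.
    static List<Long> fragile(List<Long> dataverseIds, Set<Long> harvestedIds) {
        if (dataverseIds.isEmpty()) {
            return List.of();
        }
        List<Long> results = lookup(removeHarvested(dataverseIds, harvestedIds));
        return results.isEmpty() ? List.of() : results; // NPE when results is null
    }

    // Guarded ordering: filter first, then check what is left.
    static List<Long> guarded(List<Long> dataverseIds, Set<Long> harvestedIds) {
        List<Long> remaining = removeHarvested(dataverseIds, harvestedIds);
        if (remaining.isEmpty()) {
            return List.of();
        }
        List<Long> results = lookup(remaining);
        return results == null ? List.of() : results;
    }

    public static void main(String[] args) {
        // The user's only dataverse (id 1) is linked to a harvesting client.
        System.out.println(guarded(List.of(1L), Set.of(1L)));
    }
}
```

With a single dataverse that is also a harvesting client, `fragile` throws and `guarded` returns an empty list, which matches the first symptom described above.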

Are you thinking about creating a pull request for this issue?
I can look into a PR to fix the NPE.
Adapting MyData to list the missing datasets feels like a bigger undertaking.

@plecor plecor added the Type: Bug a defect label Dec 11, 2024
@pdurbin
Member

pdurbin commented Dec 11, 2024

@plecor if you can fix the NPE, please do! Thanks for creating this issue!

plecor added a commit to plecor/dataverse that referenced this issue Dec 11, 2024
@jggautier
Contributor

Do y'all think we could also mention this bug in the page on the guides?

The MyData API is used for the MyData page, too, right? So these datasets also won't show up on the user's MyData page? If we're not sure, I can try to test if this is the case on Demo Dataverse if you think that'll help.

@plecor
Contributor Author

plecor commented Dec 11, 2024

> The MyData API is used for the MyData page, too, right? So these datasets also won't show up on the user's MyData page? If we're not sure, I can try to test if this is the case on Demo Dataverse if you think that'll help.

Indeed, this affects the MyData page and the datasets don't show up there.

@jggautier
Contributor

Thanks @plecor!

Since the bug won't be fixed "soon", or possibly ever, and since we know of some folks who use or at least have expressed interest in the MyData API, like @shlake and @mohhsen67, I'd advocate for at least mentioning the bug in the guides. I or someone else could open a GitHub issue for that specifically, but I'll wait to hear what you all think.

Since this affects some folks who use the MyData page, I would also advocate for noting the bug somehow on that page. But that UI change would be a bigger undertaking, compared to a documentation change, and I think it'll be less likely to be done.

@pdurbin
Member

pdurbin commented Dec 11, 2024

Sounds good to me. @plecor in your #11086 PR can you please edit the docs to mention the bug? Or let me know and I can.

@jggautier
Contributor

jggautier commented Dec 11, 2024

Ah okay. So I won't open a GitHub issue about editing the docs to mention the bug. Thanks

plecor added a commit to plecor/dataverse that referenced this issue Dec 12, 2024
@plecor
Contributor Author

plecor commented Dec 12, 2024

I updated the PR to mention the issue in https://guides.dataverse.org/en/6.4/api/native-api.html#mydata

@pdurbin
Member

pdurbin commented Dec 12, 2024

plecor added a commit to plecor/dataverse that referenced this issue Dec 12, 2024