Implement Dropdown for Dataset Tagging in Metadata #10743

Closed
Saixel opened this issue Aug 6, 2024 · 7 comments
Labels: FY25 Sprint 5; FY25 Sprint 6; FY25 Sprint 7 (2024-09-25 - 2024-10-09); FY25 Sprint 8 (2024-10-09 - 2024-10-23); FY25 Sprint 9 (2024-10-23 - 2024-11-06); FY25 Sprint 10 (2024-11-06 - 2024-11-20); NIH CAFE (issues related to and/or funded by the NIH CAFE project); Size: 0.5 (a percentage of a sprint, 0.35 hours); Status: Needs Input (applied to issues in need of input from someone currently unavailable); Type: Feature (a feature request)

Comments

@Saixel
Contributor

Saixel commented Aug 6, 2024

Background

In the context of managing datasets in Dataverse, it is crucial to differentiate and filter datasets that contain data, code, or a combination of both. Currently, there is no clear mechanism for tagging these content types, which hampers effective organization and search within the platform.

Feature Request

Implement a dropdown menu for tagging in the metadata, possibly in the citation field. This new metadata field will allow users to indicate whether the dataset contains data, code, or a combination of both.

Justification

This change will significantly improve the organization and searchability of datasets within Dataverse. By allowing clear differentiation between datasets containing data, code, and combinations of both, users can more easily find the resources they need, enhancing the platform's efficiency and usability.

Implementation Considerations

  • Review the appropriate metadata block to integrate this new tagging field.
  • Implement a dropdown menu to select between "Data," "Code," and "Data + Code" options.
  • Ensure the new field is easily accessible and usable for searching and filtering datasets.
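
The considerations above can be sketched in a few lines of Python. This is only an illustration of the idea (a fixed vocabulary of three options, validated on input and usable as a facet), not Dataverse code: `ContentType`, `DatasetRecord`, and the sample titles are hypothetical names made up for this example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ContentType(Enum):
    """The three dropdown options named in this request."""
    DATA = "Data"
    CODE = "Code"
    DATA_AND_CODE = "Data + Code"


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical stand-in for a dataset's metadata (not a Dataverse class)."""
    title: str
    content_type: ContentType


def parse_content_type(label: str) -> ContentType:
    """Accept only the labels offered by the dropdown; reject anything else."""
    try:
        return ContentType(label)
    except ValueError:
        allowed = ", ".join(option.value for option in ContentType)
        raise ValueError(f"{label!r} is not one of: {allowed}") from None


def facet_counts(records):
    """Count datasets per option, the way a search facet would list them."""
    return Counter(record.content_type.value for record in records)


def filter_by(records, content_type):
    """Return only the datasets tagged with the selected option."""
    return [r for r in records if r.content_type is content_type]


# Made-up sample records, purely for illustration.
records = [
    DatasetRecord("Example tables", parse_content_type("Data")),
    DatasetRecord("Example scripts", parse_content_type("Code")),
    DatasetRecord("Example replication package", parse_content_type("Data + Code")),
    DatasetRecord("Example readings", parse_content_type("Data")),
]
```

A closed vocabulary is what makes the field useful as a facet: every dataset falls into exactly one of three buckets, so counts and filters stay consistent without free-text cleanup.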

Additional Context

This request arises from the need to improve dataset management and organization in Dataverse, facilitating the differentiation and searchability of datasets based on their content. This change is particularly relevant for projects handling large volumes of data and code, such as the CAFE project.

Saixel added the Type: Feature (a feature request) and NIH CAFE (issues related to and/or funded by the NIH CAFE project) labels Aug 6, 2024
Saixel self-assigned this Aug 6, 2024
Saixel moved this to SPRINT- NEEDS SIZING in IQSS Dataverse Project Aug 6, 2024
Saixel changed the title from "Implement Dropdown for Dataset Tagging as Data, Code, or Data + Code in Metadata" to "Implement Dropdown for Dataset Tagging in Metadata" Aug 6, 2024
Saixel added the Size: 10 (a percentage of a sprint, 7 hours) label Aug 28, 2024
Saixel moved this from SPRINT- NEEDS SIZING to SPRINT READY in IQSS Dataverse Project Aug 28, 2024
Saixel added the FY25 Sprint 5 label Aug 28, 2024
@qqmyers
Member

qqmyers commented Sep 3, 2024

Is this resolved by #10694 ?

@sekmiller
Contributor

Not really. The core dataset types added in #10694 are "Dataset", "Workflow", and "Software", though you can add your own. Also, the type can only be set when you create a dataset via the API.

@qqmyers
Copy link
Member

qqmyers commented Sep 3, 2024

Fair enough - #10694 is only the first of several PRs, but we should check whether the underlying dataset type idea works for, or can work for, this use case, so that we avoid creating a similar mechanism.

@pdurbin
Member

pdurbin commented Sep 3, 2024

In tech hours today we talked about the relationship between this issue and PR #10694.

For my part, once #10694 becomes available on Harvard Dataverse, the CAFE team is welcome to use it. For now, you have to create datasets via API to set datasetType=software.
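
For illustration, here is a minimal Python sketch of what such an API call might look like. It only builds the request and does not send it. The placement of `datasetType` as a top-level key next to `datasetVersion` is an assumption to verify against the documentation that ships with #10694; the server URL, collection alias, and token are placeholders, and `build_create_dataset_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_create_dataset_request(server_url, collection_alias, api_token,
                                 dataset_version, dataset_type="software"):
    """Build (but do not send) a dataset-creation request that sets a dataset type."""
    payload = {
        # Assumption: the type sits at the top level, next to "datasetVersion".
        "datasetType": dataset_type,
        "datasetVersion": dataset_version,
    }
    url = f"{server_url.rstrip('/')}/api/dataverses/{collection_alias}/datasets"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Dataverse-key": api_token,
        },
        method="POST",
    )


request = build_create_dataset_request(
    "https://dataverse.example.org",   # placeholder server
    "my-collection",                   # placeholder collection alias
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",  # placeholder API token
    # Placeholder body: a real dataset needs its citation metadata here.
    dataset_version={"metadataBlocks": {}},
)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`, once the placeholder values and metadata are replaced with real ones.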

If only a facet is needed, a quick solution could be to add a dropdown to a custom metadata block to allow the user to choose between the three options explained above: "Data," "Code," and "Data + Code". Perhaps this has been the plan all along. I'm not sure. 😅

Saixel added the Size: 3 (a percentage of a sprint, 2.1 hours) label and removed the Size: 10 (a percentage of a sprint, 7 hours) label Sep 11, 2024
cmbz added the FY25 Sprint 6 label Sep 11, 2024
Saixel added the Status: Needs Input (applied to issues in need of input from someone currently unavailable) label Sep 13, 2024
cmbz added the FY25 Sprint 7 (2024-09-25 - 2024-10-09) label Sep 25, 2024
pdurbin removed the Status: Needs Input (applied to issues in need of input from someone currently unavailable) label Oct 7, 2024
@Saixel
Contributor Author

Saixel commented Oct 7, 2024

As you mentioned @pdurbin, that is just what we did!

image

@sbarbosadataverse

sbarbosadataverse commented Oct 8, 2024

Hi @Saixel, are you aware of the "tagging" feature that allows depositors to tag the type of file they are depositing? Here is an image of what it looks like in demo and production. The tags are searchable facets, as in the second, very well tagged dataset in this image.

It looks very similar to what is being proposed above, and you can name a tag anything you need. By default we have "code", "documentation", and "data" in the production settings. You can add any other file type tags you need.

Screen Shot 2024-10-08 at 3 45 57 PM

Here is a more extensive example, used a little differently, in social science data:
Screen Shot 2024-10-08 at 3 45 11 PM

cmbz added the FY25 Sprint 8 (2024-10-09 - 2024-10-23) label Oct 9, 2024
Saixel added the Status: Needs Input (applied to issues in need of input from someone currently unavailable) and Size: 0.5 (a percentage of a sprint, 0.35 hours) labels and removed the Size: 3 (a percentage of a sprint, 2.1 hours) label Oct 23, 2024
cmbz added the FY25 Sprint 9 (2024-10-23 - 2024-11-06) label Oct 23, 2024
cmbz added the FY25 Sprint 10 (2024-11-06 - 2024-11-20) label Nov 7, 2024
@Saixel
Contributor Author

Saixel commented Nov 21, 2024

After reviewing the existing tagging feature and confirming with the team, we determined that it fulfills the objectives of this request. Therefore, creating a new metadata field is unnecessary, and we will proceed with utilizing the existing functionality. Marking this as resolved.

Saixel closed this as completed Nov 21, 2024
github-project-automation (bot) moved this from In Progress 💻 to Done 🧹 in IQSS Dataverse Project Nov 21, 2024