Add "Relation Type" to related publication metadata fields to send DataCite related publication metadata #2778
commenting for reasons of tracking this thread and opinion, but jumping on board with DOI/DataCite relation types is something Dataverse really should do. the controlled vocab and use cases are there for the reuse. i'm sure there are UI issues downstream from implementing it though. |
In the article "Pre-Metadata Counseling: Putting the DataCite relationType Attribute into Action" (PDF), the team behind Illinois Data Bank write about using DataCite's relationType terms. They decide in their longer "overly honest" article about the repository's development that the terms are too difficult to use (hard to define, too much overlap) to expect depositors to apply them consistently, so their curators apply them after dataset publication and they use only six of the 25. This makes me think it would be a good idea to lessen the confusion when this is implemented by:
|
The six seem to be "Article, Code, Dataset, Presentation, Thesis, or Other" unless I misunderstand. This issue may be a candidate for adding a user story (as discussed in retrospective this afternoon) because I'm getting a little lost on who wants what and why. |
Those are the object types, Phil. The 6 relations are IsSupplementTo, IsCitedBy, IsNewVersionOf/IsPreviousVersionOf and the IsPartOf/HasPart pair. I think narrowing terms/relations is wise; it has always felt like the DataCite list was made to appear to cover edge cases, which leads to unnecessary confusion. I don't see much overlap in those 6 relations though someone who hasn't thought it through for a particular object might. I think enabling the addition of multiple relations in a single UI click by a depositor isn't solving an actual problem. Allowing a 1 click addition of 3 relations between Object A and Object B would be allowing a depositor to think they are confused by the terms and giving them a quick out from their confusion. And it adds pure noise on the resulting network graph. Instead, defining what Dataverse thinks the relationTypes it supplies mean and giving examples of how to use them would be a better solution to the problem of depositor confusion. |
Thanks @augustfly. Looks like the Scholix metadata schema (page 8-9) also limits the relationship types (IsReferencedBy, References, IsSupplementTo, IsSupplementedBy) and adds a catch-all (IsRelatedTo).
Ah, I didn't mean to suggest enabling the addition of multiple relations in a single UI click (writing "considering the overlap in terms" does make it seem like this is what I was getting at). I meant that as a depositor, I might want to say that dataset A is related to article B in two or more distinct ways (ex. the dataset supports findings in my article, and I also cite the dataset in my article). This is what the Illinois Data Bank folks decided to do: pg. 214 (PDF)
Is including these two, or multiple relationship types, in the metadata important enough for these knowledge graphs? Would it just be noise to include both isSupplementTo and isCitedBy? I agree that defining what Dataverse thinks the relationTypes it supplies mean and giving depositors examples of how to use them is a great way to go about it. Lastly, I've been working on mapping Dataverse metadata to DataCite 4.1 so that OAI-PMH harvesting can be done using DataCite (#4318). If we want relatedPublication metadata to be included in the DataCite xml, the DataCite schema requires relationType. Since this isn't something Dataverse lets depositors specify, I'm wondering if, until depositors are able to choose required relationType(s) in the UI, it's okay to make one relationType default. @juancorr, is this what https://edatos.consorciomadrono.es/ does, using isCitedBy for every related publication? If it's safe to choose a default now for related publications that weren't assigned relationship types when published, is it safe to simply make that relationship type the default choice of many choices (e.g. in a pulldown list in an adjusted dataset create/edit form)? |
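A minimal sketch of the "default relationType" idea raised above, assuming Python's standard `xml.etree.ElementTree`. The element and attribute names (`relatedIdentifiers`, `relatedIdentifier`, `relatedIdentifierType`, `relationType`) are DataCite's; the helper function, the example DOIs, and the choice of `IsCitedBy` as the fallback are assumptions for illustration only, not a decision made in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical fallback; the thread has not settled on a default.
DEFAULT_RELATION_TYPE = "IsCitedBy"


def related_identifier_element(identifier, identifier_type="DOI", relation_type=None):
    """Build one DataCite <relatedIdentifier> element.

    The DataCite schema requires relationType, so this sketch falls back
    to the assumed default when the depositor supplied none.
    """
    element = ET.Element(
        "relatedIdentifier",
        {
            "relatedIdentifierType": identifier_type,
            "relationType": relation_type or DEFAULT_RELATION_TYPE,
        },
    )
    element.text = identifier
    return element


# Made-up identifiers, used only to exercise the helper.
wrapper = ET.Element("relatedIdentifiers")
wrapper.append(related_identifier_element("10.1234/example-article"))
wrapper.append(
    related_identifier_element("10.1234/example-dataset", relation_type="IsSupplementTo")
)

xml_text = ET.tostring(wrapper, encoding="unicode")
print(xml_text)
```

With a fallback like this, related publications that were published without a relationship type would still produce schema-valid XML, while an explicit choice (once the UI offers one) overrides the default.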
I would love to see this, as our primary publication target is DataCite-based and mints new DOIs for new versions of publications. Without the relationType IsNewVersionOf/IsPreviousVersionOf, I can't think of a way to accurately describe this. Also, being fully DataCite-compatible would be great, as discussed in #4318. All the best |
Thanks @RightInTwo! When you write "publications", I'm assuming you mean published datasets. (Let me know if that's inaccurate :) So it will be helpful if depositors are able to say that the dataset they're depositing IsNewVersionOf/IsPreviousVersionOf another dataset. This also makes me realize that not all of the relationships, however many we choose, will be appropriate options for describing the relationship between two research objects. For example, there's no reason to give dataset depositors the option of saying that the dataset they're depositing IsNewVersionOf an article they've published. This work is being considered as part of Dataverse's current grant-funded commitment to publish data use and citation metrics following the Code of Practice for Research Data Usage standard (Make Data Count), since the Event Data service that's aggregating the metadata in order to generate citation counts needs the "relatedIdentifier" metadata, which Dataverse needs to send to DataCite when registering DOIs (or updating the metadata of already registered DOIs, #5144). Could we update the dataset metadata form to let depositors say how the research object is related to the dataset they're depositing, maybe using a new metadata field? That would mean a fifth field, maybe a dropdown menu, in the "Related Publication" compound field, and a second field for "Related Datasets" (which would become a compound field). Of the six relations that the Illinois Data Bank settled on, I'm proposing using: For "Related Publication":
For "Related Dataset":
(I'm not recommending that Dataverse display those phrases in the UI. Zenodo tries to clarify with longer phrases: |
@jggautier Any thoughts on I'm a little confused about how a dataset would cite another dataset, but that's likely because I haven't looked at the spec for a while. |
@jggautier yes, i am talking about published datasets. In regards to the relation types, DataCite defines a whole bunch of them:
They are described in the appendix of the version 4.1 specification. |
Thanks @RightInTwo. I mentioned 25 in an earlier comment; didn't realize so many more had been added since version 3.1! I hadn't considered the whole list yet, just wanted to get the conversation started (but I agree isCitedBy doesn't make sense to me right now). Maybe we could get more feedback from depositors and curators of different types of datasets. IsSourceOf makes me think of one dataset being derived from another. "Here's a dataset that was used to create this one, but it's not a new version or a part of that first dataset." |
@jggautier That's the exact usage of |
@jggautier what would happen in the case that i import metadata from another source that actually uses |
One idea would be to map the types we don't use to the types we do. (Would probably be helpful to consider the types different repositories use now.) So if someone imports metadata with a relationType of isReferencedBy, Dataverse changes that type to isCitedBy. This conversion would be published someplace so people importing metadata know. And later, if people importing metadata ask for changes to the types that Dataverse converts to another type, we can reconsider adding/changing types. |
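The mapping idea could be sketched roughly as follows. This is an illustration under stated assumptions, not how Dataverse imports metadata: the supported set is the six relations named earlier in the thread, the only conversion pair taken from the discussion is isReferencedBy to isCitedBy, and the function and table names are hypothetical.

```python
# Relation types this sketch treats as supported (the six named earlier in the thread).
SUPPORTED_TYPES = (
    "IsSupplementTo",
    "IsCitedBy",
    "IsNewVersionOf",
    "IsPreviousVersionOf",
    "IsPartOf",
    "HasPart",
)

# Case-insensitive lookup, since imported metadata may write "isCitedBy" or "IsCitedBy".
_CANONICAL = {name.lower(): name for name in SUPPORTED_TYPES}

# Hypothetical, publishable conversion table. Only this one pair comes from the thread.
_CONVERSIONS = {"isreferencedby": "IsCitedBy"}


def normalize_relation_type(imported):
    """Return a supported relation type for an imported value.

    Supported values pass through, known unsupported values are converted,
    and anything else raises instead of being dropped silently.
    """
    key = imported.strip().lower()
    if key in _CANONICAL:
        return _CANONICAL[key]
    if key in _CONVERSIONS:
        return _CONVERSIONS[key]
    raise ValueError(f"Unsupported relationType: {imported!r}")


print(normalize_relation_type("isReferencedBy"))
print(normalize_relation_type("HasPart"))
```

Raising on unknown values is a design choice of this sketch: it makes an unmapped type visible to the person importing, rather than losing the field without notice.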
@jggautier so the import would fail in that case? as long as it does not just ignore that field or fail silently, that sounds good |
Do we have any updates on this issue? Here is what DataCite recommends we do - https://support.datacite.org/docs/contributing-citations-and-references |
Hey @eugene-barsky. Thanks for pointing that page out!
There's more recent discussion about this issue in the broader GitHub issue at #8108, where @KellyStathis from DataCite offered advice about using relationTypes and pointed to https://support.datacite.org/docs/connecting-to-works, which now eventually leads to https://support.datacite.org/docs/contributing-citations-and-references.
Folks at Harvard that are part of the NIH's Generalist Repository Ecosystem Initiative (GREI) need this issue resolved, too. @KellyStathis, in email discussions in July with Harvard members of the GREI group, also pointed out https://support.datacite.org/docs/contributing-citations-and-references and made some general recommendations. And in meetings coming up this month, the GREI groups will be joined by one or more folks from DataCite who will be able to help with metadata questions like this one.
But I think no decision has been made and implemented partly because even the incredibly helpful guide at https://support.datacite.org/docs/contributing-citations-and-references leaves enough room for debate and the community hasn't found the time to build consensus.
I've been imagining that as the GREI work continues, the Harvard folks in the working groups, including me, can learn from more of the Dataverse community (something @qqmyers also recommended in a related pull request), and what we learn can inform the GREI work (and perhaps DataCite's recommendations).
@amberleahey and @philippconzett asked in a Google Groups thread last week if a Dataverse Metadata WG/IG meeting could be scheduled to discuss this. I'll also ping @mreekie, who is catching up on this and other related issues. |
Thanks so much, Julian. I will also be delighted to join the Dataverse
Metadata WG/IG meeting when it meets to discuss this.
E.
|
@jggautier Should we schedule a Metadata WG/IG meeting on Thursday October 6? I think we could discuss several related issues:
I'll be out of office in a couple of hours and until Tuesday, but if you and @qqmyers and others are available, I could join you after 1500 CEST. Thanks! |
I think that would be very useful. I would be happy to join.
M
… On 29 Sep 2022, at 06:35, Philipp Conzett wrote:
@jggautier Should we schedule a Metadata WG/IG meeting on Thursday October 6? I think we could discuss several related issues:
relationType (see this GitHub issue)
resourceType (see #5086)
rightsList (see #8512)
probably other issues
I'll be out of office in a couple of hours and until Tuesday, but if you and @qqmyers and others are available, I could join you after 1500 CEST. Thanks!
|
I'd second that idea of having a meeting! |
Hey @philippconzett. Sorry for the delay in replying. For a few reasons I've been hesitant to agree that a meeting should be scheduled, but I'd be happy to help promote one for this Thursday, Oct 6. 2pm UTC (time converter). I made a Google Doc at https://docs.google.com/document/d/1tNnvVh8jYY1g53BEwpJmMmm9w6Vgy_Q7RrmFjGnYOyA for note taking. Looks like we need a new Slack link. The one in the calendar invite doesn't work. Was probably Danny Brooke's. |
It is too early for me on the West Coast and I will rely on @amberleahey
to attend :)
E.
|
Count me in, let's build some consensus on DataCite mappings and put it to a community vote! |
Hi! I might be unable to attend due to a change in my agenda for the 6th. However, my use case is, I believe, clearly described in #2778 (comment) |
Thanks, @jggautier! I have posted a message in #ig-metadata on Slack. Are you going to create a Zoom link, or do you want me to do that? |
Thanks for posting in Slack. I created a Zoom link and updated the event on the Dataverse Community Calendar. |
Please forward the Zoom link, just in case I can join…
Thank you!
M
|
Join Zoom meeting
International numbers available: https://harvard.zoom.us/u/a3oPwvAlb |
Me too. But we have a Software Carpentry Workshop on Oct 6th...
I can try to find a replacement for me (I'm only a helper that day) if the date fits for everyone else.
Dr. Dorothea Iglezakis
FoKUS - Kompetenzzentrum für Forschungsdaten
IZUS/Universitätsbibliothek Stuttgart
Holzgartenstr. 16
70174 Stuttgart
Tel.: 0711/685-83648
Email: ***@***.***
|
Hi @doigl. I'm guessing this was a very delayed message? And I believe you were able to make it to the meeting on Oct 6 after all, right? I thought I saw you there :) |
Apologies for the delay weighing in here; I had missed a few notifications and then was out of office last week! I've reviewed the notes from the metadata interest group call and support the idea of having a user-selectable relation type, rather than assuming all "Related Publications" are citations of the dataset (for example). A couple thoughts on this:
On DataCite's side: we're looking at producing better guidance and examples in the near term; this will involve analyzing existing usage and community consultation, and we definitely want to hear from the Dataverse community. Will keep you updated as that work gets underway! |
To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'. If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment. |
The original idea in this issue... There's lots of other chatter in this issue that may not be addressed by it. Please feel free to open new issues for the rest! 😅 |
comment from @bencomp in #2774