9. Language — DataCite Metadata Schema 4.5 documentation #18

Open
utterances-bot opened this issue Oct 7, 2022 · 4 comments

Comments

@utterances-bot

9. Language — DataCite Metadata Schema 4.5 documentation

https://datacite-metadata-schema.readthedocs.io/en/4.5_draft/properties/recommended_optional/property_language.html


It may be a good idea to explicitly mention 3-letter language codes. They are needed for rarer and extinct languages, which do come up in academia, and mentioning them would keep coders from implementing this field as a 2-character string. Some systems use even longer codes, but I am not sure how widely they are used.

Allowing documents to have multiple languages would also be helpful. If the same text is available in several languages within one document, it may not be obvious which one is the "primary" language. When the social media system Mastodon (a Twitter alternative) recently added a feature to indicate the language of a post, people complained that it allowed only one language; even for such short posts this was seen as constraining.
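
To make the field-width concern concrete, here is a minimal sketch in Python of a shape check that accepts both 2-letter and 3-letter codes. The pattern and the function name `is_plausible_language_code` are assumptions made for illustration; this is a simplification, not the full IETF BCP 47 grammar and not DataCite's own validation.

```python
import re

# Assumption: a simplified pattern for illustration only.
# It accepts a 2- or 3-letter primary language code, optionally followed
# by hyphen-separated subtags such as a region ("pt-BR").
LANGUAGE_CODE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*")


def is_plausible_language_code(code: str) -> bool:
    """Return True if `code` has the shape of a 2- or 3-letter language code.

    This checks the shape only; it does not look the code up in any registry.
    """
    return LANGUAGE_CODE.fullmatch(code) is not None


# Hypothetical usage: a 3-letter code must not be rejected or truncated.
for candidate in ["en", "que", "pt-BR", "english"]:
    print(candidate, is_plausible_language_code(candidate))
```

A check like this only guards against the 2-character assumption; whether a given code is actually registered would need a lookup against the relevant code list.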


We would like this property to be repeatable. As repository managers, we often deal with articles, theses, and dissertations that are written in more than one language. A repeatable language element is already part of the OpenAIRE Guidelines: "If necessary, repeat this element to indicate multiple languages." See https://openaire-guidelines-for-literature-repository-managers.readthedocs.io/en/v4.0.0/field_language.html

On Victor's comment above about 3-letter language codes: for indigenous languages such as Quechua, which has regional variations within a country, the 3-letter abbreviation (according to the ISO 639-3 standard) works better. For example, see page 31 of the ALICIA repository guide: http://repositorio.concytec.gob.pe/bitstream/20.500.12390/2231/1/VERSI%C3%93N%20FINAL%20-%20GUIA%20ALICIA%202.0.1%20-%20ENERO%202021.pdf

(On behalf of COAR Task Force on Supporting Multilingualism and non-English Content in Repositories https://www.coar-repositories.org/news-updates/what-we-do/multilingual-and-non-english-content/)
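
As a rough illustration of what a repeatable language value could look like on the repository side, here is a minimal Python sketch that treats language as a list rather than a single string. The helper name `normalize_languages` and the idea of keeping the first entry first are assumptions for this example; nothing here reflects how DataCite or OpenAIRE actually store the field.

```python
from typing import Iterable, List, Optional, Union


def normalize_languages(value: Optional[Union[str, Iterable[str]]]) -> List[str]:
    """Turn a single code or several codes into an ordered list without duplicates.

    Hypothetical helper: accepts the legacy single-string form as well as a
    list, so records with one language and records with many share one shape.
    """
    if value is None:
        return []
    items = [value] if isinstance(value, str) else list(value)
    seen = set()
    result = []
    for item in items:
        code = item.strip()
        key = code.lower()  # compare case-insensitively, keep original spelling
        if code and key not in seen:
            seen.add(key)
            result.append(code)
    return result


# Hypothetical usage: a thesis written in Spanish and Quechua.
print(normalize_languages(["es", "que", "ES"]))
print(normalize_languages("es"))
```

Keeping the input order means a deposit can still list its main language first without the schema having to define a separate "primary" flag.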

@KellyStathis
Collaborator

Hi @VictorVenema and @irynakuchma, I wanted to follow up on this thread since we've recently drafted changes related to this topic!

The DataCite Metadata Working Group has shared a new Request for Comments with several proposed schema changes. This proposal includes making the Language property repeatable and recommended, as well as documentation updates to clarify supported language codes.

Details on how to provide feedback are included in the draft proposal document and summarized in this blog post: https://doi.org/10.5438/cvnf-vs86

We welcome your input on the Request for Comments through May 6, 2024. Should you have any questions, please let me know.

@irynakuchma

Wonderful news, thank you so much!
