diff --git a/_sources/index.rst.txt b/_sources/index.rst.txt
index 0f57de54..1e1fca61 100644
--- a/_sources/index.rst.txt
+++ b/_sources/index.rst.txt
@@ -3,8 +3,8 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
-Welcome to Knowledge Commons Works's documentation!
-===================================================
+Welcome to the Knowledge Commons Works technical documentation!
+===============================================================
.. toctree::
:maxdepth: 2
@@ -22,9 +22,9 @@ Welcome to Knowledge Commons Works's documentation!
in_depth
reference
-Indices and tables
-==================
+.. Indices and tables
+.. ==================
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
+.. * :ref:`genindex`
+.. * :ref:`modindex`
+.. * :ref:`search`
diff --git a/_sources/metadata.md.txt b/_sources/metadata.md.txt
index b653bf5c..60dd2c77 100644
--- a/_sources/metadata.md.txt
+++ b/_sources/metadata.md.txt
@@ -1,4 +1,4 @@
-# Metadata Schema and Vocabularies
+# Metadata Schema, Vocabularies, and Identifiers
The default metadata schema for InvenioRDM records is defined in the `invenio-rdm-records` package and documented [here](https://inveniordm.docs.cern.ch/reference/metadata/). It also includes a number of optional metadata fields which have been enabled in KCWorks, documented [here](https://inveniordm.docs.cern.ch/reference/metadata/optional_metadata/).
@@ -7,7 +7,9 @@ Beyond these InvenioRDM fields, KCWorks adds a number of custom metadata fields
- `kcr`: custom fields that are used to store data from the KC system. These fields **may** be used for new data, but are not required.
- `hclegacy`: custom fields that are used to store data from the legacy CORE repository. These fields **must not** be used for new data.
-## Example JSON record
+## Example metadata record
+
+### JSON object for record creation
What follows is an example of a complete metadata record (JSON object) used to create a KCWorks record. The various fields and their possible values are described in the sections below.
@@ -247,26 +249,93 @@ Note that no single actual record would include all of these fields. The example
}
```
+### JSON object retrieved from the record API
+
+The JSON object retrieved from the record API shares the same basic structure as the JSON object used to create the record, except that it includes a number of additional fields. Some properties are also filled out with additional details (e.g., readable titles for licenses).
+
## Controlled Vocabularies
### Subject headings
#### FAST
-The FAST controlled vocabulary (https://www.oclc.org/research/areas/data-science/fast.html) is used for the `subjects` field.
+The FAST controlled vocabulary (https://www.oclc.org/research/areas/data-science/fast.html) is used for the `subjects` field. See the [metadata.subjects](#metadata.subjects) section for more information about how to include FAST subjects in a KCWorks record.
#### Homosaurus
-The FAST vocabulary is augmented in KCWorks by the Homosaurus vocabulary (https://homosaurus.org/) for subjects related to sexuality and gender identity.
+The FAST vocabulary is augmented in KCWorks by the Homosaurus vocabulary (https://homosaurus.org/) for subjects related to sexuality and gender identity. See the [metadata.subjects](#metadata.subjects) section for information about how to include Homosaurus subjects in a KCWorks record.
+
+#### Resource types
+
+#### Creator/contributor roles
+
+## Identifier Schemes
+
+### Works
+
+#### DOI
+
+KCWorks (and InvenioRDM) supports the DOI identifier scheme to identify works in the repository. Note that two DOIs are minted for each KCWorks record: one for the current version of the record, and one for the work as a whole (including all versions). The version-specific DOI is stored in the `pids` property of the metadata record (`pids.doi.identifier`). The work DOI is stored in the `pids` property of the `parent` object (`parent.pids.doi.identifier`).
+
+These DOIs are minted by DataCite (https://datacite.org/) and the attached metadata is maintained automatically by KCWorks.
+
+Additional DOIs minted elsewhere can be attached to a KCWorks record. If provided at record creation, such external DOIs can be used as the record's primary identifier (in `pids.doi`). Otherwise, they can be added to the `identifiers` property of the metadata record with the scheme `alternate-doi`. In both cases, these externally minted DOIs are **not** maintained automatically by KCWorks.
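As a rough illustration of where the DOIs described above live, the following Python sketch pulls them out of a record that has already been parsed into a dictionary. It assumes the usual InvenioRDM layout, in which each entry under `pids` is an object with an `identifier` key and alternate identifiers sit in `metadata.identifiers`. The function name `collect_dois` and all DOI values are hypothetical placeholders, not part of KCWorks.

```python
from typing import Any, Dict


def collect_dois(record: Dict[str, Any]) -> Dict[str, Any]:
    """Gather the version DOI, work DOI, and alternate DOIs from a record dict."""
    pids = record.get("pids", {})
    parent_pids = record.get("parent", {}).get("pids", {})
    # Assumption: alternate identifiers are stored under metadata.identifiers.
    identifiers = record.get("metadata", {}).get("identifiers", [])
    return {
        "version_doi": pids.get("doi", {}).get("identifier"),
        "work_doi": parent_pids.get("doi", {}).get("identifier"),
        "alternate_dois": [
            entry["identifier"]
            for entry in identifiers
            if entry.get("scheme") == "alternate-doi"
        ],
    }


# Placeholder record; none of these DOIs refer to real works.
example_record = {
    "pids": {"doi": {"identifier": "10.1234/example.v2", "provider": "datacite"}},
    "parent": {
        "pids": {"doi": {"identifier": "10.1234/example", "provider": "datacite"}}
    },
    "metadata": {
        "identifiers": [
            {"scheme": "alternate-doi", "identifier": "10.5555/external"},
            {"scheme": "isbn", "identifier": "978-3-16-148410-0"},
        ]
    },
}

dois = collect_dois(example_record)
```

Missing keys yield `None` (or an empty list) rather than raising, which is convenient for drafts that have not yet been assigned a DOI.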
+
+#### OAI
+
+KCWorks also supports the OAI identifier scheme. The OAI identifier for a KCWorks record is stored in the `pids` property of the metadata record (`pids.oai.identifier`).
+
+#### ISSN
+
+#### ISBN
+
+### People
+
+#### ORCID (recommended)
+
+KCWorks (and InvenioRDM) supports the ORCID identifier scheme. The ORCID of the submitter of the KCWorks record is stored in the `person_or_org.identifiers` property of the `creators` array (`creators[0].person_or_org.identifiers.identifier`). A KCWorks user's ORCID iD is also drawn from their KC profile (if they have provided one) and stored in their system user profile (as `<user_object>.user_profile.identifier_orcid`).
+
+For details on how to use ORCID identifiers in KCWorks, see the section on [Metadata.creators](#metadata.creators) below.
+
+#### KC Username (recommended)
+
+KCWorks also allows the use of Knowledge Commons usernames as identifiers. The KC username of the submitter of the KCWorks record is stored in the `person_or_org.identifiers` property of the `creators` array (`creators[0].person_or_org.identifiers.identifier`) using the scheme `kc_username`.
+
+For details on how to use KC usernames in KCWorks, see the section on [Metadata.creators](#metadata.creators) below.
+
+#### GND
+
+KCWorks also supports the Integrated Authority File (GND) identifier scheme (https://www.dnb.de/EN/Professionell/Standardisierung/GND/gnd_node.html). The GND identifier of the submitter of the KCWorks record is stored in the `person_or_org.identifiers` property of the `creators` array (`creators[0].person_or_org.identifiers.identifier`) using the scheme `gnd`.
+
+#### ISNI
+
+KCWorks also supports the ISNI identifier scheme (https://isni.org/). The ISNI of the submitter of the KCWorks record is stored in the `person_or_org.identifiers` property of the `creators` array (`creators[0].person_or_org.identifiers.identifier`) using the scheme `isni`.
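The person identifier schemes above can be illustrated with a short Python sketch that looks up one identifier by scheme in a creator entry. It assumes that `person_or_org.identifiers` is a list of objects with `scheme` and `identifier` keys, and that ORCID uses the scheme name `orcid` (the text above names only `kc_username`, `gnd`, and `isni` explicitly). The helper `find_identifier`, the person's name, and the identifier values are hypothetical placeholders.

```python
from typing import Any, Dict, Optional


def find_identifier(creator: Dict[str, Any], scheme: str) -> Optional[str]:
    """Return the first identifier with the given scheme, or None if absent."""
    for entry in creator.get("person_or_org", {}).get("identifiers", []):
        if entry.get("scheme") == scheme:
            return entry.get("identifier")
    return None


# Placeholder creator entry; the name and identifiers are illustrative only.
example_creator = {
    "person_or_org": {
        "type": "personal",
        "given_name": "Jane",
        "family_name": "Doe",
        "identifiers": [
            {"scheme": "orcid", "identifier": "0000-0002-1825-0097"},
            {"scheme": "kc_username", "identifier": "janedoe"},
        ],
    },
}

orcid = find_identifier(example_creator, "orcid")
kc_username = find_identifier(example_creator, "kc_username")
```

Looking identifiers up by scheme, rather than by list position, avoids depending on the order in which a submitter entered them.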
### Organizations
-#### ROR
+#### ROR (recommended)
+
+Organization identifiers can appear in the `creators` and `contributors` arrays, either for organizational creators/contributors or in the `affiliations` array of a personal creator/contributor. These fields *may* identify an organization by its id in the Research Organization Registry (https://ror.org/) with the scheme `ror`, although free text names are also supported.
+
+#### Grid (deprecated)
+
+KCWorks also supports the Grid identifier scheme (https://www.grid.ac/) for organizations using the scheme `grid`. This scheme is deprecated in favour of ROR, however, and should not be used for new identifiers.
+
+#### GND
+
+KCWorks also supports the Integrated Authority File (GND) identifier scheme (https://www.dnb.de/EN/Professionell/Standardisierung/GND/gnd_node.html) for organizations using the scheme `gnd`.
+
+### Funders
+
+#### DOI
+
+Funders in the `metadata.funding` array can be identified using DOIs and the scheme `doi`.
-The Research Organization Registry (https://ror.org/) is used for the `organizations` field.
+#### OFR
+Funders in the `metadata.funding` array can also be identified using the Open Funder Registry (https://openfunder.org/) identifiers and the scheme `ofr`.
-## Notes about Implementation of Core InvenioRDM Fields
+## KCWorks Implementation of Core InvenioRDM Fields
### metadata.subjects
@@ -332,9 +401,9 @@ Example:
}
```
-### KCWorks Custom Fields (kcworks/site/metadata_fields)
+## KCWorks Custom Fields (kcworks/site/metadata_fields)
-#### kcr:ai_usage
+### kcr:ai_usage
Type: `Object[boolean, string]`
@@ -350,7 +419,7 @@ Example:
}
```
-#### kcr:media
+### kcr:media
Type: `Array[string]`
@@ -363,7 +432,7 @@ Example:
}
```
-#### kcr:commons_domain
+### kcr:commons_domain
Type: `string`
@@ -376,7 +445,7 @@ Example:
}
```
-#### kcr:chapter_label
+### kcr:chapter_label
Type: `string`
@@ -389,7 +458,7 @@ Example:
}
```
-#### kcr:content_warning
+### kcr:content_warning
Type: `string`
@@ -402,7 +471,7 @@ Example:
}
```
-#### kcr:course_title
+### kcr:course_title
Type: `string`
@@ -415,7 +484,7 @@ Example:
}
```
-#### kcr:degree
+### kcr:degree
Type: `string`
@@ -428,7 +497,7 @@ Example:
}
```
-#### kcr:discipline
+### kcr:discipline
Type: `string`
@@ -445,7 +514,7 @@ Example:
}
```
-#### kcr:edition
+### kcr:edition
Type: `string`
@@ -458,7 +527,7 @@ Example:
}
```
-#### kcr:meeting_organization
+### kcr:meeting_organization
Type: `string`
@@ -471,7 +540,7 @@ Example:
}
```
-#### kcr:project_title
+### kcr:project_title
Type: `string`
@@ -484,7 +553,7 @@ Example:
}
```
-#### kcr:publication_url
+### kcr:publication_url
Type: `string` (URL)
@@ -499,7 +568,7 @@ Example:
}
```
-#### kcr:sponsoring_institution
+### kcr:sponsoring_institution
Type: `string`
@@ -514,7 +583,7 @@ Example:
}
```
-#### kcr:submitter_email
+### kcr:submitter_email
Type: `string` (email address)
@@ -527,7 +596,7 @@ Example:
}
```
-#### kcr:submitter_username
+### kcr:submitter_username
Type: `string`
@@ -540,7 +609,7 @@ Example:
}
```
-#### kcr:institution_department
+### kcr:institution_department
Type: `string`
@@ -553,7 +622,7 @@ Example:
}
```
-#### kcr:book_series
+### kcr:book_series
Type: `Object[string, string]`
@@ -570,7 +639,7 @@ Example:
}
```
-#### kcr:user_defined_tags
+### kcr:user_defined_tags
Type: `Array[string]`
@@ -586,14 +655,14 @@ Example:
}
```
-#### kcr:commons_search_recid (system field)
+### kcr:commons_search_recid (system field)
This field is used to store the persistent identifier for the KCWorks record in the KC central search index.
> [!Warning]
> This field is automatically generated by the `invenio-remote-api-provisioner` service when a KCWorks record is published. It *must not* be set by the user.
-#### kcr:commons_search_updated (system field)
+### kcr:commons_search_updated (system field)
Type: `string` (ISO 8601 datetime string)
@@ -602,11 +671,11 @@ This field stores the date and time when the KCWorks record was last updated in
> [!Warning]
> This field is automatically generated by the `invenio-remote-api-provisioner` service when a KCWorks record is published. It *must not* be set by the user.
-### HC Legacy Custom Fields
+## HC Legacy Custom Fields
The `hclegacy` namespace is used for custom fields that are used to store data from the legacy CORE database. These fields should not be used for new data.
-#### custom_fields.hclegacy:groups_for_deposit
+### custom_fields.hclegacy:groups_for_deposit
Type: `Array[Object[string, string]]`
@@ -624,7 +693,7 @@ Example:
}
```
-#### custom_fields.hclegacy:collection
+### custom_fields.hclegacy:collection
Type: `string`
@@ -637,7 +706,7 @@ Example:
}
```
-#### custom_fields.hclegacy:committee_deposit
+### custom_fields.hclegacy:committee_deposit
Type: `integer`
@@ -650,7 +719,7 @@ Example:
}
```
-#### custom_fields.hclegacy:file_location
+### custom_fields.hclegacy:file_location
Type: `string`
@@ -663,7 +732,7 @@ Example:
}
```
-#### custom_fields.hclegacy:file_pid
+### custom_fields.hclegacy:file_pid
Type: `string`
@@ -676,7 +745,7 @@ Example:
}
```
-#### custom_fields.hclegacy:previously_published
+### custom_fields.hclegacy:previously_published
Type: `string`
@@ -689,7 +758,7 @@ Example:
}
```
-#### custom_fields.hclegacy:publication_type
+### custom_fields.hclegacy:publication_type
Type: `string`
@@ -702,7 +771,7 @@ Example:
}
```
-#### custom_fields.hclegacy:record_change_date
+### custom_fields.hclegacy:record_change_date
Type: `string` (ISO 8601 datetime string)
@@ -715,7 +784,7 @@ Example:
}
```
-#### custom_fields.hclegacy:record_creation_date
+### custom_fields.hclegacy:record_creation_date
Type: `string` (ISO 8601 datetime string)
@@ -728,7 +797,7 @@ Example:
}
```
-#### custom_fields.hclegacy:record_identifier
+### custom_fields.hclegacy:record_identifier
Type: `string`
@@ -741,7 +810,7 @@ Example:
}
```
-#### custom_fields.hclegacy:submitter_org_memberships
+### custom_fields.hclegacy:submitter_org_memberships
Type: `array[string]`
@@ -754,7 +823,7 @@ Example:
}
```
-#### custom_fields.hclegacy:submitter_affiliation
+### custom_fields.hclegacy:submitter_affiliation
Type: `string`
@@ -767,7 +836,7 @@ Example:
}
```
-#### custom_fields.hclegacy:submitter_id
+### custom_fields.hclegacy:submitter_id
Type: `string`
@@ -780,7 +849,7 @@ Example:
}
```
-#### custom_fields.hclegacy:total_views
+### custom_fields.hclegacy:total_views
Type: `integer`
@@ -793,7 +862,7 @@ Example:
}
```
-#### custom_fields.hclegacy:total_downloads
+### custom_fields.hclegacy:total_downloads
Type: `integer`
diff --git a/cli_commands.html b/cli_commands.html
index f7ef46b0..15b0c828 100644
--- a/cli_commands.html
+++ b/cli_commands.html
@@ -203,7 +203,7 @@
The default metadata schema for InvenioRDM records is defined in the invenio-rdm-records package and documented here. It also includes a number of optional metadata fields which have been enabled in KCWorks, documented here.
Beyond these InvenioRDM fields, KCWorks adds a number of custom metadata fields to the schema using InvenioRDM’s custom field mechanism. These are all located in the top-level custom_fields field of the record metadata. They are prefixed with two different namespaces:
kcr: custom fields that are used to store data from the KC system. These fields may be used for new data, but are not required.
hclegacy: custom fields that are used to store data from the legacy CORE repository. These fields must not be used for new data.
What follows is an example of a complete metadata record (JSON object) used to create a KCWorks record. The various fields and their possible values are described in the sections below.
Note that no single actual record would include all of these fields. The example is provided to illustrate the structure of the metadata record and the sort of values that are valid for each field.
The JSON object retrieved from the record API shares the same basic structure as the JSON object used to create the record, except that it includes a number of additional fields. Some properties are also filled out with additional details (e.g., readable titles for licenses).
The FAST controlled vocabulary (https://www.oclc.org/research/areas/data-science/fast.html) is used for the subjects field.
+
The FAST controlled vocabulary (https://www.oclc.org/research/areas/data-science/fast.html) is used for the subjects field. See the metadata.subjects section for more information about how to include FAST subjects in a KCWorks record.
The FAST vocabulary is augmented in KCWorks by the Homosaurus vocabulary (https://homosaurus.org/) for subjects related to sexuality and gender identity.
+
The FAST vocabulary is augmented in KCWorks by the Homosaurus vocabulary (https://homosaurus.org/) for subjects related to sexuality and gender identity. See the metadata.subjects section for information about how to include Homosaurus subjects in a KCWorks record.
KCWorks (and InvenioRDM) supports the DOI identifier scheme to identify works in the repository. Note that two DOIs are minted for each KCWorks record: one for the current version of the record, and one for the work as a whole (including all versions). The version-specific DOI is stored in the pids property of the metadata record (pids.doi.identifier). The work DOI is stored in the pids property of the parent object (parent.pids.doi.identifier).
+
These DOIs are minted by DataCite (https://datacite.org/) and the attached metadata is maintained automatically by KCWorks.
+
Additional DOIs minted elsewhere can be attached to a KCWorks record. If provided at record creation, such external DOIs can be used as the record’s primary identifier (in pids.doi). Otherwise, they can be added to the identifiers property of the metadata record with the scheme alternate-doi. In both cases, these externally minted DOIs are not maintained automatically by KCWorks.
KCWorks also supports the OAI identifier scheme. The OAI identifier for a KCWorks record is stored in the pids property of the metadata record (pids.oai.identifier).
KCWorks (and InvenioRDM) supports the ORCID identifier scheme. The ORCID of the submitter of the KCWorks record is stored in the person_or_org.identifiers property of the creators array (creators[0].person_or_org.identifiers.identifier). A KCWorks user’s ORCID iD is also drawn from their KC profile (if they have provided one) and stored in their system user profile (as <user_object>.user_profile.identifier_orcid).
+
For details on how to use ORCID identifiers in KCWorks, see the section on Metadata.creators below.
KCWorks also allows the use of Knowledge Commons usernames as identifiers. The KC username of the submitter of the KCWorks record is stored in the person_or_org.identifiers property of the creators array (creators[0].person_or_org.identifiers.identifier) using the scheme kc_username.
+
For details on how to use KC usernames in KCWorks, see the section on Metadata.creators below.
KCWorks also supports the Integrated Authority File (GND) identifier scheme (https://www.dnb.de/EN/Professionell/Standardisierung/GND/gnd_node.html). The GND identifier of the submitter of the KCWorks record is stored in the person_or_org.identifiers property of the creators array (creators[0].person_or_org.identifiers.identifier) using the scheme gnd.
KCWorks also supports the ISNI identifier scheme (https://isni.org/). The ISNI of the submitter of the KCWorks record is stored in the person_or_org.identifiers property of the creators array (creators[0].person_or_org.identifiers.identifier) using the scheme isni.
Organization identifiers can appear in the creators and contributors arrays, either for organizational creators/contributors or in the affiliations array of a personal creator/contributor. These fields may identify an organization by its id in the Research Organization Registry (https://ror.org/) with the scheme ror, although free text names are also supported.
KCWorks also supports the Grid identifier scheme (https://www.grid.ac/) for organizations using the scheme grid. This scheme is deprecated in favour of ROR, however, and should not be used for new identifiers.
KCWorks also supports the Integrated Authority File (GND) identifier scheme (https://www.dnb.de/EN/Professionell/Standardisierung/GND/gnd_node.html) for organizations using the scheme gnd.
Note that KCWorks employs the FAST controlled vocabulary (https://www.oclc.org/research/areas/data-science/fast.html) for the subjects field, complemented by the Homosaurus vocabulary (https://homosaurus.org/).
This field stores a list of media or materials involved in the creation of the record. This field is used to store free-form user-defined descriptors of the media or materials and does not impose any controlled vocabulary.
This field stores the KC organizational (Commons) domain associated with the KCWorks record, if any. The record should also be placed in the KCWorks collection associated with this organization.
This field stores the label of the chapter associated with the KCWorks record, if any. This allows us to differentiate between a simple chapter label (e.g. “Chapter 1”) and a more substantive title for the same chapter (e.g., “The Role of AI in Modern Art”).
This field stores an optional content warning for the KCWorks record. This is used to flag the record for KCWorks users so that they can be aware of potentially problematic content in the record. This field is not to be used for content moderation by KCWorks moderators or admins. It is only to be used voluntarily and as desired by the record submitter.
This field stores the title of the course associated with the KCWorks record. It is intended primarily for use with syllabi and instructional materials.
This field stores the educational degree (e.g., PhD, DPhil, MA, etc.) associated with the KCWorks record. It is intended primarily for use with theses and dissertations.
This field stores the academic discipline associated with the KCWorks record. It is intended primarily for use with theses, dissertations, and other educational artifacts. It is not intended as a general-purpose field for describing the subject matter of the KCWorks record. For that, you should use the metadata.subjects and kcr:user_defined_tags fields.
This field is intended to complement the thesis:university and kcr:institution_department fields.
This field stores the name of the organization associated with the meeting or conference associated with the KCWorks record. It is intended primarily for use with conference papers, presentations, proceedings, etc.
This field stores the title of a project for which the KCWorks record was created. It can be used flexibly for, e.g., grant-funded projects, research projects, artistic projects, etc.
This field stores the URL of the publication associated with the KCWorks record. It is not the URL of the KCWorks record itself or of the work it contains. For example, if the KCWorks record contains a journal article, it would not hold the URL for the published journal article. It is intended to hold the URL of the publication as a whole that the KCWorks record is based on or is a part of. So it might hold the main URL for the journal in which the article was published, or the main URL for the book in which the chapter was published, etc.
This field stores the name of the institution that sponsored the KCWorks record. One intended use is for unpublished materials such as white papers that were sponsored or commissioned by an institution. The field may also be used for the institution hosting a conference or workshop associated with the KCWorks record (as distinct from the organization that sponsored the event).
Note that this field is not intended for the degree-granting institution associated with a thesis or dissertation. That institution’s title should be stored in the thesis:university field.
This field stores the KC username of the submitter of the KCWorks record. This should be used even if the submitter is also a contributor to the KCWorks record and has included the same username in the metadata.creators.person_or_org.identifiers array.
This field stores the institutional department in which a thesis, dissertation, or other educational artifact was produced. It is intended to complement the thesis:university field, which stores the degree-granting institution.
This field stores a list of user-defined tags for the KCWorks record. Unlike the metadata.subjects field, these tags are not constrained by any controlled vocabulary. Items should be free-form strings that describe the KCWorks record in a way that is not covered by the metadata.subjects field.
The hclegacy namespace is used for custom fields that are used to store data from the legacy CORE database. These fields should not be used for new data.
This field is used to store the groups to which a legacy CORE record belonged before import into KCWorks. It was used to create corresponding KCWorks collections during migration.
This field is used to store the org collection to which a legacy CORE record belonged before import into KCWorks. It was used to create corresponding KCWorks org collections during migration.
This field is used to store the committee deposit number for a legacy CORE record. It was not used during migration and is only preserved for historical purposes. It should not be used for new data.
This field is used to store the relative path to the file for a legacy CORE record. It was not used during migration and is only preserved for historical purposes. It should not be used for new data.
This field is used to store the persistent identifier for the file for a legacy CORE record. It was not used during migration and is only preserved for historical purposes. It should not be used for new data.
This field is used to store the previously published status for a legacy CORE record. It was not used during migration and is only preserved for historical purposes. It should not be used for new data.
This field is used to store the publication type for a legacy CORE record. It was used during migration to help determine the KCWorks resource type of the record. It is only preserved for historical purposes. It should not be used for new data.
This field is used to store the date of the last change to a legacy CORE record. It was not used during migration to KCWorks and is only preserved for historical purposes. It should not be used for new data.
This field is used to store the date of the creation of a legacy CORE record. It was not used during migration because InvenioRDM does not allow overriding of the record creation date. It is only preserved for historical purposes and should not be used for new data.
This field is used to store the internal system identifier for a legacy CORE record. It was not used during migration and is only preserved for historical purposes. It should not be used for new data.
This field is used to store the organizations to which a legacy CORE record’s submitter belonged before import into KCWorks. It was used to create corresponding KCWorks org collections during migration and assign the work to those org collections.
This field is used to store the organizational affiliation of a legacy CORE record’s submitter at the time of import into KCWorks. It was not used during migration and is only preserved for historical purposes. It should not be used for new data.
This field is used to store the internal KC system user id of a legacy CORE record’s submitter. It was used during migration to assign ownership of the newly created record, and is preserved for historical purposes. It should not be used for new data.
This field is used to store the total number of views for a legacy CORE record prior to import into KCWorks. It was used during migration to create KCWorks usage stats aggregations for the record. It is only preserved for historical purposes. It should not be used for new data.
This field is used to store the total number of downloads for a legacy CORE record prior to import into KCWorks. It was used during migration to create KCWorks usage stats aggregations for the record. It is only preserved for historical purposes. It should not be used for new data.