
Commit 7a29522

Merge pull request IQSS#11217 from Recherche-Data-Gouv/10217-source-name-harvesting-client
10217 source name harvesting client
2 parents ab8110f + f7c4c42 commit 7a29522

15 files changed

+198
-125
lines changed

@@ -0,0 +1,13 @@
+### Metadata Source Facet Can Now Differentiate Between Harvested Sources
+
+The behavior of the feature flag `index-harvested-metadata-source` and the "Metadata Source" facet, which were added and updated, respectively, in [Dataverse 6.3](https://github.com/IQSS/dataverse/releases/tag/v6.3) (through pull requests #10464 and #10651), has changed. A new field called "Source Name" has been added to harvesting clients.
+
+Before Dataverse 6.3, all harvested content (datasets and files) appeared together under "Harvested" in the "Metadata Source" facet. This is still the out-of-the-box behavior of Dataverse. Since Dataverse 6.3, enabling the `index-harvested-metadata-source` feature flag (and reindexing) resulted in harvested content appearing under the nickname of whatever harvesting client was used to bring in the content. This meant that instead of having all harvested content lumped together under "Harvested", content would appear under "client1", "client2", etc.
+
+Now, as of this release, enabling the `index-harvested-metadata-source` feature flag, populating a new field for harvesting clients called "Source Name" ("sourceName" in the [API](https://dataverse-guide--11217.org.readthedocs.build/en/11217/api/native-api.html#create-a-harvesting-client)), and reindexing (see upgrade instructions below) results in the source name appearing under the "Metadata Source" facet rather than the harvesting client nickname. This gives you more control over the name that appears under the "Metadata Source" facet and allows you to group harvested content from various harvesting clients under the same name if you wish (by reusing the same source name).
+
+Previously, `index-harvested-metadata-source` was not documented in the guides, but you can now find information about it under [Feature Flags](https://dataverse-guide--11217.org.readthedocs.build/en/11217/installation/config.html#feature-flags). See also #10217 and #11217.
+
+## Upgrade instructions
+
+If you have enabled the `dataverse.feature.index-harvested-metadata-source` feature flag and given some of your harvesting clients a source name, you should reindex to have those source names appear under the "Metadata Source" facet.
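As a sketch of the reindex step, the call can also be scripted with Python's standard library. This assumes a local installation on port 8080 and the admin API's full-reindex endpoint (`/api/admin/index`); check the Admin Guide's Solr reindexing documentation for the authoritative procedure:

```python
import urllib.request

# Sketch: compose the full-reindex call used after enabling the
# index-harvested-metadata-source flag and setting source names.
# Assumes a local installation; the admin API is typically blocked
# for non-localhost traffic.
SERVER_URL = "http://localhost:8080"  # assumption: local installation

reindex_request = urllib.request.Request(f"{SERVER_URL}/api/admin/index")
# urllib.request.urlopen(reindex_request) would kick off the reindex;
# it is not executed here because it needs a running server.
```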
@@ -0,0 +1,11 @@
+{
+  "nickName": "zenodo",
+  "dataverseAlias": "zenodoHarvested",
+  "harvestUrl": "https://zenodo.org/oai2d",
+  "archiveUrl": "https://zenodo.org",
+  "archiveDescription": "Harvested from the LMOPS collection of the Zenodo repository. By clicking on this dataset, you will be redirected to Zenodo.",
+  "metadataFormat": "oai_dc",
+  "customHeaders": "x-oai-api-key: xxxyyyzzz",
+  "set": "user-lmops",
+  "allowHarvestingMissingCVV": true
+}
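Since the create API does not validate the supplied values in real time, a local pre-flight check can catch obvious mistakes before POSTing a file like the one above. A minimal sketch in Python: `check_client_config` is a hypothetical helper, not part of Dataverse, and the mandatory-field list here covers only the fields named as mandatory in this diff (`dataverseAlias`, `harvestUrl`):

```python
import json

# Hypothetical helper, not part of Dataverse: sanity-check a
# harvesting-client JSON file locally before POSTing it, since the
# create API accepts whatever values are supplied.
MANDATORY = {"dataverseAlias", "harvestUrl"}  # per the visible doc excerpt

def check_client_config(text: str) -> list[str]:
    """Return a list of problems found in a harvesting-client JSON string."""
    cfg = json.loads(text)
    problems = [f"missing mandatory field: {name}"
                for name in sorted(MANDATORY - cfg.keys())]
    nick = cfg.get("nickName", "")
    # Nicknames are alpha-numeric plus -, _, or %, with no spaces.
    if not nick or not all(c.isalnum() or c in "-_%" for c in nick):
        problems.append("invalid nickName")
    return problems
```

Running it over the example file should report no problems; a nickname containing a space, for instance, would be flagged.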

doc/sphinx-guides/source/api/native-api.rst

+57-33
@@ -5556,7 +5556,7 @@ Create a Harvesting Set
 
 To create a harvesting set you must supply a JSON file that contains the following fields:
 
-- Name: Alpha-numeric may also contain -, _, or %, but no spaces. Must also be unique in the installation.
+- Name: Alpha-numeric may also contain -, _, or %, but no spaces. It must also be unique in the installation.
 - Definition: A search query to select the datasets to be harvested. For example, a query containing authorName:YYY would include all datasets where ‘YYY’ is the authorName.
 - Description: Text that describes the harvesting set. The description appears in the Manage Harvesting Sets dashboard and in API responses. This field is optional.
 
@@ -5652,20 +5652,43 @@ The following API can be used to create and manage "Harvesting Clients". A Harve
 List All Configured Harvesting Clients
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Shows all the Harvesting Clients configured::
+Shows all the harvesting clients configured.
 
-  GET http://$SERVER/api/harvest/clients/
+.. note:: See :ref:`curl-examples-and-environment-variables` if you are unfamiliar with the use of export below.
+
+.. code-block:: bash
+
+  export SERVER_URL=https://demo.dataverse.org
+
+  curl "$SERVER_URL/api/harvest/clients"
+
+The fully expanded example above (without the environment variables) looks like this:
+
+.. code-block:: bash
+
+  curl "https://demo.dataverse.org/api/harvest/clients"
 
 Show a Specific Harvesting Client
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Shows a Harvesting Client with a defined nickname::
+Shows a harvesting client by nickname.
 
-  GET http://$SERVER/api/harvest/clients/$nickname
+.. code-block:: bash
+
+  export SERVER_URL=https://demo.dataverse.org
+  export NICKNAME=myclient
+
+  curl "$SERVER_URL/api/harvest/clients/$NICKNAME"
+
+The fully expanded example above (without the environment variables) looks like this:
 
 .. code-block:: bash
 
-  curl "http://localhost:8080/api/harvest/clients/myclient"
+  curl "https://demo.dataverse.org/api/harvest/clients/myclient"
+
+The output will look something like the following.
+
+.. code-block:: bash
 
   {
     "status":"OK",
@@ -5681,6 +5704,7 @@ Shows a Harvesting Client with a defined nickname::
     "type": "oai",
     "dataverseAlias": "fooData",
     "nickName": "myClient",
+    "sourceName": "",
     "set": "fooSet",
     "useOaiIdentifiersAsPids": false
     "schedule": "none",
@@ -5694,16 +5718,12 @@ Shows a Harvesting Client with a defined nickname::
 }
 
 
+.. _create-a-harvesting-client:
+
 Create a Harvesting Client
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-To create a new harvesting client::
-
-  POST http://$SERVER/api/harvest/clients/$nickname
-
-``nickName`` is the name identifying the new client. It should be alpha-numeric and may also contain -, _, or %, but no spaces. Must also be unique in the installation.
 
-You must supply a JSON file that describes the configuration, similarly to the output of the GET API above. The following fields are mandatory:
+To create a harvesting client you must supply a JSON file that describes the configuration, similarly to the output of the GET API above. The following fields are mandatory:
 
 - dataverseAlias: The alias of an existing collection where harvested datasets will be deposited
 - harvestUrl: The URL of the remote OAI archive
@@ -5712,6 +5732,7 @@ You must supply a JSON file that describes the configuration, similarly to the o
 
 The following optional fields are supported:
 
+- sourceName: When ``index-harvested-metadata-source`` is enabled (see :ref:`feature-flags`), sourceName will override the nickname in the Metadata Source facet. It can be used to group the content from many harvesting clients under the same name.
 - archiveDescription: What the name suggests. If not supplied, will default to "This Dataset is harvested from our partners. Clicking the link will take you directly to the archival source of the data."
 - set: The OAI set on the remote server. If not supplied, will default to none, i.e., "harvest everything".
 - style: Defaults to "default" - a generic OAI archive. (Make sure to use "dataverse" when configuring harvesting from another Dataverse installation).
@@ -5720,38 +5741,35 @@ The following optional fields are supported:
 - useOaiIdentifiersAsPids: Defaults to false; if set to true, the harvester will attempt to use the identifier from the OAI-PMH record header as the **first choice** for the persistent id of the harvested dataset. When set to false, Dataverse will still attempt to use this identifier, but only if none of the `<dc:identifier>` entries in the OAI_DC record contain a valid persistent id (this is new as of v6.5).
 
 Generally, the API will accept the output of the GET version of the API for an existing client as valid input, but some fields will be ignored. For example, as of writing this there is no way to configure a harvesting schedule via this API.
-
-An example JSON file would look like this::
 
-  {
-    "nickName": "zenodo",
-    "dataverseAlias": "zenodoHarvested",
-    "harvestUrl": "https://zenodo.org/oai2d",
-    "archiveUrl": "https://zenodo.org",
-    "archiveDescription": "Harvested from the LMOPS collection of the Zenodo repository. By clicking on this dataset, you will be redirected to Zenodo.",
-    "metadataFormat": "oai_dc",
-    "customHeaders": "x-oai-api-key: xxxyyyzzz",
-    "set": "user-lmops",
-    "allowHarvestingMissingCVV": true
-  }
+You can download this :download:`harvesting-client.json <../_static/api/harvesting-client.json>` file to use as a starting point.
 
-Something important to keep in mind about this API is that, unlike the harvesting clients GUI, it will create a client with the values supplied without making any attempts to validate them in real time. In other words, for the `harvestUrl` it will accept anything that looks like a well-formed url, without making any OAI calls to verify that the name of the set and/or the metadata format entered are supported by it. This is by design, to give an admin an option to still be able to create a client, in a rare case when it cannot be done via the GUI because of some real time failures in an exchange with an otherwise valid OAI server. This however puts the responsibility on the admin to supply the values already confirmed to be valid.
+.. literalinclude:: ../_static/api/harvesting-client.json
 
+Something important to keep in mind about this API is that, unlike the harvesting clients GUI, it will create a client with the values supplied without making any attempts to validate them in real time. In other words, for the `harvestUrl` it will accept anything that looks like a well-formed url, without making any OAI calls to verify that the name of the set and/or the metadata format entered are supported by it. This is by design, to give an admin an option to still be able to create a client, in a rare case when it cannot be done via the GUI because of some real time failures in an exchange with an otherwise valid OAI server. This however puts the responsibility on the admin to supply the values already confirmed to be valid.
 
 .. note:: See :ref:`curl-examples-and-environment-variables` if you are unfamiliar with the use of export below.
 
+
+``nickName`` in the JSON file and ``$NICKNAME`` in the URL path below is the name identifying the new client. It should be alpha-numeric and may also contain -, _, or %, but no spaces. It must be unique in the installation.
+
 .. code-block:: bash
 
   export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
   export SERVER_URL=http://localhost:8080
+  export NICKNAME=zenodo
 
-  curl -H "X-Dataverse-key:$API_TOKEN" -X POST -H "Content-Type: application/json" "$SERVER_URL/api/harvest/clients/zenodo" --upload-file client.json
+  curl -H "X-Dataverse-key:$API_TOKEN" -X POST -H "Content-Type: application/json" "$SERVER_URL/api/harvest/clients/$NICKNAME" --upload-file harvesting-client.json
 
 The fully expanded example above (without the environment variables) looks like this:
 
 .. code-block:: bash
 
-  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST -H "Content-Type: application/json" "http://localhost:8080/api/harvest/clients/zenodo" --upload-file "client.json"
+  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST -H "Content-Type: application/json" "http://localhost:8080/api/harvest/clients/zenodo" --upload-file "harvesting-client.json"
+
+The output will look something like the following.
+
+.. code-block:: bash
 
   {
     "status": "OK",
@@ -5785,15 +5803,21 @@ Similar to the API above, using the same JSON format, but run on an existing cli
 Delete a Harvesting Client
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Self-explanatory:
-
 .. code-block:: bash
 
-  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X DELETE "http://localhost:8080/api/harvest/clients/$nickName"
+  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
+  export SERVER_URL=http://localhost:8080
+  export NICKNAME=zenodo
 
-Only users with superuser permissions may delete harvesting clients.
+  curl -H "X-Dataverse-key:$API_TOKEN" -X DELETE "$SERVER_URL/api/harvest/clients/$NICKNAME"
 
+The fully expanded example above (without the environment variables) looks like this:
+
+.. code-block:: bash
 
+  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X DELETE "http://localhost:8080/api/harvest/clients/zenodo"
+
+Only users with superuser permissions may delete harvesting clients.
 
 .. _pids-api:
 
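The curl-based create call documented above can equally be composed from a script. A sketch using Python's standard library that mirrors the documented endpoint and headers; the server URL, token, payload values, and the `sourceName` of "Zenodo" are placeholders for illustration:

```python
import json
import urllib.request

# Sketch, not an official client: build the POST request that mirrors
# the curl example in the guide. All values below are placeholders.
SERVER_URL = "http://localhost:8080"
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
client = {
    "nickName": "zenodo",
    "dataverseAlias": "zenodoHarvested",
    "harvestUrl": "https://zenodo.org/oai2d",
    "archiveUrl": "https://zenodo.org",
    "metadataFormat": "oai_dc",
    "sourceName": "Zenodo",  # groups this client's content under "Zenodo"
}

req = urllib.request.Request(
    url=f"{SERVER_URL}/api/harvest/clients/{client['nickName']}",
    data=json.dumps(client).encode("utf-8"),
    method="POST",
    headers={
        "X-Dataverse-key": API_TOKEN,
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send it; omitted here since it
# requires a live server and a superuser API token.
```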

doc/sphinx-guides/source/installation/config.rst

+3
@@ -3493,6 +3493,9 @@ please find all known feature flags below. Any of these flags can be activated u
    * - globus-use-experimental-async-framework
      - Activates a new experimental implementation of Globus polling of ongoing remote data transfers that does not rely on the instance staying up continuously for the duration of the transfers and saves the state information about Globus upload requests in the database. Added in v6.4. Affects :ref:`:GlobusPollingInterval`. Note that the JVM option :ref:`dataverse.files.globus-monitoring-server` described above must also be enabled on one (and only one, in a multi-node installation) Dataverse instance.
      - ``Off``
+   * - index-harvested-metadata-source
+     - Index the nickname or the source name (See the optional ``sourceName`` field in :ref:`create-a-harvesting-client`) of the harvesting client as the "metadata source" of harvested datasets and files. If enabled, the Metadata Source facet will show separate groupings of the content harvested from different sources (by harvesting client nickname or source name) instead of the default behavior where there is one "Harvested" grouping for all harvested content.
+     - ``Off``
 
 **Note:** Feature flags can be set via any `supported MicroProfile Config API source`_, e.g. the environment variable
 ``DATAVERSE_FEATURE_XXX`` (e.g. ``DATAVERSE_FEATURE_API_SESSION_AUTH=1``). These environment variables can be set in your shell before starting Payara. If you are using :doc:`Docker for development </container/dev-usage>`, you can set them in the `docker compose <https://docs.docker.com/compose/environment-variables/set-environment-variables/>`_ file.
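The flag-to-environment-variable convention in the note above (e.g. ``DATAVERSE_FEATURE_API_SESSION_AUTH=1``) can be sketched as a tiny helper; `feature_flag_env_var` is hypothetical, for illustration only:

```python
# Hypothetical helper illustrating the documented naming convention:
# upper-case the flag name, replace dashes with underscores, and
# prefix with DATAVERSE_FEATURE_.
def feature_flag_env_var(flag: str) -> str:
    return "DATAVERSE_FEATURE_" + flag.upper().replace("-", "_")

print(feature_flag_env_var("index-harvested-metadata-source"))
# DATAVERSE_FEATURE_INDEX_HARVESTED_METADATA_SOURCE
```

This is the same variable name that the docker-compose change below sets to "1".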

docker-compose-dev.yml

+1
@@ -17,6 +17,7 @@ services:
       SKIP_DEPLOY: "${SKIP_DEPLOY}"
       DATAVERSE_JSF_REFRESH_PERIOD: "1"
       DATAVERSE_FEATURE_API_BEARER_AUTH: "1"
+      DATAVERSE_FEATURE_INDEX_HARVESTED_METADATA_SOURCE: "1"
       DATAVERSE_FEATURE_API_BEARER_AUTH_PROVIDE_MISSING_CLAIMS: "1"
       DATAVERSE_MAIL_SYSTEM_EMAIL: "dataverse@localhost"
       DATAVERSE_MAIL_MTA_HOST: "smtp"
