Merge branch 'develop' into 8290-harvest-client-api
landreev committed Dec 1, 2022
2 parents 8794c07 + bf2e426 commit 2710739
Showing 42 changed files with 1,278 additions and 147 deletions.
8 changes: 8 additions & 0 deletions conf/solr/8.11.1/schema.xml
Original file line number Diff line number Diff line change
@@ -228,6 +228,11 @@

<field name="dsPersistentId" type="text_en" multiValued="false" stored="true" indexed="true"/>
<field name="filePersistentId" type="text_en" multiValued="false" stored="true" indexed="true"/>
<!-- Dataverse geospatial search -->
<!-- https://solr.apache.org/guide/8_11/spatial-search.html#rpt -->
<field name="geolocation" type="location_rpt" multiValued="true" stored="true" indexed="true"/>
<!-- https://solr.apache.org/guide/8_11/spatial-search.html#bboxfield -->
<field name="boundingBox" type="bbox" multiValued="true" stored="true" indexed="true"/>

<!--
METADATA SCHEMA FIELDS
@@ -1104,6 +1109,9 @@
-->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
geo="true" distErrPct="0.025" maxDistErr="0.001" distanceUnits="kilometers" />
<!-- Dataverse - per GeoBlacklight, adding field type for bboxField that enables, among other things, overlap ratio calculations -->
<fieldType name="bbox" class="solr.BBoxField"
geo="true" distanceUnits="kilometers" numberType="pdouble" />

<!-- Payloaded field types -->
<fieldType name="delimited_payloads_float" stored="false" indexed="true" class="solr.TextField">
3 changes: 3 additions & 0 deletions doc/release-notes/7715-signed-urls-for-external-tools.md
@@ -0,0 +1,3 @@
# Improved Security for External Tools

This release adds support for configuring external tools to use signed URLs to access the Dataverse API. This eliminates the need for tools to have access to the user's apiToken in order to access draft or restricted datasets and datafiles. Signed URLs can be transferred via POST or via a callback when triggering a tool via GET.
5 changes: 5 additions & 0 deletions doc/release-notes/8239-geospatial-indexing.md
@@ -0,0 +1,5 @@
Support for indexing the "Geographic Bounding Box" fields ("West Longitude", "East Longitude", "North Latitude", and "South Latitude") from the Geospatial metadata block has been added.

Geospatial search is supported but only via API using two new parameters: `geo_point` and `geo_radius`.

A Solr schema update is required.
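
As an illustration of how the four bounding box values relate to the new `boundingBox` Solr field, the sketch below formats them as a WKT `ENVELOPE`, which Solr's `BBoxField` accepts in the order minX, maxX, maxY, minY. The function name `bbox_to_envelope` and the range checks are assumptions made for this example; this is not necessarily how Dataverse builds the value it sends to Solr.

```python
def bbox_to_envelope(west, east, north, south):
    """Return a WKT ENVELOPE string for a geographic bounding box.

    Hypothetical helper for illustration only; not taken from the Dataverse code base.
    """
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        raise ValueError("latitudes must be within [-90, 90]")
    if north < south:
        raise ValueError("North Latitude must not be less than South Latitude")
    # ENVELOPE order is minX, maxX, maxY, minY, i.e. west, east, north, south.
    return f"ENVELOPE({west}, {east}, {north}, {south})"


print(bbox_to_envelope(10, 20, 40, 30))
```

A box whose north value is smaller than its south value is rejected, which is the kind of mistake the corrected `ddi_dataset.xml` sample in this commit avoids.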
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Tool Type Scope Description
Data Explorer explore file A GUI which lists the variables in a tabular data file allowing searching, charting and cross tabulation analysis. See the README.md file at https://github.com/scholarsportal/dataverse-data-explorer-v2 for the instructions on adding Data Explorer to your Dataverse.
Whole Tale explore dataset A platform for the creation of reproducible research packages that allows users to launch containerized interactive analysis environments based on popular tools such as Jupyter and RStudio. Using this integration, Dataverse users can launch Jupyter and RStudio environments to analyze published datasets. For more information, see the `Whole Tale User Guide <https://wholetale.readthedocs.io/en/stable/users_guide/integration.html>`_.
File Previewers explore file A set of tools that display the content of files - including audio, html, `Hypothes.is <https://hypothes.is/>`_ annotations, images, PDF, text, video, tabular data, spreadsheets, and GeoJSON - allowing them to be viewed without downloading. The previewers can be run directly from github.io, so the only required step is using the Dataverse API to register the ones you want to use. Documentation, including how to optionally brand the previewers, and an invitation to contribute through github are in the README.md file. Initial development was led by the Qualitative Data Repository and the spreasdheet previewer was added by the Social Sciences and Humanities Open Cloud (SSHOC) project. https://github.com/gdcc/dataverse-previewers
File Previewers explore file A set of tools that display the content of files - including audio, html, `Hypothes.is <https://hypothes.is/>`_ annotations, images, PDF, text, video, tabular data, spreadsheets, GeoJSON, and ZipFiles - allowing them to be viewed without downloading the file. The previewers can be run directly from github.io, so the only required step is using the Dataverse API to register the ones you want to use. Documentation, including how to optionally brand the previewers, and an invitation to contribute through github are in the README.md file. Initial development was led by the Qualitative Data Repository and the spreadsheet previewer was added by the Social Sciences and Humanities Open Cloud (SSHOC) project. https://github.com/gdcc/dataverse-previewers
Data Curation Tool configure file A GUI for curating data by adding labels, groups, weights and other details to assist with informed reuse. See the README.md file at https://github.com/scholarsportal/Dataverse-Data-Curation-Tool for the installation instructions.
8 changes: 4 additions & 4 deletions doc/sphinx-guides/source/_static/api/ddi_dataset.xml
@@ -88,12 +88,12 @@
<geoBndBox>
<westBL>10</westBL>
<eastBL>20</eastBL>
<northBL>30</northBL>
<southBL>40</southBL>
<northBL>40</northBL>
<southBL>30</southBL>
</geoBndBox>
<geoBndBox>
<southBL>80</southBL>
<northBL>70</northBL>
<southBL>70</southBL>
<northBL>80</northBL>
<eastBL>60</eastBL>
<westBL>50</westBL>
</geoBndBox>
Original file line number Diff line number Diff line change
@@ -12,8 +12,16 @@
"PID": "{datasetPid}"
},
{
"apiToken": "{apiToken}"
"locale":"{localeCode}"
}
]
],
"allowedApiCalls": [
{
"name":"retrieveDatasetJson",
"httpMethod":"GET",
"urlTemplate":"/api/v1/datasets/{datasetId}",
"timeOut":10
}
]
}
}
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"displayName": "Fabulous File Tool",
"description": "Fabulous Fun for Files!",
"description": "A non-existent tool that is fabulous fun for files!",
"toolName": "fabulous",
"scope": "file",
"types": [
@@ -9,13 +9,25 @@
],
"toolUrl": "https://fabulousfiletool.com",
"contentType": "text/tab-separated-values",
"httpMethod":"GET",
"toolParameters": {
"queryParameters": [
{
"fileid": "{fileId}"
},
{
"key": "{apiToken}"
"datasetPid": "{datasetPid}"
},
{
"locale":"{localeCode}"
}
],
"allowedApiCalls": [
{
"name":"retrieveDataFile",
"httpMethod":"GET",
"urlTemplate":"/api/v1/access/datafile/{fileId}",
"timeOut":270
}
]
}
33 changes: 32 additions & 1 deletion doc/sphinx-guides/source/api/external-tools.rst
@@ -92,7 +92,9 @@ Terminology

contentType File level tools operate on a specific **file type** (content type or MIME type such as "application/pdf") and this must be specified. Dataset level tools do not use contentType.

toolParameters **Query parameters** are supported and described below.
toolParameters **httpMethod**, **queryParameters**, and **allowedApiCalls** are supported and described below.

httpMethod Either ``GET`` or ``POST``.

queryParameters **Key/value combinations** that can be appended to the toolUrl. For example, once substitution takes place (described below) the user may be redirected to ``https://fabulousfiletool.com?fileId=42&siteUrl=http://demo.dataverse.org``.

@@ -102,6 +104,16 @@ Terminology

reserved words A **set of strings surrounded by curly braces** such as ``{fileId}`` or ``{datasetId}`` that will be inserted into query parameters. See the table below for a complete list.

allowedApiCalls An array of objects defining callbacks the tool is allowed to make to the Dataverse API. If the dataset or file being accessed is not public, the callback URLs will be signed to allow the tool access for a defined time.

allowedApiCalls name A name the tool will use to identify this callback URL such as ``retrieveDataFile``.

allowedApiCalls urlTemplate The relative URL for the callback using reserved words to indicate where values should be dynamically substituted such as ``/api/v1/datasets/{datasetId}``.

allowedApiCalls httpMethod Which HTTP method the specified callback uses such as ``GET`` or ``POST``.

allowedApiCalls timeOut For non-public datasets and datafiles, how many minutes the signed URLs given to the tool should be valid for. Must be an integer.

toolName A **name** of an external tool that is used to differentiate between external tools and also used in bundle.properties for localization in the Dataverse installation web interface. For example, the toolName for Data Explorer is ``explorer``. For the Data Curation Tool the toolName is ``dct``. This is an optional parameter in the manifest JSON file.
=========================== ==========
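
The ``allowedApiCalls`` rows above can be turned into a quick sanity check for a manifest before it is registered. The following is a minimal sketch, not part of Dataverse: the function name ``check_allowed_api_calls`` is hypothetical, and treating all four keys as mandatory is an assumption made for the example. Only the integer requirement for ``timeOut`` is stated in the table.

```python
# Keys described in the Terminology table. Treating every one of them as
# required is an assumption made for this sketch.
EXPECTED_KEYS = ("name", "httpMethod", "urlTemplate", "timeOut")


def check_allowed_api_calls(calls):
    """Return a list of human-readable problems found in an allowedApiCalls array."""
    problems = []
    for index, call in enumerate(calls):
        for key in EXPECTED_KEYS:
            if key not in call:
                problems.append(f"entry {index}: missing {key}")
        if "timeOut" in call:
            time_out = call["timeOut"]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(time_out, bool) or not isinstance(time_out, int):
                problems.append(f"entry {index}: timeOut must be an integer")
    return problems


calls = [
    {
        "name": "retrieveDataFile",
        "httpMethod": "GET",
        "urlTemplate": "/api/v1/access/datafile/{fileId}",
        "timeOut": 270,
    }
]
print(check_allowed_api_calls(calls))
```

The function takes the array itself rather than the whole manifest, so it does not depend on where ``allowedApiCalls`` sits in the JSON document.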

@@ -131,6 +143,25 @@ Reserved Words
``{localeCode}`` optional The code for the language ("en" for English, "fr" for French, etc.) that user has selected from the language toggle in a Dataverse installation. See also :ref:`i18n`.
=========================== ========== ===========
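
To make the substitution of reserved words concrete, here is a small sketch of how a template such as ``/api/v1/datasets/{datasetId}`` could be filled in. It only illustrates the idea; ``fill_template`` is a hypothetical helper and does not reflect how Dataverse implements substitution internally.

```python
import re


def fill_template(template, values):
    """Replace every {reservedWord} in template with the matching entry of values."""

    def substitute(match):
        word = match.group(1)
        if word not in values:
            raise KeyError(f"no value supplied for reserved word {{{word}}}")
        return str(values[word])

    return re.sub(r"\{(\w+)\}", substitute, template)


print(fill_template("/api/v1/datasets/{datasetId}", {"datasetId": 42}))
```

Raising an error for an unknown reserved word, instead of leaving the braces in place, is a design choice of this sketch so that a typo in a manifest surfaces early.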

.. _api-exttools-auth:

Authorization Options
+++++++++++++++++++++

When called for datasets or data files that are not public (i.e. in a draft dataset or for a restricted file), external tools are allowed access via the user's credentials. This is accomplished by one of two mechanisms:

* Signed URLs (more secure, recommended)

- Configured via the ``allowedApiCalls`` section of the manifest. The tool will be provided with signed URLs allowing the specified access to the given dataset or datafile for the specified amount of time. The tool will not be able to access any other datasets or files the user may have access to and will not be able to make calls other than those specified.
- For tools invoked via a GET call, Dataverse will include a callback query parameter with a Base64 encoded value. The decoded value is a signed URL that can be called to retrieve a JSON response containing all of the queryParameters and allowedApiCalls specified in the manifest.
- For tools invoked via POST, Dataverse will send a JSON body including the requested queryParameters and allowedApiCalls. Dataverse expects the response to the POST to indicate a redirect which Dataverse will use to open the tool.

* API Token (deprecated, less secure, not recommended)

- Configured via the ``queryParameters`` by including an ``{apiToken}`` value. When this is present Dataverse will send the user's apiToken to the tool. With the user's API token, the tool can perform any action via the Dataverse API that the user could. External tools configured via this method should be assessed for their trustworthiness.
- For tools invoked via GET, this will be done via a query parameter in the request URL which could be cached in the browser's history.
- For tools invoked via POST, Dataverse will send a JSON body including the apiToken. Dataverse expects the response to the POST to indicate a redirect which Dataverse will use to open the tool.
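
For the GET variant of the signed URL mechanism, a tool first has to recover the signed URL from the ``callback`` query parameter. The sketch below shows one way to do that with the Python standard library. It assumes standard Base64 encoding and UTF-8 text, and the launch URL in the demo is made up purely to exercise the function.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlparse


def decode_callback(launch_url):
    """Return the signed URL carried in the Base64-encoded callback parameter."""
    query = parse_qs(urlparse(launch_url).query)
    encoded = query["callback"][0]
    # Restore padding in case it was stripped in transit.
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded).decode("utf-8")


# Placeholder value; a real signed URL is generated by Dataverse.
signed_url = "https://dataverse.example.edu/signed-url-placeholder"
launch_url = "https://fabulousfiletool.com?" + urlencode(
    {"callback": base64.b64encode(signed_url.encode("utf-8")).decode("ascii")}
)
print(decode_callback(launch_url))
```

The tool would then issue a GET request to the decoded URL to obtain the JSON containing its queryParameters and allowedApiCalls.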

Internationalization of Your External Tool
++++++++++++++++++++++++++++++++++++++++++

67 changes: 65 additions & 2 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -2029,6 +2029,24 @@ Archiving is an optional feature that may be configured for a Dataverse installa
curl -H "X-Dataverse-key: $API_TOKEN" -X DELETE "$SERVER_URL/api/datasets/:persistentId/$VERSION/archivalStatus?persistentId=$PERSISTENT_IDENTIFIER"
Get External Tool Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This API call is intended as a callback that can be used by :doc:`/installation/external-tools` to retrieve signed URLs necessary for their interaction with Dataverse.
It can be called directly as well.

The response is a JSON object described in the :doc:`/api/external-tools` section of the API guide.

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV
  export VERSION=1.0
  export TOOL_ID=1

  curl -H "X-Dataverse-key: $API_TOKEN" -H "Accept:application/json" "$SERVER_URL/api/datasets/:persistentId/versions/$VERSION/toolparams/$TOOL_ID?persistentId=$PERSISTENT_IDENTIFIER"

Files
-----
@@ -2689,6 +2707,24 @@ Note the optional "limit" parameter. Without it, the API will attempt to populat

By default, the admin API calls are blocked and can only be called from localhost. See more details in :ref:`:BlockedApiEndpoints <:BlockedApiEndpoints>` and :ref:`:BlockedApiPolicy <:BlockedApiPolicy>` settings in :doc:`/installation/config`.

Get External Tool Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This API call is intended as a callback that can be used by :doc:`/installation/external-tools` to retrieve signed URLs necessary for their interaction with Dataverse.
It can be called directly as well. (Note that the required FILEMETADATA_ID is the "id" returned in the JSON response from the /api/files/$FILE_ID/metadata call.)

The response is a JSON object described in the :doc:`/api/external-tools` section of the API guide.

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export FILE_ID=3
  export FILEMETADATA_ID=1
  export TOOL_ID=1

  curl -H "X-Dataverse-key: $API_TOKEN" -H "Accept:application/json" "$SERVER_URL/api/files/$FILE_ID/metadata/$FILEMETADATA_ID/toolparams/$TOOL_ID"

Users Token Management
----------------------
@@ -4218,6 +4254,33 @@ The fully expanded example above (without environment variables) looks like this
.. code-block:: bash

  curl -X DELETE https://demo.dataverse.org/api/admin/template/24

.. _api-native-signed-url:

Request Signed URL
~~~~~~~~~~~~~~~~~~

Dataverse has the ability to create signed URLs for its API calls.
A signature, which is valid only for the specific API call and only for a specified duration, allows the call to proceed with the authentication of the specified user.
It is intended as an alternative to the use of an API key (which is valid for a long time period and can be used with any API call).
Signed URLs were developed to support External Tools but may be useful in other scenarios where Dataverse or a third-party tool needs to delegate limited access to another user or tool.
This API call allows a Dataverse superUser to generate a signed URL for such scenarios.
The JSON input parameter required is an object with the following keys:

- ``url`` - the exact URL to sign, including api version number and all query parameters
- ``timeOut`` - how long in minutes the signature should be valid for, default is 10 minutes
- ``httpMethod`` - which HTTP method is required, default is GET
- ``user`` - the user identifier for the account associated with this signature, the default is the superuser making the call. The API call will succeed/fail based on whether the specified user has the required permissions.

A curl example allowing access to a dataset's metadata:

.. code-block:: bash

  export SERVER_URL=https://demo.dataverse.org
  export API_KEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export JSON='{"url":"https://demo.dataverse.org/api/v1/datasets/:persistentId/?persistentId=doi:10.5072/FK2/J8SJZB","timeOut":5,"user":"alberteinstein"}'

  curl -H "X-Dataverse-key:$API_KEY" -H 'Content-Type:application/json' -d "$JSON" $SERVER_URL/api/admin/requestSignedUrl

Please see :ref:`dataverse.api.signature-secret` for the configuration option to add a shared secret, enabling extra
security.
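
For callers that prefer not to shell out to curl, the same request can be assembled with the Python standard library. This is a sketch under stated assumptions: ``build_signed_url_request`` is a hypothetical helper, the defaults mirror the ones listed above, and the request is only built, not sent, because the shape of the response is not described here.

```python
import json
import urllib.request


def build_signed_url_request(server_url, api_key, url, time_out=10,
                             http_method="GET", user=None):
    """Build (but do not send) a POST request to /api/admin/requestSignedUrl."""
    payload = {"url": url, "timeOut": time_out, "httpMethod": http_method}
    if user is not None:
        # When omitted, the superuser making the call is used.
        payload["user"] = user
    return urllib.request.Request(
        f"{server_url}/api/admin/requestSignedUrl",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Dataverse-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_signed_url_request(
    "https://demo.dataverse.org",
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "https://demo.dataverse.org/api/v1/datasets/:persistentId/?persistentId=doi:10.5072/FK2/J8SJZB",
    time_out=5,
    user="alberteinstein",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```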
2 changes: 2 additions & 0 deletions doc/sphinx-guides/source/api/search.rst
@@ -35,6 +35,8 @@ show_relevance boolean Whether or not to show details of which fields were ma
show_facets boolean Whether or not to show facets that can be operated on by the "fq" parameter. False by default. See :ref:`advanced search example <advancedsearch-example>`.
fq string A filter query on the search term. Multiple "fq" parameters can be used. See :ref:`advanced search example <advancedsearch-example>`.
show_entity_ids boolean Whether or not to show the database IDs of the search results (for developer use).
geo_point string Latitude and longitude in the form ``geo_point=42.3,-71.1``. You must supply ``geo_radius`` as well. See also :ref:`geospatial-search`.
geo_radius string Radial distance in kilometers from ``geo_point`` (which must be supplied as well) such as ``geo_radius=1.5``.
metadata_fields string Includes the requested fields for each dataset in the response. Multiple "metadata_fields" parameters can be used to include several fields. The value must be in the form "{metadata_block_name}:{field_name}" to include a specific field from a metadata block (see :ref:`example <dynamic-citation-some>`) or "{metadata_field_set_name}:\*" to include all the fields for a metadata block (see :ref:`example <dynamic-citation-all>`). "{field_name}" cannot be a subfield of a compound field. If "{field_name}" is a compound field, all subfields are included.
=============== ======= ===========
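
Because ``geo_point`` and ``geo_radius`` must always be supplied together, it can help to build the query string in one place. The following sketch composes a Search API URL from a latitude, longitude and radius. The helper name ``geo_search_url`` and the range checks are assumptions made for the example, not behavior documented for the API.

```python
from urllib.parse import urlencode


def geo_search_url(server_url, query, latitude, longitude, radius_km):
    """Return a Search API URL restricted to a circle around a point."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be within [-90, 90]")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be within [-180, 180]")
    if radius_km <= 0:
        raise ValueError("radius must be positive")
    params = {
        "q": query,
        # geo_point is "latitude,longitude"; geo_radius is in kilometers.
        "geo_point": f"{latitude},{longitude}",
        "geo_radius": radius_km,
    }
    # Keep the comma readable instead of percent-encoding it.
    return f"{server_url}/api/search?" + urlencode(params, safe=",")


print(geo_search_url("https://demo.dataverse.org", "data", 42.3, -71.1, 1.5))
```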

40 changes: 38 additions & 2 deletions doc/sphinx-guides/source/installation/config.rst
@@ -580,8 +580,7 @@ Optionally, you may provide static credentials for each S3 storage using MicroPr
- ``dataverse.files.<id>.access-key`` for this storage's "access key ID"
- ``dataverse.files.<id>.secret-key`` for this storage's "secret access key"

You may provide the values for these via any of the
`supported config sources <https://docs.payara.fish/community/docs/documentation/microprofile/config/README.html>`_.
You may provide the values for these via any `supported MicroProfile Config API source`_.

**WARNING:**

@@ -1693,6 +1692,39 @@ This setting is useful in cases such as running your Dataverse installation behi
"HTTP_VIA",
"REMOTE_ADDR"
.. _dataverse.api.signature-secret:

dataverse.api.signature-secret
++++++++++++++++++++++++++++++

Context: Dataverse has the ability to create "Signed URLs" for its API calls. Using signed URLs is more secure than
providing API tokens, which are long-lived and give the holder all of the permissions of the user. In contrast, signed URLs
are time limited and only allow the action of the API call in the URL. See :ref:`api-exttools-auth` and
:ref:`api-native-signed-url` for more details.

The key used to sign a URL is created from the API token of the creating user plus a signature-secret provided by an administrator.
**Using a signature-secret is highly recommended.** This setting defaults to an empty string. Using a non-empty
signature-secret makes it impossible for someone who knows an API token to forge signed URLs and provides extra security by
making the overall signing key longer.

Since the signature-secret is sensitive, you should treat it like a password. Here is an example of how to set your shared secret
with the secure method "password alias":

.. code-block:: shell

  echo "AS_ADMIN_ALIASPASSWORD=change-me-super-secret" > /tmp/password.txt
  asadmin create-password-alias --passwordfile /tmp/password.txt dataverse.api.signature-secret
  rm /tmp/password.txt

Can also be set via any `supported MicroProfile Config API source`_, e.g. the environment variable
``DATAVERSE_API_SIGNATURE_SECRET``.

**WARNING:** For security, do not use the sources "environment variable" or "system property" (JVM option) in a
production context! Rely on password alias, secrets directory or cloud based sources instead!
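
Since the secret should be treated like a password, a randomly generated value is preferable to a hand-typed one. One possible way to produce such a value is sketched below with Python's ``secrets`` module. The helper name and the 32-byte length are assumptions for illustration; Dataverse does not prescribe a format or length here.

```python
import secrets


def new_signature_secret(num_bytes=32):
    """Return a random URL-safe string to use as the signature-secret."""
    return secrets.token_urlsafe(num_bytes)


print(new_signature_secret())
```

The printed value could replace ``change-me-super-secret`` in the password alias example above.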



.. _:ApplicationServerSettings:

Application Server Settings
@@ -3090,3 +3122,7 @@ The interval in seconds between Dataverse calls to Globus to check on upload pro
+++++++++++++++++++++++++

A true/false option to add a Globus transfer option to the file download menu which is not yet fully supported in the dataverse-globus app. See :ref:`globus-support` for details.



.. _supported MicroProfile Config API source: https://docs.payara.fish/community/docs/Technical%20Documentation/MicroProfile/Config/Overview.html