Merge branch 'develop' into 9507-show-linked-collections
sekmiller committed Oct 16, 2023
2 parents cf807cc + 3665752 commit 15ac60e
Showing 40 changed files with 1,354 additions and 308 deletions.
12 changes: 12 additions & 0 deletions doc/release-notes/9852-files-api-extension-deaccession.md
@@ -0,0 +1,12 @@
Extended the existing endpoints:

- getVersionFiles (/api/datasets/{id}/versions/{versionId}/files)
- getVersionFileCounts (/api/datasets/{id}/versions/{versionId}/files/counts)

The above endpoints now accept a new boolean optional query parameter "includeDeaccessioned", which, if enabled, causes the endpoint to consider deaccessioned versions when searching for versions to obtain files or file counts.

Additionally, a new endpoint has been developed to support version deaccessioning through the API (given a dataset and a version).

- deaccessionDataset (/api/datasets/{id}/versions/{versionId}/deaccession)

Finally, the DataFile API payload has been extended to add the field "friendlyType".
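
A minimal usage sketch (the server URL, IDs, and token below are placeholders):

```bash
# List files in a version, also considering deaccessioned versions
curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files?includeDeaccessioned=true"

# Deaccession a given version (requires an API token and a JSON body with the deaccession reason)
curl -H "X-Dataverse-key:$API_TOKEN" -X POST "https://demo.dataverse.org/api/datasets/24/versions/1.0/deaccession" \
  -H "Content-type:application/json" --upload-file deaccession.json
```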
11 changes: 11 additions & 0 deletions doc/release-notes/9907-files-api-counts-with-criteria.md
@@ -0,0 +1,11 @@
Extended the getVersionFileCounts endpoint (/api/datasets/{id}/versions/{versionId}/files/counts) to support filtering by criteria.

In particular, the endpoint now accepts the following optional criteria query parameters:

- contentType
- accessStatus
- categoryName
- tabularTagName
- searchText

These filtering criteria are the same as those for the getVersionFiles endpoint.
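
For example (a sketch with placeholder server and IDs):

```bash
curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts?contentType=image/png&accessStatus=Public"
```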
@@ -0,0 +1,9 @@
Added a new optional query parameter "mode" to the "getDownloadSize" API endpoint ("api/datasets/{identifier}/versions/{versionId}/downloadsize").

This parameter applies filter criteria to the operation and supports the following values:

- All (Default): Includes both archival and original sizes for tabular files

- Archival: Includes only the archival size for tabular files

- Original: Includes only the original size for tabular files
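
For example (a sketch with placeholder server and IDs):

```bash
curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/downloadsize?mode=Original"
```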
119 changes: 118 additions & 1 deletion doc/sphinx-guides/source/api/native-api.rst
@@ -1034,7 +1034,17 @@ Usage example:
Please note that both filtering and ordering criteria values are case sensitive and must be correctly typed for the endpoint to recognize them.

Keep in mind that you can combine all of the above query params depending on the results you are looking for.
By default, deaccessioned dataset versions are not included in the search when applying the ``:latest`` or ``:latest-published`` identifiers. Additionally, when filtering by a specific version tag, you will get a "not found" error if the version is deaccessioned and you do not enable the ``includeDeaccessioned`` option described below.

If you want to include deaccessioned dataset versions, you must set the ``includeDeaccessioned`` query parameter to ``true``.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files?includeDeaccessioned=true"

.. note:: Keep in mind that you can combine all of the above query params depending on the results you are looking for.

Get File Counts in a Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1046,6 +1056,7 @@ The returned file counts are based on different criteria:
- Total (The total file count)
- Per content type
- Per category name
- Per tabular tag name
- Per access status (Possible values: Public, Restricted, EmbargoedThenRestricted, EmbargoedThenPublic)

.. code-block:: bash
@@ -1062,6 +1073,67 @@ The fully expanded example above (without environment variables) looks like this
curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts"
Category name filtering is optionally supported. To return counts only for files to which the requested category has been added.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts?categoryName=Data"

Tabular tag name filtering is also optionally supported; when used, counts are returned only for files to which the requested tabular tag has been added.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts?tabularTagName=Survey"

Content type filtering is also optionally supported; when used, counts are returned only for files matching the requested content type.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts?contentType=image/png"

Filtering by search text is also optionally supported. The search is applied to the labels and descriptions of the dataset files, returning counts only for files that contain the searched text in one of those fields.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts?searchText=word"

File access filtering is also optionally supported. In particular, the following values are possible:

* ``Public``
* ``Restricted``
* ``EmbargoedThenRestricted``
* ``EmbargoedThenPublic``

If no filter is specified, the files will match all of the above categories.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts?accessStatus=Public"

By default, deaccessioned dataset versions are not supported by this endpoint and will be ignored in the search when applying the ``:latest`` or ``:latest-published`` identifiers. Additionally, when filtering by a specific version tag, you will get a "not found" error if the version is deaccessioned and you do not enable the option described below.

If you want to include deaccessioned dataset versions, you must specify this through the ``includeDeaccessioned`` query parameter.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts?includeDeaccessioned=true"

Please note that filtering values are case sensitive and must be correctly typed for the endpoint to recognize them.

Keep in mind that you can combine all of the above query params depending on the results you are looking for.
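
For instance, several of these parameters can be combined in one request (a sketch with placeholder values):

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files/counts?contentType=image/png&categoryName=Data&includeDeaccessioned=true"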

View Dataset Files and Folders as a Directory Index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -1358,6 +1430,39 @@ The fully expanded example above (without environment variables) looks like this
curl -H "X-Dataverse-key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X DELETE "https://demo.dataverse.org/api/datasets/24/versions/:draft"
Deaccession Dataset
~~~~~~~~~~~~~~~~~~~

Given a version of a dataset, this endpoint updates its status to deaccessioned.

The JSON body required to deaccession a dataset (``deaccession.json``) looks like this::

    {
      "deaccessionReason": "Description of the deaccession reason.",
      "deaccessionForwardURL": "https://demo.dataverse.org"
    }


Note that the field ``deaccessionForwardURL`` is optional.

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export ID=24
  export VERSIONID=1.0
  export FILE_PATH=deaccession.json

  curl -H "X-Dataverse-key:$API_TOKEN" -X POST "$SERVER_URL/api/datasets/$ID/versions/$VERSIONID/deaccession" -H "Content-type:application/json" --upload-file $FILE_PATH

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST "https://demo.dataverse.org/api/datasets/24/versions/1.0/deaccession" -H "Content-type:application/json" --upload-file deaccession.json

.. note:: You cannot deaccession a dataset more than once. If you call this endpoint twice for the same dataset version, you will get a "not found" error on the second call, because the version is no longer published once it has been deaccessioned.

Set Citation Date Field Type for a Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -1771,6 +1876,18 @@ The fully expanded example above (without environment variables) looks like this
The size of all files available for download will be returned.
If ``:draft`` is passed as versionId, the supplied token must have permission to view unpublished drafts. A token is not required for published datasets. Also, restricted files will be included in this total regardless of whether the user has access to download them.
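
Usage example for the basic call (a sketch; the server, dataset ID, and token are placeholders):

.. code-block:: bash

  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/datasets/24/versions/1.0/downloadsize"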

There is an optional query parameter ``mode`` which applies filter criteria to the operation. This parameter supports the following values:

* ``All`` (Default): Includes both archival and original sizes for tabular files
* ``Archival``: Includes only the archival size for tabular files
* ``Original``: Includes only the original size for tabular files

Usage example:

.. code-block:: bash

  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/datasets/24/versions/1.0/downloadsize?mode=Archival"

Submit a Dataset for Review
~~~~~~~~~~~~~~~~~~~~~~~~~~~

51 changes: 40 additions & 11 deletions doc/sphinx-guides/source/installation/config.rst
@@ -1276,6 +1278,8 @@ The list below depicts a set of tools that can be used to ease the amount of wor

- `easyTranslationHelper <https://github.com/universidadeaveiro/easyTranslationHelper>`_, a tool developed by `University of Aveiro <https://www.ua.pt/>`_.

- `Dataverse General User Interface Translation Guide for Weblate <https://doi.org/10.5281/zenodo.4807371>`_, a guide produced as part of the `SSHOC Dataverse Translation <https://www.sshopencloud.eu/news/workshop-notes-sshoc-dataverse-translation-follow-event/>`_ event.

.. _Web-Analytics-Code:

Web Analytics Code
@@ -1771,8 +1773,8 @@ protocol, host, and port number and should not include a trailing slash.
dataverse.files.directory
+++++++++++++++++++++++++

Please provide an absolute path to a directory backed by some mounted file system. This directory is used for a number
of purposes:
Providing an explicit location here makes it easier to reuse some mounted filesystem, and we recommend doing so
to avoid filled-up disks, aid performance, etc. This directory is used for a number of purposes:

1. ``<dataverse.files.directory>/temp`` after uploading, data is temporarily stored here for ingest and/or before
shipping to the final storage destination.
@@ -1785,24 +1787,51 @@ of purposes:
under certain conditions. This directory may also be used by file stores for :ref:`permanent file storage <storage-files-dir>`,
but this is controlled by other, store-specific settings.

Defaults to ``/tmp/dataverse``. Can also be set via *MicroProfile Config API* sources, e.g. the environment variable
``DATAVERSE_FILES_DIRECTORY``. Defaults to ``${STORAGE_DIR}`` for profile ``ct``, important for the
:ref:`Dataverse Application Image <app-locations>`.
Notes:

- Please provide an absolute path to a directory backed by some mounted file system.
- Can also be set via *MicroProfile Config API* sources, e.g. the environment variable ``DATAVERSE_FILES_DIRECTORY``.
- Defaults to ``/tmp/dataverse`` in a :doc:`default installation <installation-main>`.
- Defaults to ``${STORAGE_DIR}`` using our :ref:`Dataverse container <app-locations>` (resolving to ``/dv``).
- During startup, this directory will be checked for existence and write access. It will be created for you
if missing. If it cannot be created or does not have proper write access, application deployment will fail.
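
For example, one way to set this through an environment variable (a sketch; the path shown is just an example):

.. code-block:: bash

  export DATAVERSE_FILES_DIRECTORY=/mnt/dataverse-files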

.. _dataverse.files.uploads:

dataverse.files.uploads
+++++++++++++++++++++++

Configure a folder to store the incoming file stream during uploads (before transferring to `${dataverse.files.directory}/temp`).
Configure a folder to store the incoming file stream during uploads (before transferring to ``${dataverse.files.directory}/temp``).
Providing an explicit location here makes it easier to reuse some mounted filesystem.
Please also see :ref:`temporary-file-storage` for more details.
You can use an absolute path or a relative one, which is resolved against the application server domain directory.

Defaults to ``./uploads``, which resolves to ``/usr/local/payara6/glassfish/domains/domain1/uploads`` in a default
installation.
Notes:

- Please provide an absolute path to a directory backed by some mounted file system.
- Defaults to ``${com.sun.aas.instanceRoot}/uploads`` in a :doc:`default installation <installation-main>`
(resolving to ``/usr/local/payara6/glassfish/domains/domain1/uploads``).
- Defaults to ``${STORAGE_DIR}/uploads`` using our :ref:`Dataverse container <app-locations>` (resolving to ``/dv/uploads``).
- Can also be set via *MicroProfile Config API* sources, e.g. the environment variable ``DATAVERSE_FILES_UPLOADS``.
- During startup, this directory will be checked for existence and write access. It will be created for you
if missing. If it cannot be created or does not have proper write access, application deployment will fail.

.. _dataverse.files.docroot:

dataverse.files.docroot
+++++++++++++++++++++++

Configure a folder to store and retrieve additional materials like user uploaded collection logos, generated sitemaps,
and so on. Providing an explicit location here makes it easier to reuse some mounted filesystem.
See also logo customization above.

Notes:

Can also be set via *MicroProfile Config API* sources, e.g. the environment variable ``DATAVERSE_FILES_UPLOADS``.
Defaults to ``${STORAGE_DIR}/uploads`` for profile ``ct``, important for the :ref:`Dataverse Application Image <app-locations>`.
- Defaults to ``${com.sun.aas.instanceRoot}/docroot`` in a :doc:`default installation <installation-main>`
(resolves to ``/usr/local/payara6/glassfish/domains/domain1/docroot``).
- Defaults to ``${STORAGE_DIR}/docroot`` using our :ref:`Dataverse container <app-locations>` (resolving to ``/dv/docroot``).
- Can also be set via *MicroProfile Config API* sources, e.g. the environment variable ``DATAVERSE_FILES_DOCROOT``.
- During startup, this directory will be checked for existence and write access. It will be created for you
if missing. If it cannot be created or does not have proper write access, application deployment will fail.

dataverse.auth.password-reset-timeout-in-minutes
++++++++++++++++++++++++++++++++++++++++++++++++
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/user/dataverse-management.rst
@@ -214,7 +214,7 @@ Dataset linking allows a Dataverse collection owner to "link" their Dataverse co

For example, researchers working on a collaborative study across institutions can each link their own individual institutional Dataverse collections to the one collaborative dataset, making it easier for interested parties from each institution to find the study.

In order to link a dataset, you will need your account to have the "Add Dataset" permission on the Dataverse collection that is doing the linking. If you created the Dataverse collection then you should have this permission already, but if not then you will need to ask the admin of that Dataverse collection to assign that permission to your account. You do not need any special permissions on the dataset being linked.
In order to link a dataset, you will need your account to have the "Publish Dataset" permission on the Dataverse collection that is doing the linking. If you created the Dataverse collection then you should have this permission already, but if not then you will need to ask the admin of that Dataverse collection to assign that permission to your account. You do not need any special permissions on the dataset being linked.

To link a dataset to your Dataverse collection, you must navigate to that dataset and click the white "Link" button in the upper-right corner of the dataset page. This will open up a window where you can type in the name of the Dataverse collection that you would like to link the dataset to. Select your Dataverse collection and click the save button. This will establish the link, and the dataset will now appear under your Dataverse collection.

12 changes: 12 additions & 0 deletions docker-compose-dev.yml
@@ -29,6 +29,7 @@ services:
depends_on:
- dev_postgres
- dev_solr
- dev_dv_initializer
volumes:
- ./docker-dev-volumes/app/data:/dv
- ./docker-dev-volumes/app/secrets:/secrets
@@ -52,6 +53,17 @@ services:
networks:
- dataverse

dev_dv_initializer:
container_name: "dev_dv_initializer"
image: gdcc/configbaker:unstable
restart: "no"
command:
- sh
- -c
- "fix-fs-perms.sh dv"
volumes:
- ./docker-dev-volumes/app/data:/dv

dev_postgres:
container_name: "dev_postgres"
hostname: postgres
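With this change, bringing up the dev stack (a sketch, assuming Docker Compose v2) starts the initializer container ahead of the application container via `depends_on`:

```bash
docker compose -f docker-compose-dev.yml up
```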
5 changes: 5 additions & 0 deletions src/main/docker/Dockerfile
@@ -29,6 +29,11 @@ FROM $BASE_IMAGE
# See also https://download.eclipse.org/microprofile/microprofile-config-3.0/microprofile-config-spec-3.0.html#configprofile
ENV MP_CONFIG_PROFILE=ct

# Workaround to configure upload directories by default to useful place until we can have variable lookups in
# defaults for glassfish-web.xml and other places.
ENV DATAVERSE_FILES_UPLOADS="${STORAGE_DIR}/uploads"
ENV DATAVERSE_FILES_DOCROOT="${STORAGE_DIR}/docroot"

# Copy app and deps from assembly in proper layers
COPY --chown=payara:payara maven/deps ${DEPLOY_DIR}/dataverse/WEB-INF/lib/
COPY --chown=payara:payara maven/app ${DEPLOY_DIR}/dataverse/
@@ -788,13 +788,13 @@ public void exportDataset(Dataset dataset, boolean forceReExport) {
}
}
}

}

//get a string to add to save success message
//depends on page (dataset/file) and user privileges
public String getReminderString(Dataset dataset, boolean canPublishDataset, boolean filePage, boolean isValid) {

String reminderString;

if (canPublishDataset) {