Merge branch 'develop' into 8243-improve-language-controlled-vocab
landreev committed May 15, 2024
2 parents a84d577 + da3dd95 commit 93fb1f5
Showing 126 changed files with 3,611 additions and 799 deletions.
101 changes: 101 additions & 0 deletions .github/workflows/maven_cache_management.yml
@@ -0,0 +1,101 @@
name: Maven Cache Management

on:
  # Every push to develop should trigger cache rejuvenation (dependencies might have changed)
  push:
    branches:
      - develop
  # According to https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy
  # all caches are deleted after 7 days of no access. Make sure we rejuvenate every 7 days to keep it available.
  schedule:
    - cron: '23 2 * * 0' # Run for 'develop' every Sunday at 02:23 UTC (3:23 CET, 21:23 ET)
  # Enable manual cache management
  workflow_dispatch:
  # Delete branch caches once a PR is merged
  pull_request:
    types:
      - closed

env:
  COMMON_CACHE_KEY: "dataverse-maven-cache"
  COMMON_CACHE_PATH: "~/.m2/repository"

jobs:
  seed:
    name: Drop and Re-Seed Local Repository
    runs-on: ubuntu-latest
    if: ${{ github.event_name != 'pull_request' }}
    permissions:
      # Write permission needed to delete caches
      # See also: https://docs.github.com/en/rest/actions/cache?apiVersion=2022-11-28#delete-a-github-actions-cache-for-a-repository-using-a-cache-id
      actions: write
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Determine Java version from Parent POM
        run: echo "JAVA_VERSION=$(grep '<target.java.version>' modules/dataverse-parent/pom.xml | cut -f2 -d'>' | cut -f1 -d'<')" >> ${GITHUB_ENV}
      - name: Set up JDK ${{ env.JAVA_VERSION }}
        uses: actions/setup-java@v4
        with:
          java-version: ${{ env.JAVA_VERSION }}
          distribution: temurin
      - name: Seed common cache
        run: |
          mvn -B -f modules/dataverse-parent dependency:go-offline dependency:resolve-plugins
      # This non-obvious order is due to the fact that the download via Maven above will take a very long time (7-8 min).
      # Jobs should not be left without a cache. Deleting and saving in one go leaves only a small chance for a cache miss.
      - name: Drop common cache
        run: |
          gh extension install actions/gh-actions-cache
          echo "🛒 Fetching list of cache keys"
          cacheKeys=$(gh actions-cache list -R ${{ github.repository }} -B develop | cut -f 1 )
          ## Setting this to not fail the workflow while deleting cache keys.
          set +e
          echo "🗑️ Deleting caches..."
          for cacheKey in $cacheKeys
          do
              gh actions-cache delete $cacheKey -R ${{ github.repository }} -B develop --confirm
          done
          echo "✅ Done"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      - name: Save the common cache
        uses: actions/cache@v4
        with:
          path: ${{ env.COMMON_CACHE_PATH }}
          key: ${{ env.COMMON_CACHE_KEY }}
          enableCrossOsArchive: true

  # Let's delete feature branch caches once their PR is merged - we only have 10 GB of space before eviction kicks in
  deplete:
    name: Deplete feature branch caches
    runs-on: ubuntu-latest
    if: ${{ github.event_name == 'pull_request' }}
    permissions:
      # `actions:write` permission is required to delete caches
      # See also: https://docs.github.com/en/rest/actions/cache?apiVersion=2022-11-28#delete-a-github-actions-cache-for-a-repository-using-a-cache-id
      actions: write
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Cleanup caches
        run: |
          gh extension install actions/gh-actions-cache
          BRANCH=refs/pull/${{ github.event.pull_request.number }}/merge
          echo "🛒 Fetching list of cache keys"
          cacheKeysForPR=$(gh actions-cache list -R ${{ github.repository }} -B $BRANCH | cut -f 1 )
          ## Setting this to not fail the workflow while deleting cache keys.
          set +e
          echo "🗑️ Deleting caches..."
          for cacheKey in $cacheKeysForPR
          do
              gh actions-cache delete $cacheKey -R ${{ github.repository }} -B $BRANCH --confirm
          done
          echo "✅ Done"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
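
Because the workflow also declares `workflow_dispatch`, the cache can be rejuvenated on demand. A minimal sketch using the GitHub CLI, assuming an authenticated `gh` with access to the repository:

```
# Manually trigger the cache management workflow against develop
gh workflow run maven_cache_management.yml --ref develop

# Follow the run that was just started
gh run watch "$(gh run list --workflow=maven_cache_management.yml --limit 1 --json databaseId --jq '.[0].databaseId')"
```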
3 changes: 2 additions & 1 deletion conf/solr/9.3.0/schema.xml
@@ -157,7 +157,8 @@
<field name="publicationStatus" type="string" stored="true" indexed="true" multiValued="true"/>
<field name="externalStatus" type="string" stored="true" indexed="true" multiValued="false"/>
<field name="embargoEndDate" type="plong" stored="true" indexed="true" multiValued="false"/>

<field name="retentionEndDate" type="plong" stored="true" indexed="true" multiValued="false"/>

<field name="subtreePaths" type="string" stored="true" indexed="true" multiValued="true"/>

<field name="fileName" type="text_en" stored="true" indexed="true" multiValued="true"/>
10 changes: 10 additions & 0 deletions doc/release-notes/10015-RO-Crate-metadata-file.md
@@ -0,0 +1,10 @@
MIME type detection can now match on full filenames (with extension), which enables detection of RO-Crate metadata files.

Filenames with extensions can now be added to the `MimeTypeDetectionByFileName.properties` file; entries there take precedence over detection by extension alone. For example, two new filenames have been added to that file:
```
ro-crate-metadata.json=application/ld+json; profile="http://www.w3.org/ns/json-ld#flattened http://www.w3.org/ns/json-ld#compacted https://w3id.org/ro/crate"
ro-crate-metadata.jsonld=application/ld+json; profile="http://www.w3.org/ns/json-ld#flattened http://www.w3.org/ns/json-ld#compacted https://w3id.org/ro/crate"
```

As a result, files named `ro-crate-metadata.json` are now detected as RO-Crate metadata files instead of generic `JSON` files.
For more information on the RO-Crate specification, see https://www.researchobject.org/ro-crate.
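
To apply the new detection to files uploaded before this change, the existing file redetect endpoint can be used. A minimal sketch, assuming a file id of 42 and the usual `$SERVER_URL`/`$API_TOKEN` placeholders:

```
# Re-run MIME type detection for an already-uploaded file (id 42 is a placeholder)
export SERVER_URL=http://localhost:8080
export FILE_ID=42

# Dry run: report what the detected type would be without changing anything
curl -H "X-Dataverse-key:$API_TOKEN" -X POST "$SERVER_URL/api/files/$FILE_ID/redetect?dryRun=true"

# Apply the redetected type
curl -H "X-Dataverse-key:$API_TOKEN" -X POST "$SERVER_URL/api/files/$FILE_ID/redetect?dryRun=false"
```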
5 changes: 5 additions & 0 deletions doc/release-notes/10022_upload_redirect_without_tagging.md
@@ -0,0 +1,5 @@
If your S3 store does not support tagging and returns an error when direct uploads are configured, you can disable tagging by using the ``dataverse.files.<id>.disable-tagging`` JVM option. For more details, see https://dataverse-guide--10029.org.readthedocs.build/en/10029/developers/big-data-support.html#s3-tags, #10022, and #10029.

## New config options

- dataverse.files.<id>.disable-tagging
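
As a sketch (assuming a store with id `s3`; any other `dataverse.files.*` MicroProfile Config source would work as well), the option can be set via Payara's `asadmin`:

```
# Hypothetical example: disable tagging for a direct-upload S3 store with id "s3".
# Adjust the store id and the path to asadmin for your installation.
./asadmin create-jvm-options "-Ddataverse.files.s3.disable-tagging=true"
# Restart Payara for the new JVM option to take effect.
```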
@@ -0,0 +1 @@
Fixed a bug where the ``incomplete metadata`` label was shown for published datasets with incomplete metadata in certain scenarios. The label is now shown for draft versions of such datasets and for published datasets that the user can edit. It can also be hidden for published datasets (regardless of edit rights) by setting the new option ``dataverse.ui.show-validity-label-when-published`` to `false`.
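
A minimal sketch of hiding the label on published datasets, assuming the option is supplied as a JVM option via `asadmin` (any other MicroProfile Config source would also work):

```
# Hypothetical example: hide the "incomplete metadata" label on published datasets.
./asadmin create-jvm-options "-Ddataverse.ui.show-validity-label-when-published=false"
```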
1 change: 1 addition & 0 deletions doc/release-notes/10242-add-feature-dv-api
@@ -0,0 +1 @@
New API endpoints have been added to allow you to add or remove featured collections from a Dataverse collection.
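
The guides are authoritative for the exact paths and payloads; as a hedged sketch, the calls take roughly the form below (server URL, collection alias, and payload are placeholders):

```
# Hypothetical sketch of the featured-collections endpoints; consult the API guide
# for the authoritative paths and request bodies.
export SERVER_URL=http://localhost:8080
export ID=root

# Feature one or more subcollections, by alias
curl -H "X-Dataverse-key:$API_TOKEN" -X POST "$SERVER_URL/api/dataverses/$ID/featured" \
  -H "Content-Type: application/json" -d '["subcollectionAlias"]'

# Remove the featured collections
curl -H "X-Dataverse-key:$API_TOKEN" -X DELETE "$SERVER_URL/api/dataverses/$ID/featured"
```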
5 changes: 5 additions & 0 deletions doc/release-notes/10316_cvoc_http_headers.md
@@ -0,0 +1,5 @@
You can now add HTTP request headers required by the external vocabulary services you are using.

Combined documentation can be found in pull request [#10404](https://github.com/IQSS/dataverse/pull/10404).

For more information, see issue [#10316](https://github.com/IQSS/dataverse/issues/10316) and pull request [gdcc/dataverse-external-vocab-support#19](https://github.com/gdcc/dataverse-external-vocab-support/pull/19).
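
Purely as an illustrative sketch (the `headers` key and the other field names below are assumptions, not the documented format; the linked pull requests are authoritative), the configuration is loaded through the :CVocConf database setting:

```
# Hypothetical: the JSON field names below are illustrative assumptions only.
# The :CVocConf setting holds a JSON array of external vocabulary configurations;
# each entry could carry the HTTP headers to send, e.g.
#   [ { "field-name": "keyword", "headers": { "Authorization": "Bearer <token>" } } ]
curl -X PUT --upload-file cvoc-conf.json "http://localhost:8080/api/admin/settings/:CVocConf"
```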
@@ -0,0 +1,4 @@
The following have been optimized for scenarios involving API calls on large datasets (with numerous files, for example ~10,000):

- The Search API endpoint.
- The permission-checking logic in PermissionServiceBean.
3 changes: 3 additions & 0 deletions doc/release-notes/10425-add-MIT-License.md
@@ -0,0 +1,3 @@
A new file, licenseMIT.json, has been added for importing the MIT License into Dataverse.

Documentation explaining the procedure for adding new licenses has been added to the guides.
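
For reference, a sketch of loading the new license file through the licenses API (a superuser API token is assumed; the guides describe the documented procedure):

```
# Hypothetical example: add the MIT License via the licenses API (superuser token required).
curl -X POST -H "X-Dataverse-key:$API_TOKEN" -H "Content-Type: application/json" \
  "http://localhost:8080/api/licenses" --upload-file licenseMIT.json
```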
@@ -0,0 +1,3 @@
The ``api/dataverses/{id}/metadatablocks`` endpoint has been changed so that, when the query parameter ``onlyDisplayedOnCreate=true`` is set, it also returns metadata blocks whose dataset field type input levels are configured as required on the General Information page of the collection, in addition to the metadata blocks and fields with the property ``displayOnCreate=true`` (the original behavior).

A new endpoint ``api/dataverses/{id}/inputLevels`` has been created for updating the dataset field type input levels of a collection via API.
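
A hedged sketch of a call to the new endpoint (the JSON field names below are assumptions for illustration; see the API guide for the documented payload):

```
# Hypothetical example: update dataset field type input levels for a collection.
# The JSON field names are illustrative assumptions.
export SERVER_URL=http://localhost:8080
export ID=root

curl -H "X-Dataverse-key:$API_TOKEN" -X PUT "$SERVER_URL/api/dataverses/$ID/inputLevels" \
  -H "Content-Type: application/json" \
  -d '[{"datasetFieldTypeName": "geographicCoverage", "required": true, "include": true}]'
```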
@@ -0,0 +1,22 @@
The Dataverse object returned by /api/dataverses has been extended to include "isReleased": {boolean}.
```javascript
{
  "status": "OK",
  "data": {
    "id": 32,
    "alias": "dv6f645bb5",
    "name": "dv6f645bb5",
    "dataverseContacts": [
      {
        "displayOrder": 0,
        "contactEmail": "[email protected]"
      }
    ],
    "permissionRoot": true,
    "dataverseType": "UNCATEGORIZED",
    "ownerId": 1,
    "creationDate": "2024-04-12T18:05:59Z",
    "isReleased": true
  }
}
```
12 changes: 8 additions & 4 deletions doc/release-notes/6.2-release-notes.md
@@ -417,12 +417,16 @@ In the following commands we assume that Payara 6 is installed in `/usr/local/pa

As noted above, deployment of the war file might take several minutes due to a database migration script required for the new storage quotas feature.

6\. Restart Payara
6\. For installations with internationalization:

- Please remember to update translations via [Dataverse language packs](https://github.com/GlobalDataverseCommunityConsortium/dataverse-language-packs).

7\. Restart Payara

- `service payara stop`
- `service payara start`

7\. Update the following Metadata Blocks to reflect the incremental improvements made to the handling of core metadata fields:
8\. Update the following Metadata Blocks to reflect the incremental improvements made to the handling of core metadata fields:

```
wget https://github.com/IQSS/dataverse/releases/download/v6.2/geospatial.tsv
@@ -442,7 +446,7 @@
curl http://localhost:8080/api/admin/datasetfield/load -H "Content-type: text/tab-separated-values" -X POST --upload-file scripts/api/data/metadatablocks/biomedical.tsv
```

8\. For installations with custom or experimental metadata blocks:
9\. For installations with custom or experimental metadata blocks:

- Stop Solr instance (usually `service solr stop`, depending on Solr installation/OS, see the [Installation Guide](https://guides.dataverse.org/en/6.2/installation/prerequisites.html#solr-init-script))

@@ -455,7 +459,7 @@ curl http://localhost:8080/api/admin/datasetfield/load -H "Content-type: text/ta
- Restart Solr instance (usually `service solr restart` depending on solr/OS)

9\. Reindex Solr:
10\. Reindex Solr:

For details, see https://guides.dataverse.org/en/6.2/admin/solr-search-index.html but here is the reindex command:

12 changes: 12 additions & 0 deletions doc/release-notes/8655-re-add-cell-counting-biomedical-tsv.md
@@ -0,0 +1,12 @@
## Release Highlights

### Life Science Metadata

The value `cell counting`, accidentally removed in `v5.1`, has been re-added to the Life Science metadata block's Measurement Type vocabulary.

## Upgrade Instructions

### Update the Life Science metadata block

- `wget https://github.com/IQSS/dataverse/releases/download/v6.3/biomedical.tsv`
- `curl http://localhost:8080/api/admin/datasetfield/load -X POST --data-binary @biomedical.tsv -H "Content-type: text/tab-separated-values"`
11 changes: 11 additions & 0 deletions doc/release-notes/8936-more-than-50000-entries-in-sitemap.md
@@ -0,0 +1,11 @@
Dataverse can now handle more than 50,000 items when generating sitemap files, splitting the content across multiple files to comply with the Sitemap protocol.

For details see https://dataverse-guide--10321.org.readthedocs.build/en/10321/installation/config.html#creating-a-sitemap-and-submitting-it-to-search-engines #8936 and #10321.

## Upgrade instructions

If your installation has more than 50,000 entries, you should re-submit your sitemap URL to Google or other search engines. The file in the URL will change from ``sitemap.xml`` to ``sitemap_index.xml``.

As explained at https://dataverse-guide--10321.org.readthedocs.build/en/10321/installation/config.html#creating-a-sitemap-and-submitting-it-to-search-engines, this is the command for regenerating your sitemap:

`curl -X POST http://localhost:8080/api/admin/sitemap`
8 changes: 8 additions & 0 deletions doc/release-notes/9375-retention-period.md
@@ -0,0 +1,8 @@
The Dataverse Software now supports file-level retention periods. The ability to set retention periods, with a minimum duration (in months), can be configured by a Dataverse installation administrator. For more information, see the [Retention Periods section](https://guides.dataverse.org/en/6.3/user/dataset-management.html#retention-periods) of the Dataverse Software Guides.

- Users can configure a specific retention period, defined by an end date and a short reason, on a set of selected files or an individual file, by selecting the 'Retention Period' menu item and entering information in a popup dialog. Retention Periods can only be set, changed, or removed before a file has been published. After publication, only Dataverse installation administrators can make changes, using an API.

- After the retention period expires, files cannot be previewed or downloaded (as if restricted, with no option to allow access requests). The file (landing) page and all of the metadata remain available.


Note: a Solr schema update (the new `retentionEndDate` field) is needed for this feature.
1 change: 1 addition & 0 deletions doc/release-notes/9887-new-superuser-status-endpoint.md
@@ -0,0 +1 @@
The existing API endpoint for toggling the superuser status of a user has been deprecated in favor of a new API endpoint that allows you to explicitly and idempotently set the status as true or false. For details, see [the guides](https://dataverse-guide--10440.org.readthedocs.build/en/10440/api/native-api.html), #9887 and #10440.
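
A minimal sketch of the new call (the admin API is typically restricted to localhost or governed by :BlockedApiPolicy; the identifier is a placeholder):

```
# Hypothetical example: explicitly set (rather than toggle) superuser status for "jdoe".
export SERVER_URL=http://localhost:8080

# Grant superuser status
curl -X PUT "$SERVER_URL/api/admin/superuser/jdoe" -d true

# Revoke superuser status
curl -X PUT "$SERVER_URL/api/admin/superuser/jdoe" -d false
```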
6 changes: 4 additions & 2 deletions doc/sphinx-guides/source/admin/metadatacustomization.rst
@@ -552,6 +552,8 @@ Great care must be taken when reloading a metadata block. Matching is done on fi

The ability to reload metadata blocks means that SQL update scripts don't need to be written for these changes. See also the :doc:`/developers/sql-upgrade-scripts` section of the Developer Guide.

.. _using-external-vocabulary-services:

Using External Vocabulary Services
----------------------------------

@@ -577,9 +579,9 @@ In general, the external vocabulary support mechanism may be a better choice for
The specifics of the user interface for entering/selecting a vocabulary term and how that term is then displayed are managed by third-party Javascripts. The initial Javascripts that have been created provide auto-completion, displaying a list of choices that match what the user has typed so far, but other interfaces, such as displaying a tree of options for a hierarchical vocabulary, are possible.
Similarly, existing scripts do relatively simple things for displaying a term - showing the term's name in the appropriate language and providing a link to an external URL with more information, but more sophisticated displays are possible.

Scripts supporting use of vocabularies from services supporting the SKOMOS protocol (see https://skosmos.org) and retrieving ORCIDs (from https://orcid.org) are available https://github.com/gdcc/dataverse-external-vocab-support. (Custom scripts can also be used and community members are encouraged to share new scripts through the dataverse-external-vocab-support repository.)
Scripts supporting use of vocabularies from services supporting the SKOSMOS protocol (see https://skosmos.org), retrieving ORCIDs (from https://orcid.org), and using ROR (https://ror.org/) are available at https://github.com/gdcc/dataverse-external-vocab-support. (Custom scripts can also be used and community members are encouraged to share new scripts through the dataverse-external-vocab-support repository.)

Configuration involves specifying which fields are to be mapped, whether free-text entries are allowed, which vocabulary(ies) should be used, what languages those vocabulary(ies) are available in, and several service protocol and service instance specific parameters.
Configuration involves specifying which fields are to be mapped, whether free-text entries are allowed, which vocabulary(ies) should be used, what languages those vocabulary(ies) are available in, and several service protocol and service instance specific parameters, including the ability to send HTTP headers on calls to the service.
These are all defined in the :ref:`:CVocConf <:CVocConf>` setting as a JSON array. Details about the required elements as well as example JSON arrays are available at https://github.com/gdcc/dataverse-external-vocab-support, along with an example metadata block that can be used for testing.
The scripts required can be hosted locally or retrieved dynamically from https://gdcc.github.io/ (similar to how dataverse-previewers work).

5 changes: 5 additions & 0 deletions doc/sphinx-guides/source/api/changelog.rst
@@ -7,6 +7,11 @@ This API changelog is experimental and we would love feedback on its usefulness.
:local:
:depth: 1

v6.3
----

- **/api/admin/superuser/{identifier}**: The POST endpoint that toggles superuser status has been deprecated in favor of a new PUT endpoint that allows you to specify true or false. See :ref:`set-superuser-status`.

v6.2
----
