Skip to content

Commit

Permalink
Merge branch 'IQSS:develop' into IQSS/8417-display-of-dataset-creator…
Browse files Browse the repository at this point in the history
…-permission
  • Loading branch information
haarli authored Feb 11, 2022
2 parents 1cc1f2a + 4a63369 commit 3adc129
Show file tree
Hide file tree
Showing 214 changed files with 9,128 additions and 2,985 deletions.
17 changes: 10 additions & 7 deletions .github/workflows/maven_unit_test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,16 +10,19 @@ on:

jobs:
unittest:
name: (JDK ${{ matrix.jdk }} / ${{ matrix.os }}) Unit Tests
name: (${{ matrix.status}} / JDK ${{ matrix.jdk }}) Unit Tests
strategy:
fail-fast: false
matrix:
os: [ ubuntu-latest ]
jdk: [ '11' ]
#include:
# - os: ubuntu-latest
# jdk: '16'
runs-on: ${{ matrix.os }}
experimental: [false]
status: ["Stable"]
include:
- jdk: '17'
experimental: true
status: "Experimental"
continue-on-error: ${{ matrix.experimental }}
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Set up JDK ${{ matrix.jdk }}
Expand All @@ -34,7 +37,7 @@ jobs:
key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}
restore-keys: ${{ runner.os }}-m2
- name: Build with Maven
run: mvn -DcompilerArgument=-Xlint:unchecked -P all-unit-tests clean test
run: mvn -DcompilerArgument=-Xlint:unchecked -Dtarget.java.version=${{ matrix.jdk }} -P all-unit-tests clean test
- name: Maven Code Coverage
env:
CI_NAME: github
Expand Down
2 changes: 1 addition & 1 deletion checkstyle.xml
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@
</module>
-->
<module name="IllegalImport">
<property name="illegalPkgs" value="org.apache.commons.lang"/>
<property name="illegalPkgs" value="org.apache.commons.lang, org.apache.log4j"/>
</module>
<!-- <module name="RedundantImport"/> -->
<!-- <module name="UnusedImports">
Expand Down
1 change: 0 additions & 1 deletion conf/docker-aio/0prep_deps.sh
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@ wdir=`pwd`

if [ ! -e dv/deps/payara-5.2021.5.zip ]; then
echo "payara dependency prep"
# no more fiddly patching :)
wget https://s3-eu-west-1.amazonaws.com/payara.fish/Payara+Downloads/5.2021.5/payara-5.2021.5.zip -O dv/deps/payara-5.2021.5.zip
fi

Expand Down
6 changes: 3 additions & 3 deletions conf/docker-aio/1prep.sh
Original file line number Diff line number Diff line change
Expand Up @@ -12,10 +12,10 @@ cd ../../
cp -r scripts conf/docker-aio/testdata/
cp doc/sphinx-guides/source/_static/util/createsequence.sql conf/docker-aio/testdata/doc/sphinx-guides/source/_static/util/

wget -q https://downloads.apache.org/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz
tar xfz apache-maven-3.6.3-bin.tar.gz
wget -q https://downloads.apache.org/maven/maven-3/3.8.4/binaries/apache-maven-3.8.4-bin.tar.gz
tar xfz apache-maven-3.8.4-bin.tar.gz
mkdir maven
mv apache-maven-3.6.3/* maven/
mv apache-maven-3.8.4/* maven/
echo "export JAVA_HOME=/usr/lib/jvm/jre-openjdk" > maven/maven.sh
echo "export M2_HOME=../maven" >> maven/maven.sh
echo "export MAVEN_HOME=../maven" >> maven/maven.sh
Expand Down
2 changes: 1 addition & 1 deletion conf/docker-aio/c8.dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ RUN cd /opt ; unzip /tmp/dv/deps/payara-5.2021.5.zip ; ln -s /opt/payara5 /opt/g
# this dies under Java 11, do we keep it?
#COPY domain-restmonitor.xml /opt/payara5/glassfish/domains/domain1/config/domain.xml

RUN sudo -u postgres /usr/pgsql-13/bin/initdb -D /var/lib/pgsql/13/data
RUN sudo -u postgres /usr/pgsql-13/bin/initdb -D /var/lib/pgsql/13/data -E 'UTF-8'

# copy configuration related files
RUN cp /tmp/dv/pg_hba.conf /var/lib/pgsql/13/data/
Expand Down
10 changes: 6 additions & 4 deletions conf/vagrant/etc/yum.repos.d/shibboleth.repo
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
[security_shibboleth]
name=Shibboleth (CentOS_8)
[shibboleth]
name=Shibboleth (rockylinux8)
# Please report any problems to https://shibboleth.atlassian.net/jira
type=rpm-md
baseurl=http://download.opensuse.org/repositories/security:/shibboleth/CentOS_8/
mirrorlist=https://shibboleth.net/cgi-bin/mirrorlist.cgi/rockylinux8
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/security:/shibboleth/CentOS_8/repodata/repomd.xml.key
gpgkey=https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key
enabled=1
2 changes: 1 addition & 1 deletion doc/release-notes/5.8-release-notes.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ This release brings new features, enhancements, and bug fixes to the Dataverse S

### Support for Data Embargoes

The Dataverse Software now supports file-level embargoes. The ability to set embargoes, up to a maximum duration (in months), can be configured by a Dataverse installation administrator. For more information, see the [Embargoes section](https://guides.dataverse.org/en/5.8/user/dataset-management.rst#embargoes) of the Dataverse Software Guides.
The Dataverse Software now supports file-level embargoes. The ability to set embargoes, up to a maximum duration (in months), can be configured by a Dataverse installation administrator. For more information, see the [Embargoes section](https://guides.dataverse.org/en/5.8/user/dataset-management.html#embargoes) of the Dataverse Software Guides.

- Users can configure a specific embargo, defined by an end date and a short reason, on a set of selected files or an individual file, by selecting the 'Embargo' menu item and entering information in a popup dialog. Embargoes can only be set, changed, or removed before a file has been published. After publication, only Dataverse installation administrators can make changes, using an API.

Expand Down
171 changes: 171 additions & 0 deletions doc/release-notes/5.9-release-notes.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,171 @@
# Dataverse Software 5.9

This release brings new features, enhancements, and bug fixes to the Dataverse Software. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project.

## Release Highlights

### Dataverse Collection Page Optimizations

The Dataverse Collection page, which also serves as the search page and the homepage in most Dataverse installations, has been optimized, with a specific focus on reducing the number of queries for each page load. These optimizations will be more noticable on Dataverse installations with higher traffic.

### Support for HTTP "Range" Header for Partial File Downloads

Dataverse now supports the HTTP "Range" header, which allows users to download parts of a file. Here are some examples:

- `bytes=0-9` gets the first 10 bytes.
- `bytes=10-19` gets 10 bytes from the middle.
- `bytes=-10` gets the last 10 bytes.
- `bytes=9-` gets all bytes except the first 10.

Only a single range is supported. For more information, see the [Data Access API](https://guides.dataverse.org/en/5.9/api/dataaccess.html) section of the API Guide.

### Support for Optional External Metadata Validation Scripts

The Dataverse software now allows an installation administrator to provide custom scripts for additional metadata validation when datasets are being published and/or when Dataverse collections are being published or modified. The Harvard Dataverse Repository has been using this mechanism to combat content that violates our Terms of Use, specifically spam content. All the validation or verification logic is defined in these external scripts, thus making it possible for an installation to add checks custom-tailored to their needs.

Please note that only the metadata are subject to these validation checks. This does not check the content of any uploaded files.

For more information, see the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guide. The new settings are listed below, in the "New JVM Options and DB Settings" section of these release notes.

### Displaying Author's Identifier as Link

In the dataset page's metadata tab the author's identifier is now displayed as a clickable link, which points to the profile page in the external service (ORCID, VIAF etc.) in cases where the identifier scheme provides a resolvable landing page. If the identifier does not match the expected scheme, a link is not shown.

### Auxiliary File API Enhancements

This release includes updates to the Auxiliary File API. These updates include:

- Auxiliary files can now also be associated with non-tabular files
- Auxiliary files can now be deleted
- Duplicate Auxiliary files can no longer be created
- A new API has been added to list Auxiliary files by their origin
- Some auxiliary were being saved with the wrong content type (MIME type) but now the user can supply the content type on upload, overriding the type that would otherwise be assigned
- Improved error reporting
- A bugfix involving checksums for Auxiliary files

Please note that the Auxiliary files feature is experimental and is designed to support integration with tools from the [OpenDP Project](https://opendp.org). If the API endpoints are not needed they can be blocked.

## Major Use Cases and Infrastructure Enhancements

Newly-supported major use cases in this release include:

- The Dataverse collection page has been optimized, resulting in quicker load times on one of the most common pages in the application (Issue #7804, PR #8143)
- Users will now be able to specify a certain byte range in their downloads via API, allowing for downloads of file parts. (Issue #6397, PR #8087)
- A Dataverse installation administrator can now set up metadata validation for datasets and Dataverse collections, allowing for publish-time and create-time checks for all content. (Issue #8155, PR #8245)
- Users will be provided with clickable links to authors' ORCIDs and other IDs in the dataset metadata (Issue #7978, PR #7979)
- Users will now be able to associate Auxiliary files with non-tabular files (Issue #8235, PR #8237)
- Users will no longer be able to create duplicate Auxiliary files (Issue #8235, PR #8237)
- Users will be able to delete Auxiliary files (Issue #8235, PR #8237)
- Users can retrieve a list of Auxiliary files based on their origin (Issue #8235, PR #8237)
- Users will be able to supply the content type of Auxiliary files on upload (Issue #8241, PR #8282)
- The indexing process has been updated so that datasets with fewer files and indexed first, resulting in fewer failures and making it easier to identify problematically-large datasets. (Issue #8097, PR #8152)
- Users will no longer be able to create metadata records with problematic special characters, which would later require Dataverse installation administrator intervention and a database change (Issue #8018, PR #8242)
- The Dataverse software will now appropriately recognize files with the .geojson extension as GeoJSON files rather than "unknown" (Issue #8261, PR #8262)
- A Dataverse installation administrator can now retrieve more information about role deletion from the ActionLogRecord (Issue #2912, PR #8211)
- Users will be able to use a new role to allow a user to respond to file download requests without also giving them the power to manage the dataset (Issue #8109, PR #8174)
- Users will no longer be forced to update their passwords when moving from Dataverse 3.x to Dataverse 4.x (PR #7916)
- Improved accessibility of buttons on the Dataset and File pages (Issue #8247, PR #8257)

## Notes for Dataverse Installation Administrators

### Indexing Performance on Datasets with Large Numbers of Files

We discovered that whenever a full reindexing needs to be performed, datasets with large numbers of files take an exceptionally long time to index. For example, in the Harvard Dataverse Repository, it takes several hours for a dataset that has 25,000 files. In situations where the Solr index needs to be erased and rebuilt from scratch (such as a Solr version upgrade, or a corrupt index, etc.) this can significantly delay the repopulation of the search catalog.

We are still investigating the reasons behind this performance issue. For now, even though some improvements have been made, a dataset with thousands of files is still going to take a long time to index. In this release, we've made a simple change to the reindexing process, to index any such datasets at the very end of the batch, after all the datasets with fewer files have been reindexed. This does not improve the overall reindexing time, but will repopulate the bulk of the search index much faster for the users of the installation.

### Custom Analytics Code Changes

You should update your custom analytics code to capture a bug fix related to tracking within the dataset files table. This release restores that tracking.

For more information, see the documentation and sample analytics code snippet provided in [Installation Guide](http://guides.dataverse.org/en/5.9/installation/config.html#web-analytics-code). This update can be used on any version 5.4+.

### New ManageFilePermissions Permission

Dataverse can now support a use case in which a Admin or Curator would like to delegate the ability to grant access to restricted files to other users. This can be implemented by creating a custom role (e.g. DownloadApprover) that has the new ManageFilePermissions permission. This release introduces the new permission, and a Flyway script adjusts the existing Admin and Curator roles so they continue to have the ability to grant file download requrests.

### Thumbnail Defaults

New *default* values have been added for the JVM settings `dataverse.dataAccess.thumbnail.image.limit` and `dataverse.dataAccess.thumbnail.pdf.limit`, of 3MB and 1MB respectively. This means that, *unless specified otherwise* by the JVM settings already in your domain configuration, the application will skip attempting to generate thumbnails for image files and PDFs that are above these size limits.
In previous versions, if these limits were not explicitly set, the application would try to create thumbnails for files of unlimited size. Which would occasionally cause problems with very large images.

## New JVM Options and DB Settings

The following DB settings allow configuration of the external metadata validator:

- :DataverseMetadataValidatorScript
- :DataverseMetadataPublishValidationFailureMsg
- :DataverseMetadataUpdateValidationFailureMsg
- :DatasetMetadataValidatorScript
- :DatasetMetadataValidationFailureMsg
- :ExternalValidationAdminOverride

See the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guides for more information.

## Notes for Developers and Integrators

Two sections of the Developer Guide have been updated:

- Instructions on how to sync a PR in progress with develop have been added in the version control section
- Guidance on avoiding ineffeciencies in JSF render logic has been added to the "Tips" section

## Complete List of Changes

For the complete list of code changes in this release, see the [5.9 Milestone](https://github.com/IQSS/dataverse/milestone/100?closed=1) in Github.

For help with upgrading, installing, or general questions please post to the [Dataverse Community Google Group](https://groups.google.com/forum/#!forum/dataverse-community) or email [email protected].

## Installation

If this is a new installation, please see our [Installation Guide](https://guides.dataverse.org/en/5.9/installation/). Please also contact us to get added to the [Dataverse Project Map](https://guides.dataverse.org/en/5.9/installation/config.html#putting-your-dataverse-installation-on-the-map-at-dataverse-org) if you have not done so already.

## Upgrade Instructions

0\. These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the [Dataverse Software 5 Release Notes](https://github.com/IQSS/dataverse/releases/tag/v5.0). After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.9.

If you are running Payara as a non-root user (and you should be!), **remember not to execute the commands below as root**. Use `sudo` to change to that user first. For example, `sudo -i -u dataverse` if `dataverse` is your dedicated application user.

In the following commands we assume that Payara 5 is installed in `/usr/local/payara5`. If not, adjust as needed.

`export PAYARA=/usr/local/payara5`

(or `setenv PAYARA /usr/local/payara5` if you are using a `csh`-like shell)

1\. Undeploy the previous version.

- `$PAYARA/bin/asadmin list-applications`
- `$PAYARA/bin/asadmin undeploy dataverse<-version>`

2\. Stop Payara and remove the generated directory

- `service payara stop`
- `rm -rf $PAYARA/glassfish/domains/domain1/generated`

3\. Start Payara

- `service payara start`

4\. Deploy this version.

- `$PAYARA/bin/asadmin deploy dataverse-5.9.war`

5\. Restart payara

- `service payara stop`
- `service payara start`

6\. Run ReExportall to update JSON Exports

Following the directions in the [Admin Guide](http://guides.dataverse.org/en/5.9/admin/metadataexport.html?highlight=export#batch-exports-through-the-api)

## Additional Release Steps

(for installations collecting web analytics)

1\. Update custom analytics code per the [Installation Guide](http://guides.dataverse.org/en/5.9/installation/config.html#web-analytics-code).

(for installations with GeoJSON files)

1\. Redetect GeoJSON files to update the type from "Unknown" to GeoJSON, following the directions in the [API Guide](https://guides.dataverse.org/en/5.9/api/native-api.html#redetect-file-type)

2\. Kick off full reindex following the directions in the [Admin Guide](http://guides.dataverse.org/en/5.9/admin/solr-search-index.html)
7 changes: 7 additions & 0 deletions doc/release-notes/5733-s3-creds-chain.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# Providing S3 Storage Credentials via MicroProfile Config

With this release, you may use two new options to pass an access key identifier and a secret access key for S3-based
storage definitions without creating the files used by the AWS CLI tools (`~/.aws/config` & `~/.aws/credentials`).

This has been added to ease setups using containers (Docker, Podman, Kubernetes, OpenShift) or testing and developing
installations. Find added [documentation and a word of warning in the installation guide](https://guides.dataverse.org/en/latest/installation/config.html#s3-mpconfig).
Loading

0 comments on commit 3adc129

Please sign in to comment.