Prtest10 #72
Open
wants to merge 103 commits into base: sphinx_actions

Changes from all commits (103 commits)
7e7a502
download differentially private statistics #7400
pdurbin Mar 9, 2021
2e89bde
initial docs for DP metadata access
djbrooke Mar 5, 2021
d8e4b0a
update to match text from depositing data
djbrooke Mar 5, 2021
53b4201
Moved aux file download link to clean up float issue [ref #7400]
mheppler Mar 9, 2021
5c8085d
Moved aux file download link out of canDownload render logic [ref #7400]
mheppler Mar 10, 2021
767f2eb
Removed aux file download link debug code and typo [ref #7400]
mheppler Mar 10, 2021
3c0140b
formatTag and formatVersion are working now #7400
pdurbin Mar 10, 2021
0127d5c
Render logic changes to file access btn and dropdown options [ref #7400]
mheppler Mar 10, 2021
fc9381a
Added file access status to dropdown, tooltips to icons, other aux fi…
mheppler Mar 17, 2021
9b7a15d
Fixed render logic on Variable Metadata option under File Access btn …
mheppler Mar 18, 2021
a04b783
refactor tests #7400
pdurbin Mar 23, 2021
b569c3e
add guidelines on research code in User Guide
atrisovic Mar 23, 2021
f0c12ae
Minor style changes, mostly about displaying links
jggautier Mar 24, 2021
006a9ba
move English to bundle #7400
pdurbin Mar 25, 2021
cc00904
put aux files starting with tag "dp" under DP Stats #7400
pdurbin Mar 26, 2021
5a045be
new domain: opendp.org #7400
pdurbin Mar 26, 2021
22f1d2d
Add "dp" rule to guides #7400
pdurbin Mar 26, 2021
439fcb7
add release note #7400
pdurbin Mar 26, 2021
353134f
Merge branch 'develop' into 7400-opendp-download #7400
pdurbin Mar 26, 2021
4cb7394
tiny edits
atrisovic Mar 29, 2021
01741b3
pass isPublic boolean in tests #7400
pdurbin Apr 1, 2021
d368a85
add type for aux files #7400
pdurbin Apr 5, 2021
289ef85
Merge branch 'develop' into 7400-opendp-download #7400
pdurbin Apr 5, 2021
5c67f95
add file extension to aux files on download #7400
pdurbin Apr 6, 2021
dcc9104
Merge branch 'develop' into 7400-opendp-download #7400
pdurbin Apr 6, 2021
e419194
update release note to reflect recent changes #7400
pdurbin Apr 6, 2021
92af221
add AuxiliaryFilesIT to test suite script #7400
pdurbin Apr 6, 2021
a1e852a
prevent anon download of aux files in draft #7400
pdurbin Apr 7, 2021
9ec7eac
per review comments
qqmyers Apr 7, 2021
739fb2b
add more guidelines and info on repro platforms
atrisovic Apr 8, 2021
c610321
Merge branch 'develop' into 7400-opendp-download #7400
pdurbin Apr 8, 2021
37b280d
#7779 update docker-aio scripts to run more cleanly
Apr 8, 2021
9858a78
#7784 If name is not set use 'Dataverse administrator' as email sender
pkiraly Apr 12, 2021
3b2ea4f
add missing "metadata" from aux file download path #7400
pdurbin Apr 12, 2021
6210db5
#7784 refactoring, documenting and unit testing
pkiraly Apr 12, 2021
8ae7c54
#7784 using getInstallationBrandName() and a new bundle key
pkiraly Apr 13, 2021
3e44a9d
Merge remote-tracking branch 'IQSS/develop' into IQSS-7586-stacktrace…
qqmyers Apr 13, 2021
25d7725
#7784 fixing test with Mockito
pkiraly Apr 13, 2021
0bd01ad
Escape description for use in datacite xml (file and datacite api call)
qqmyers Apr 13, 2021
9e5f390
base "other" type/grouping on absence from bundle #7400
pdurbin Apr 13, 2021
f2e5784
dirindex doc should show use of DOI for end users
mankoff Apr 14, 2021
469b96b
Update doc/sphinx-guides/source/api/native-api.rst
mankoff Apr 14, 2021
c0071a5
Update doc/sphinx-guides/source/api/native-api.rst
mankoff Apr 14, 2021
9115380
add type=null aux files to "other" list #7400
pdurbin Apr 14, 2021
3441758
move SQL to named queries #7400
pdurbin Apr 14, 2021
c5c210a
Merge pull request #7780 from OdumInstitute/7779_update_docker_aio_sc…
kcondon Apr 15, 2021
f55b368
switch from "every version" check to "is file published" #7400
pdurbin Apr 15, 2021
9b852de
refactor getFileExtension into own method #7400
pdurbin Apr 15, 2021
6f8ddc9
make "File Access: " its own entry in bundle #7400
pdurbin Apr 15, 2021
7903a8f
remove contradictory render logic #7400
pdurbin Apr 15, 2021
98fdaef
prevent constant "missing bundle key" messages in server.log #7400
pdurbin Apr 15, 2021
8162886
remove TODO, allowAccessRequests is working #7400
pdurbin Apr 15, 2021
20b4382
remove reference to type=OTHER (legacy concept) in release note #7400
pdurbin Apr 15, 2021
5dc7234
reword "type" on aux file page #7400
pdurbin Apr 15, 2021
51cffe8
remove cruft (tmp file that got committed) #7400
pdurbin Apr 15, 2021
b132580
remove duplicate query and refactor #7400
pdurbin Apr 15, 2021
7ea9ffa
Merge branch 'develop' into 7400-opendp-download #7400
pdurbin Apr 15, 2021
46d1bd4
rename SQL script (5.4.1 is out) #7400
pdurbin Apr 15, 2021
20654cd
minor tweak to wording
scolapasta Apr 15, 2021
d277aa1
standardize on "Restricted with Access Granted" #7400
pdurbin Apr 16, 2021
dab2d8b
make it clear that API might change #7400
pdurbin Apr 16, 2021
ed0b037
Minor text tweaking
scolapasta Apr 16, 2021
e1d5541
switch SQL query from LIKE to = #7400
pdurbin Apr 16, 2021
e913a23
remove duplicate bundle entries #7400
pdurbin Apr 16, 2021
43f3ab4
remove old printStackTrace
qqmyers Apr 20, 2021
3257cd3
Merge pull request #7772 from GlobalDataverseCommunityConsortium/IQSS…
kcondon Apr 21, 2021
3859ba3
Make geospatial coverage "Other" field facetable
jggautier Apr 21, 2021
17b422f
#7784 restoring getInstallationBrandName() to its original version
pkiraly Apr 21, 2021
a0d4713
compare type against orig file (if ingested)
qqmyers Apr 21, 2021
cf7b6d1
add note that aux file APIs can be blocked #7400
pdurbin Apr 22, 2021
1914e76
Use variables rather than hard-coded dataset
mankoff Apr 22, 2021
ffc105d
Merge pull request #7729 from IQSS/7400-opendp-download
kcondon Apr 22, 2021
60ae228
rename
qqmyers Apr 22, 2021
c78caeb
Create 7399-geospatial-update.md
jggautier Apr 22, 2021
f3f4545
preliminary/draft performance fix for the download-all api. it works …
landreev Apr 22, 2021
ad341af
Getting rid of GetDatasetCommand in the download api (since it appear…
landreev Apr 22, 2021
09daaba
Set execute bit on updateSchemaMDB.sh
janvanmansum Apr 23, 2021
9486b5b
Merge pull request #7800 from mankoff/minor-doc-edit
kcondon Apr 23, 2021
3b98994
Merge pull request #7818 from QualitativeDataRepository/IQSS/7817_com…
kcondon Apr 23, 2021
98579a6
Merge pull request #7798 from QualitativeDataRepository/IQSS/3328-esc…
kcondon Apr 23, 2021
3a6d8a9
Merge branch 'develop' into 7784-prevent-null-as-email-sender
pkiraly Apr 26, 2021
4571e38
Merge pull request #7785 from pkiraly/7784-prevent-null-as-email-sender
kcondon Apr 26, 2021
c505506
Merge pull request #7822 from DANS-KNAW/REPAIR_EXE_BIT_updateSchemaMDB
kcondon Apr 26, 2021
f116313
Bump commons-io from 2.6 to 2.7
dependabot[bot] Apr 26, 2021
bf30a70
Merge pull request #7825 from IQSS/dependabot/maven/commons-io-common…
kcondon Apr 28, 2021
d7303de
#7233 update Sphinx to 3.5.4, jQuery to 3.5.1
Apr 28, 2021
4653423
use correct duration for dataset publication locks
pdurbin Apr 28, 2021
4a30c1d
Merge pull request #7813 from IQSS/7399-make-geospatial-coverage-"Oth…
kcondon Apr 28, 2021
8f3cbd6
Update dataset-management.rst
kcondon Apr 29, 2021
ca0a01b
tiny fix in URLs
atrisovic Apr 29, 2021
7909806
Merge branch 'develop' into docs-research-code
atrisovic Apr 29, 2021
dd83396
Merge pull request #7717 from atrisovic/docs-research-code
kcondon Apr 29, 2021
5d75cac
Merge pull request #7833 from IQSS/sleep-for-lock
kcondon Apr 29, 2021
c92a658
final (?) cleanup for #7812. this further simplifies the solution,
landreev Apr 29, 2021
f8291bc
Merge branch 'develop' into 7812-download-all-api-speedup
landreev Apr 29, 2021
ab8dde7
#7233 remove comments from unnecessary list
Apr 30, 2021
1795593
Changed the return code on a non-existing dataset request back to wha…
landreev Apr 30, 2021
44f4b3d
fixing a typo-like issue with a comment in the api class (#7812)
landreev Apr 30, 2021
4a9ac4e
Merge pull request #7832 from OdumInstitute/7233_update_sphinx_jquery
kcondon May 3, 2021
f697cb2
#7837 add basic Sphinx-build GitHub Action and accompanying status badge
May 3, 2021
bd202ef
Merge pull request #7836 from IQSS/7812-download-all-api-speedup
kcondon May 3, 2021
0bd0691
Merge pull request #7838 from OdumInstitute/7837_basic_sphinx_github_…
kcondon May 3, 2021
a6e23cb
test file10
May 3, 2021
12 changes: 12 additions & 0 deletions .github/workflows/guides_build_sphinx.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
name: "Guides Build Status"
on:
- pull_request

jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: ammaraskar/sphinx-action@master
        with:
          docs-folder: "doc/sphinx-guides/source/"
1 change: 1 addition & 0 deletions README.md
@@ -21,6 +21,7 @@ Dataverse is a trademark of President and Fellows of Harvard College and is regi
 [![API Test Coverage](https://img.shields.io/jenkins/coverage/jacoco?jobUrl=https%3A%2F%2Fjenkins.dataverse.org%2Fjob%2FIQSS-dataverse-develop&label=API%20Test%20Coverage)](https://jenkins.dataverse.org/job/IQSS-dataverse-develop/ws/target/coverage-it/index.html)
 [![Unit Test Status](https://img.shields.io/travis/IQSS/dataverse?label=Unit%20Test%20Status)](https://travis-ci.org/IQSS/dataverse)
 [![Unit Test Coverage](https://img.shields.io/coveralls/github/IQSS/dataverse?label=Unit%20Test%20Coverage)](https://coveralls.io/github/IQSS/dataverse?branch=develop)
+[![Guides Build Status](https://github.com/IQSS/dataverse/actions/workflows/guides_build_sphinx.yml/badge.svg)](https://github.com/IQSS/dataverse/actions/workflows/guides_build_sphinx.yml)
 
 [dataverse.org]: https://dataverse.org
 [demo.dataverse.org]: https://demo.dataverse.org
7 changes: 6 additions & 1 deletion conf/docker-aio/c8.dockerfile
@@ -2,9 +2,14 @@ FROM centos:8
 # OS dependencies
 # PG 10 is the default in centos8; keep the repo comment for when we bump to 11+
 #RUN yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
-RUN yum install -y java-11-openjdk-devel postgresql-server sudo epel-release unzip curl httpd
+
+RUN echo "fastestmirror=true" >> /etc/dnf/dnf.conf
+RUN yum install -y java-11-openjdk-devel postgresql-server sudo epel-release unzip curl httpd python2 diffutils
 RUN yum install -y jq lsof awscli
 
+# for older search scripts
+RUN ln -s /usr/bin/python2 /usr/bin/python
+
 # copy and unpack dependencies (solr, payara)
 COPY dv /tmp/dv
 COPY testdata/schema*.xml /tmp/dv/
12 changes: 6 additions & 6 deletions conf/docker-aio/configure_doi.bash
@@ -1,24 +1,24 @@
 #!/usr/bin/env bash
 
-cd /usr/local/glassfish4
+cd /opt/payara5
 
 # if appropriate; reconfigure PID provider on the basis of environmental variables.
 if [ ! -z "${DoiProvider}" ]; then
 curl -X PUT -d ${DoiProvider} http://localhost:8080/api/admin/settings/:DoiProvider
 fi
 if [ ! -z "${doi_username}" ]; then
-bin/asadmin create-jvm-options "-Ddoi.username=${doi_password}"
+bin/asadmin create-jvm-options "-Ddoi.username=${doi_username}"
 fi
 if [ ! -z "${doi_password}" ]; then
 bin/asadmin create-jvm-options "-Ddoi.password=${doi_password}"
 fi
 if [ ! -z "${doi_baseurl}" ]; then
 bin/asadmin delete-jvm-options "-Ddoi.baseurlstring=https\://mds.test.datacite.org"
-doi_baseurl_esc=`echo ${doi_baseurl} | sed -e 's/:/\\:/'`
-bin/asadmin create-jvm-options "\"-Ddoi.baseurlstring=${doi_baseurl_esc}\""
+doi_baseurl_esc=`echo ${doi_baseurl} | sed -e 's/:/\\\:/'`
+bin/asadmin create-jvm-options "-Ddoi.baseurlstring=${doi_baseurl_esc}"
 fi
 if [ ! -z "${doi_dataciterestapiurl}" ]; then
 bin/asadmin delete-jvm-options "-Ddoi.dataciterestapiurlstring=https\://api.test.datacite.org"
-doi_dataciterestapiurl_esc=`echo ${doi_dataciterestapiurl} | sed -e 's/:/\\:/'`
-bin/asadmin create-jvm-options "\"-Ddoi.dataciterestapiurlstring=${doi_dataciterestapiurl_esc}\""
+doi_dataciterestapiurl_esc=`echo ${doi_dataciterestapiurl} | sed -e 's/:/\\\:/'`
+bin/asadmin create-jvm-options "-Ddoi.dataciterestapiurlstring=${doi_dataciterestapiurl_esc}"
 fi
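The extra backslash added to the two `sed` expressions in configure_doi.bash is easy to miss: inside backtick command substitution one level of backslash escaping is consumed, so three backslashes are needed before `sed` receives `\\:` and emits a literal `\:`. A minimal sketch of the intended escaping (the URL is just an example value; outside backticks, `$( )` substitution does not consume a level, so two backslashes suffice):

```shell
# Reproduce the colon escaping the script needs: turn ":" into "\:" so that
# asadmin does not treat the colon as an option separator.
doi_baseurl="https://mds.test.datacite.org"

# In $( ) substitution, '\\' reaches sed intact, so sed outputs a literal "\:".
esc=$(echo "${doi_baseurl}" | sed -e 's/:/\\:/g')
echo "${esc}"   # https\://mds.test.datacite.org
```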
2 changes: 1 addition & 1 deletion conf/docker-aio/run-test-suite.sh
@@ -8,4 +8,4 @@ fi
 
 # Please note the "dataverse.test.baseurl" is set to run for "all-in-one" Docker environment.
 # TODO: Rather than hard-coding the list of "IT" classes here, add a profile to pom.xml.
-source maven/maven.sh && mvn test -Dtest=DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT,HarvestingServerIT,MoveIT,MakeDataCountApiIT,FileTypeDetectionIT,EditDDIIT,ExternalToolsIT,AccessIT,DuplicateFilesIT,DownloadFilesIT,LinkIT,DeleteUsersIT,DeactivateUsersIT -Ddataverse.test.baseurl=$dvurl
+source maven/maven.sh && mvn test -Dtest=DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT,HarvestingServerIT,MoveIT,MakeDataCountApiIT,FileTypeDetectionIT,EditDDIIT,ExternalToolsIT,AccessIT,DuplicateFilesIT,DownloadFilesIT,LinkIT,DeleteUsersIT,DeactivateUsersIT,AuxiliaryFilesIT -Ddataverse.test.baseurl=$dvurl
4 changes: 2 additions & 2 deletions conf/docker-aio/testscripts/install
@@ -1,7 +1,7 @@
 #!/bin/sh
 export HOST_ADDRESS=localhost
-export GLASSFISH_ROOT=/usr/local/glassfish4
-export FILES_DIR=/usr/local/glassfish4/glassfish/domains/domain1/files
+export GLASSFISH_ROOT=/opt/payara5
+export FILES_DIR=/opt/payara5/glassfish/domains/domain1/files
 export DB_NAME=dvndb
 export DB_PORT=5432
 export DB_HOST=localhost
1 change: 0 additions & 1 deletion conf/docker-aio/testscripts/post
@@ -2,7 +2,6 @@
 cd scripts/api
 ./setup-all.sh --insecure -p=admin1 | tee /tmp/setup-all.sh.out
 cd ../..
-psql -U dvnapp dvndb -f scripts/database/reference_data.sql
 psql -U dvnapp dvndb -f doc/sphinx-guides/source/_static/util/createsequence.sql
 scripts/search/tests/publish-dataverse-root
 #git checkout scripts/api/data/dv-root.json
Empty file modified conf/solr/8.8.1/updateSchemaMDB.sh
100644 → 100755
Empty file.
3 changes: 3 additions & 0 deletions doc/release-notes/7399-geospatial-update.md
@@ -0,0 +1,3 @@
### Geospatial Metadata Block Updated

The Geospatial metadata block (geospatial.tsv) was updated. Dataverse collection administrators can now add a search facet on their collection pages for the metadata block's "Other" field, so that people searching in their collections can narrow searches using the values entered in that field.
12 changes: 12 additions & 0 deletions doc/release-notes/7400-opendp-download.md
@@ -0,0 +1,12 @@
Auxiliary Files can now be downloaded from the web interface.

- Aux files uploaded as type=DP appear under "Differentially Private Statistics" under file level download. The rest appear under "Other Auxiliary Files".

In addition, related changes were made, including the following:

- New tooltip over the lock indicating if you have been granted access to a restricted file or not.
- When downloading individual files, you will see "Restricted with Access Granted" or just "Restricted" (followed by "Users may not request access to files.") as appropriate.
- When downloading individual files, instead of "Download" you should expect to see the file type such as "JPEG Image" or "Original File Format" if the type is unknown.
- Downloaded aux files now have a file extension if it can be determined.

Please note that the auxiliary files feature is experimental; if you don't need it, its API endpoints can be blocked.
Empty file added doc/sphinx-guides/prtest10
Empty file.
4 changes: 2 additions & 2 deletions doc/sphinx-guides/requirements.txt
@@ -1,2 +1,2 @@
-# match version used by Jenkins
-Sphinx==1.5.6
+# current version as of this writing
+Sphinx==3.5.4
11 changes: 7 additions & 4 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -795,7 +795,10 @@ View Dataset Files and Folders as a Directory Index
 
 .. code-block:: bash
 
-curl $SERVER_URL/api/datasets/$ID/dirindex/
+curl $SERVER_URL/api/datasets/${ID}/dirindex/
+# or
+curl ${SERVER_URL}/api/datasets/:persistentId/dirindex?persistentId=doi:${PERSISTENT_ID}
 
 
 Optional parameters:

@@ -867,7 +870,9 @@ An example of a ``wget`` command line for crawling ("recursive downloading") of
 
 .. code-block:: bash
 
-wget -r -e robots=off -nH --cut-dirs=3 --content-disposition https://demo.dataverse.org/api/datasets/24/dirindex/
+wget -r -e robots=off -nH --cut-dirs=3 --content-disposition https://demo.dataverse.org/api/datasets/${ID}/dirindex/
+# or
+wget -r -e robots=off -nH --cut-dirs=3 --content-disposition https://demo.dataverse.org/api/datasets/:persistentId/dirindex?persistentId=doi:${PERSISTENT_ID}

.. note:: In addition to the files and folders in the dataset, the command line above will also save the directory index of each folder, in a separate folder "dirindex".

@@ -3487,5 +3492,3 @@ Recursively applies the role assignments of the specified Dataverse collection,
 GET http://$SERVER/api/admin/dataverse/{dataverse alias}/addRoleAssignmentsToChildren
 
 Note: setting ``:InheritParentRoleAssignments`` will automatically trigger inheritance of the parent Dataverse collection's role assignments for a newly created Dataverse collection. Hence this API call is intended as a way to update existing child Dataverse collections or to update children after a change in role assignments has been made on a parent Dataverse collection.
-
-
4 changes: 1 addition & 3 deletions doc/sphinx-guides/source/conf.py
@@ -224,9 +224,7 @@
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ['_static']
 
-html_js_files = [
-'js/jquery-3.4.1.min.js',
-]
+#html_js_files = []
 
 # Add any extra paths that contain custom files (such as robots.txt or
 # .htaccess) here, relative to this directory. These files are copied
9 changes: 5 additions & 4 deletions doc/sphinx-guides/source/developers/aux-file-support.rst
@@ -1,11 +1,11 @@
 Auxiliary File Support
 ======================
 
-Auxiliary file support is experimental. Auxiliary files in the Dataverse Software are being added to support depositing and downloading differentially private metadata, as part of the OpenDP project (OpenDP.io). In future versions, this approach may become more broadly used and supported.
+Auxiliary file support is experimental and as such, related APIs may be added, changed or removed without standard backward compatibility. Auxiliary files in the Dataverse Software are being added to support depositing and downloading differentially private metadata, as part of the OpenDP project (opendp.org). In future versions, this approach will likely become more broadly used and supported.
 
 Adding an Auxiliary File to a Datafile
 --------------------------------------
-To add an auxiliary file, specify the primary key of the datafile (FILE_ID), and the formatTag and formatVersion (if applicable) associated with the auxiliary file. There are two form parameters. "Origin" specifies the application/entity that created the auxiliary file, an "isPublic" controls access to downloading the file. If "isPublic" is true, any user can download the file, else, access authorization is based on the access rules as defined for the DataFile itself.
+To add an auxiliary file, specify the primary key of the datafile (FILE_ID), and the formatTag and formatVersion (if applicable) associated with the auxiliary file. There are multiple form parameters. "Origin" specifies the application/entity that created the auxiliary file, and "isPublic" controls access to downloading the file. If "isPublic" is true, any user can download the file if the dataset has been published, else, access authorization is based on the access rules as defined for the DataFile itself. The "type" parameter is used to group similar auxiliary files in the UI. Currently, auxiliary files with type "DP" appear under "Differentially Private Statistics", while all other auxiliary files appear under "Other Auxiliary Files".
 
 .. code-block:: bash
 
@@ -14,9 +14,10 @@ To add an auxiliary file, specify the primary key of the datafile (FILE_ID), and
 export FILE_ID='12345'
 export FORMAT_TAG='dpJson'
 export FORMAT_VERSION='v1'
+export TYPE='DP'
 export SERVER_URL=https://demo.dataverse.org
 
-curl -H X-Dataverse-key:$API_TOKEN -X POST -F "file=@$FILENAME" -F 'origin=myApp' -F 'isPublic=true' "$SERVER_URL/api/access/datafile/$FILE_ID/metadata/$FORMAT_TAG/$FORMAT_VERSION"
+curl -H X-Dataverse-key:$API_TOKEN -X POST -F "file=@$FILENAME" -F 'origin=myApp' -F 'isPublic=true' -F "type=$TYPE" "$SERVER_URL/api/access/datafile/$FILE_ID/metadata/$FORMAT_TAG/$FORMAT_VERSION"
 
 You should expect a 200 ("OK") response and JSON with information about your newly uploaded auxiliary file.
 
@@ -33,4 +34,4 @@ formatTag and formatVersion (if applicable) associated with the auxiliary file:
 export FORMAT_TAG='dpJson'
 export FORMAT_VERSION='v1'
 
-curl "$SERVER_URL/api/access/datafile/$FILE_ID/$FORMAT_TAG/$FORMAT_VERSION"
+curl "$SERVER_URL/api/access/datafile/$FILE_ID/metadata/$FORMAT_TAG/$FORMAT_VERSION"
51 changes: 51 additions & 0 deletions doc/sphinx-guides/source/user/dataset-management.rst
@@ -179,6 +179,38 @@ Additional download options available for tabular data (found in the same drop-d
- Data File Citation (currently in either RIS, EndNote XML, or BibTeX format);
- All of the above, as a zipped bundle.

Differentially Private (DP) Metadata can also be accessed for restricted tabular files if the data depositor has created a DP Metadata Release. See :ref:`dp-release-create` for more information.

Research Code
-------------

Code files - such as Stata, R, MATLAB, or Python files or scripts - have become a frequent addition to the research data deposited in Dataverse repositories. Research code is typically developed by a few researchers with the primary goal of obtaining results, while its reproducibility and reuse aspects are sometimes overlooked. Because several independent studies have reported problems rerunning published research code, please consider the following guidelines if your dataset contains code.

The following are general guidelines applicable to all programming languages.

- Create a README text file in the top-level directory to introduce your project. It should answer questions that reviewers or reusers would likely have, such as how to install and use your code. If in doubt, consider using existing templates such as `a README template for social science replication packages <https://social-science-data-editors.github.io/template_README/template-README.html>`_.
- Depending on the number of files in your dataset, consider having data and code in distinct directories, each of which should have some documentation like a README.
- Consider adding a license to your source code. You can do that by creating a LICENSE file in the dataset or by specifying the license(s) in the README or directly in the code. Find out more about code licenses at `the Open Source Initiative webpage <https://opensource.org/licenses>`_.
- If possible, use free and open-source file formats and software to make your research outputs more reusable and accessible.
- Consider testing your code in a clean environment before sharing it, as it could help you identify missing files or other errors. For example, your code should use relative file paths instead of absolute (or full) file paths, as they can cause an execution error.
- Consider providing notes (in the README) on the expected code outputs or adding tests in the code, which would ensure that its functionality is intact.

Capturing code dependencies will help other researchers recreate the necessary runtime environment. Without it, your code will not be able to run correctly (or at all).
One option is to use platforms such as `Whole Tale <https://wholetale.org>`_, `Jupyter Binder <https://mybinder.org>`_ or `Renku <https://renkulab.io>`_, which facilitate research reproducibility. Have a look at `Dataverse Integrations <https://guides.dataverse.org/en/5.4/admin/integrations.html>`_ for more information.
Another option is to use an automatic code dependency capture, which is often supported through the programming language. Here are a few examples:

- If you are using the conda package manager, you can export your environment with the command ``conda env export > environment.yml``. For more information, see the `official documentation <https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#sharing-an-environment>`__.
- Python has multiple conventions for capturing its dependencies, but probably the best-known one is with the ``requirements.txt`` file, which is created using the command ``pip freeze > requirements.txt``. Managing environments with ``pip`` is explained in the `official documentation <https://docs.python.org/3/tutorial/venv.html#managing-packages-with-pip>`__.
- If you are using the R programming language, create a file called ``install.R``, and list all library dependencies that your code requires. This file should be executable in R to set up the environment. See also other strategies for capturing the environment proposed by RStudio in the `official documentation <https://environments.rstudio.com>`__.
- In case you are using multiple programming languages or different versions of the same language, consider using a containerization technology such as Docker. You can create a Dockerfile that builds your environment and deposit it within your dataset (see `the official documentation <https://docs.docker.com/language/python/build-images/>`__). It is worth noting that creating a reliable Dockerfile may be tricky. If you choose this route, make sure to specify dependency versions and check out `Docker's best practices <https://docs.docker.com/develop/develop-images/dockerfile_best-practices/>`_.
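To make the pip option above concrete, the whole capture-and-restore cycle fits in a few commands (a sketch; it assumes python3 and pip are on your PATH, and the output may be empty in a minimal environment):

```shell
# Capture: pin the exact versions installed in the current environment.
python3 -m pip freeze > requirements.txt

# Inspect what was captured.
wc -l requirements.txt

# A reuser would later restore the environment, ideally inside a fresh
# virtual environment, with:
#   python3 -m pip install -r requirements.txt
```

Depositing the resulting ``requirements.txt`` alongside your code lets reviewers rebuild the same package set rather than guessing at versions.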

Finally, automating your code can be immensely helpful to the code and research reviewers. Here are a few options on how to automate your code.

- A simple way to automate your code is using a bash script or Make. The Turing Way Community has `a detailed guide <https://the-turing-way.netlify.app/reproducible-research/make.html>`_ on how to use the Make build automation tool.
- Consider using research workflow tools to automate your analysis. A popular workflow tool is called Common Workflow Language, and you can find more information about it `from the Common Workflow Language User Guide <https://www.commonwl.org/user_guide/>`_.

**Note:** Capturing code dependencies and automating your code will create new files in your directory. Make sure to include them when depositing your dataset.
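A bash "run everything" script as suggested above can be as small as the following sketch (the step scripts it calls are hypothetical placeholders for your own code):

```shell
#!/usr/bin/env bash
# runall.sh - execute every analysis step in order, stopping on the first error.
set -eu

# Log each step to the console and to run.log for later inspection.
log() { echo "[runall] $*" | tee -a run.log; }

log "step 1: prepare data"
# python3 code/01_prepare_data.py   # replace these comments with your real steps
log "step 2: run analysis"
# Rscript code/02_analysis.R
log "step 3: build figures"
# python3 code/03_figures.py
log "all steps finished"
```

Run it from the dataset's top-level directory with ``bash runall.sh``; the ``run.log`` it produces is itself useful documentation of a successful end-to-end run.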

Astronomy (FITS)
----------------

@@ -210,6 +242,8 @@ Restricted Files

When you restrict a file it cannot be downloaded unless permission has been granted.

Differentially Private (DP) Metadata can be accessed for restricted tabular files if the data depositor has created a DP Metadata Release. See :ref:`dp-release-create` for more information.

See also :ref:`terms-of-access` and :ref:`permissions`.

Edit Files
@@ -302,6 +336,23 @@ If you restrict any files in your dataset, you will be prompted by a pop-up to e

See also :ref:`restricted-files`.

.. _dp-release-create:

Creating and Depositing Differentially Private Metadata (Experimental)
----------------------------------------------------------------------

Through an integration with tools from the OpenDP Project (opendp.org), the Dataverse Software offers an experimental workflow that allows a data depositor to create and deposit Differentially Private (DP) Metadata files, which can then be used for exploratory data analysis. This workflow allows researchers to view the DP metadata for a tabular file, determine whether or not the file contains useful information, and then make an informed decision about whether or not to request access to the original file.

If this integration has been enabled in your Dataverse installation, you can follow these steps to create a DP Metadata Release and make it available to researchers, while keeping the files themselves restricted so that they can be accessed only after a successful access request.

- Deposit a tabular file and let the ingest process complete
- Restrict the File
- In the kebab next to the file on the dataset page, or from the "Edit Files" dropdown on the file page, click "OpenDP Tool"
- Go through the process to create a DP Metadata Release in the OpenDP tool, and at the end of the process deposit the DP Metadata Release back to the Dataverse installation
- Publish the Dataset

Once the dataset is published, users will be able to request access using the normal process, but will also have the option to download DP Statistics in order to get more information about the file.

Guestbook
---------
