From 0fb1bf00f9b8fdecdb6e08554fa8fa2b83f56aad Mon Sep 17 00:00:00 2001 From: qqmyers Date: Fri, 14 Oct 2022 11:21:35 -0400 Subject: [PATCH 01/20] commons-text update --- pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pom.xml b/pom.xml index 15ad1aa2c10..c6459cfc55c 100644 --- a/pom.xml +++ b/pom.xml @@ -233,7 +233,7 @@ org.apache.commons commons-text - 1.9 + 1.10.0 org.apache.commons From 15cb16f1c34d329e81d702f09a19caa24aedce1d Mon Sep 17 00:00:00 2001 From: Miguel Tomas Silva Date: Tue, 18 Oct 2022 09:07:26 +0200 Subject: [PATCH 02/20] added C/ C++ library ; organized chronological all libraries avail. --- .../source/api/client-libraries.rst | 31 ++++++++++++------- 1 file changed, 20 insertions(+), 11 deletions(-) diff --git a/doc/sphinx-guides/source/api/client-libraries.rst b/doc/sphinx-guides/source/api/client-libraries.rst index 634f03a8125..9d653c549a6 100755 --- a/doc/sphinx-guides/source/api/client-libraries.rst +++ b/doc/sphinx-guides/source/api/client-libraries.rst @@ -8,14 +8,20 @@ Because a Dataverse installation is a SWORD server, additional client libraries .. contents:: |toctitle| :local: -Python ------ +C / C++ +------- +A C/C++ library to expedite deployment when connecting to a Dataverse API can be found at https://github.com/aeonSolutions/OpenScience-Dataverse-API-C-library -There are two Python modules for interacting with Dataverse Software APIs. +This C/C++ library was initially coded and is currently maintained by `Miguel T. <https://www.linkedin.com/in/migueltomas/>`_. It features common HTTPS GET and POST requests made to the API of a Dataverse installation. To learn how to install and use it, see the `wiki page <https://github.com/aeonSolutions/OpenScience-Dataverse-API-C-library/wiki>`_. -`pyDataverse `_ primarily allows developers to manage Dataverse collections, datasets and datafiles. Its intention is to help with data migrations and DevOps activities such as testing and configuration management. The module is developed by `Stefan Kasberger `_ from `AUSSDA - The Austrian Social Science Data Archive `_. -`dataverse-client-python `_ had its initial release in 2015. `Robert Liebowitz `_ created this library while at the `Center for Open Science (COS) `_ and the COS uses it to integrate the `Open Science Framework (OSF) `_ with a Dataverse installation via an add-on which itself is open source and listed on the :doc:`/api/apps` page. +Java +---- + +https://github.com/IQSS/dataverse-client-java is the official Java library for Dataverse Software APIs. + +`Richard Adams `_ from `ResearchSpace `_ created and maintains this library. Javascript ---------- @@ -24,6 +30,15 @@ https://github.com/IQSS/dataverse-client-javascript is the official Javascript p It was created and is maintained by `The Agile Monkeys `_. +Python +------ + +There are two Python modules for interacting with Dataverse Software APIs. + +`pyDataverse `_ primarily allows developers to manage Dataverse collections, datasets and datafiles. Its intention is to help with data migrations and DevOps activities such as testing and configuration management. The module is developed by `Stefan Kasberger `_ from `AUSSDA - The Austrian Social Science Data Archive `_. + +`dataverse-client-python `_ had its initial release in 2015.
`Robert Liebowitz `_ created this library while at the `Center for Open Science (COS) `_ and the COS uses it to integrate the `Open Science Framework (OSF) `_ with a Dataverse installation via an add-on which itself is open source and listed on the :doc:`/api/apps` page. + R - @@ -32,12 +47,6 @@ The R client can search and download datasets. It is useful when automatically (instead of manually) downloading data files as part of a script. The package is currently maintained by `Shiro Kuriwaki `_. It was originally created by `Thomas Leeper `_ and then formerly maintained by `Will Beasley `_. -Java ---- - -https://github.com/IQSS/dataverse-client-java is the official Java library for Dataverse Software APIs. - -`Richard Adams `_ from `ResearchSpace `_ created and maintains this library. Ruby ---- From b607feb0c249c4c6b7cea85c12eab48719b109c8 Mon Sep 17 00:00:00 2001 From: Henning Timm Date: Tue, 18 Oct 2022 11:28:26 +0200 Subject: [PATCH 03/20] Removed link to resolved issue --- doc/sphinx-guides/source/admin/solr-search-index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/admin/solr-search-index.rst b/doc/sphinx-guides/source/admin/solr-search-index.rst index 5685672eceb..41b9c7b6a8f 100644 --- a/doc/sphinx-guides/source/admin/solr-search-index.rst +++ b/doc/sphinx-guides/source/admin/solr-search-index.rst @@ -36,7 +36,7 @@ Please note that the moment you issue this command, it will appear to end users Start Async Reindex ~~~~~~~~~~~~~~~~~~~ -Please note that this operation may take hours depending on the amount of data in your system. This known issue is being tracked at https://github.com/IQSS/dataverse/issues/50 +Please note that this operation may take hours depending on the amount of data in your system. ``curl http://localhost:8080/api/admin/index`` From 0a11e454a89912704389b694e45ba5a3fcf10171 Mon Sep 17 00:00:00 2001 From: Henning Timm Date: Tue, 18 Oct 2022 11:28:48 +0200 Subject: [PATCH 04/20] Made Solr casing consistent --- doc/sphinx-guides/source/admin/solr-search-index.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/sphinx-guides/source/admin/solr-search-index.rst b/doc/sphinx-guides/source/admin/solr-search-index.rst index 41b9c7b6a8f..ef661c14ef9 100644 --- a/doc/sphinx-guides/source/admin/solr-search-index.rst +++ b/doc/sphinx-guides/source/admin/solr-search-index.rst @@ -60,7 +60,7 @@ If indexing stops, this command should pick up where it left off based on which Manual Reindexing ----------------- -If you have made manual changes to a dataset in the database or wish to reindex a dataset that solr didn't want to index properly, it is possible to manually reindex Dataverse collections and datasets. +If you have made manual changes to a dataset in the database or wish to reindex a dataset that Solr didn't want to index properly, it is possible to manually reindex Dataverse collections and datasets.
Reindexing Dataverse Collections ++++++++++++++++++++++++++++++++ @@ -89,7 +89,7 @@ To re-index a dataset by its database ID: Manually Querying Solr ---------------------- -If you suspect something isn't indexed properly in solr, you may bypass the Dataverse installation's web interface and query the command line directly to verify what solr returns: +If you suspect something isn't indexed properly in Solr, you may bypass the Dataverse installation's web interface and query the command line directly to verify what Solr returns: ``curl "http://localhost:8983/solr/collection1/select?q=dsPersistentId:doi:10.15139/S3/HFV0AO"`` From 6da6fd06e668e2edef4bca4d7369b4528cf1d92e Mon Sep 17 00:00:00 2001 From: Henning Timm Date: Tue, 18 Oct 2022 11:29:16 +0200 Subject: [PATCH 05/20] Minor punctuation fix --- doc/sphinx-guides/source/admin/solr-search-index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/admin/solr-search-index.rst b/doc/sphinx-guides/source/admin/solr-search-index.rst index ef661c14ef9..faf1a578387 100644 --- a/doc/sphinx-guides/source/admin/solr-search-index.rst +++ b/doc/sphinx-guides/source/admin/solr-search-index.rst @@ -22,7 +22,7 @@ Get a list of all database objects that are missing in Solr, and Solr documents ``curl http://localhost:8080/api/admin/index/status`` -Remove all Solr documents that are orphaned (ie not associated with objects in the database): +Remove all Solr documents that are orphaned (i.e. not associated with objects in the database): ``curl http://localhost:8080/api/admin/index/clear-orphans`` From e3f3fb4e7209d9917c7484a570bb6eac8916a777 Mon Sep 17 00:00:00 2001 From: Henning Timm Date: Tue, 18 Oct 2022 11:29:38 +0200 Subject: [PATCH 06/20] Highlighted search string --- doc/sphinx-guides/source/admin/solr-search-index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/admin/solr-search-index.rst b/doc/sphinx-guides/source/admin/solr-search-index.rst index faf1a578387..bcf55480625 100644 --- a/doc/sphinx-guides/source/admin/solr-search-index.rst +++ b/doc/sphinx-guides/source/admin/solr-search-index.rst @@ -69,7 +69,7 @@ Dataverse collections must be referenced by database object ID. If you have dire ``select id from dataverse where alias='dataversealias';`` -should work, or you may click the Dataverse Software's "Edit" menu and look for dataverseId= in the URLs produced by the drop-down. Then, to re-index: +should work, or you may click the Dataverse Software's "Edit" menu and look for *dataverseId=* in the URLs produced by the drop-down. Then, to re-index: ``curl http://localhost:8080/api/admin/index/dataverses/135`` From 9cac9fd19ac347a1f7bf97d86369521e63368762 Mon Sep 17 00:00:00 2001 From: Henning Timm Date: Tue, 18 Oct 2022 11:41:25 +0200 Subject: [PATCH 07/20] Made casing of Solr consistent across all documents --- doc/sphinx-guides/source/admin/harvestserver.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/sphinx-guides/source/admin/harvestserver.rst b/doc/sphinx-guides/source/admin/harvestserver.rst index 88004d9dc5f..6f4f23fc587 100644 --- a/doc/sphinx-guides/source/admin/harvestserver.rst +++ b/doc/sphinx-guides/source/admin/harvestserver.rst @@ -115,10 +115,10 @@ Some useful examples of search queries to define OAI sets: ``keywordValue:censorship`` -Important: New SOLR schema required! +Important: New Solr schema required! 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In order to be able to define OAI sets, your SOLR server must be upgraded with the search schema that came with release 4.5 (or later), and all your local datasets must be re-indexed, once the new schema is installed. +In order to be able to define OAI sets, your Solr server must be upgraded with the search schema that came with release 4.5 (or later), and all your local datasets must be re-indexed, once the new schema is installed. OAI Set updates --------------- From c1e39ba928dcc82272f5129d15e439796358e9f9 Mon Sep 17 00:00:00 2001 From: Sherry Lake Date: Tue, 18 Oct 2022 10:21:30 -0400 Subject: [PATCH 08/20] Removed incorrect example added link The curl command in :ArchiveClassName was incorrect. Redirected to correct commands for each of the configuration to the configuration section. --- doc/sphinx-guides/source/installation/config.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index f2de9d5702f..30ec1638b9a 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -2714,9 +2714,9 @@ Part of the database settings to configure the BagIt file handler. This is the p ++++++++++++++++++ Your Dataverse installation can export archival "Bag" files to an extensible set of storage systems (see :ref:`BagIt Export` above for details about this and for further explanation of the other archiving related settings below). -This setting specifies which storage system to use by identifying the particular Java class that should be run. Current options include DuraCloudSubmitToArchiveCommand, LocalSubmitToArchiveCommand, and GoogleCloudSubmitToArchiveCommand. +This setting specifies which storage system to use by identifying the particular Java class that should be run. Current configuration options include DuraCloudSubmitToArchiveCommand, LocalSubmitToArchiveCommand, GoogleCloudSubmitToArchiveCommand, and S3SubmitToArchiveCommand. -``curl -X PUT -d 'LocalSubmitToArchiveCommand' http://localhost:8080/api/admin/settings/:ArchiverClassName`` +For examples, see the specific configuration above in :ref:`BagIt Export`. :ArchiverSettings +++++++++++++++++ From 2d121aa5b52abc0f55d3287176e1302387ee1bf1 Mon Sep 17 00:00:00 2001 From: Sherry Lake Date: Tue, 18 Oct 2022 11:04:41 -0400 Subject: [PATCH 09/20] clarified moving dataverse collection description --- doc/sphinx-guides/source/admin/dataverses-datasets.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/admin/dataverses-datasets.rst b/doc/sphinx-guides/source/admin/dataverses-datasets.rst index a961ac0b067..7f32e8c2514 100644 --- a/doc/sphinx-guides/source/admin/dataverses-datasets.rst +++ b/doc/sphinx-guides/source/admin/dataverses-datasets.rst @@ -15,7 +15,7 @@ Dataverse collections have to be empty to delete them. Navigate to the Dataverse Move a Dataverse Collection ^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Moves a Dataverse collection whose id is passed to a new Dataverse collection whose id is passed. The Dataverse collection alias also may be used instead of the id. If the moved Dataverse collection has a guestbook, template, metadata block, link, or featured Dataverse collection that is not compatible with the destination Dataverse collection, you will be informed and given the option to force the move and remove the association. Only accessible to superusers. 
:: +Moves a Dataverse collection whose id is passed to an existing Dataverse collection whose id is passed. The Dataverse collection alias also may be used instead of the id. If the moved Dataverse collection has a guestbook, template, metadata block, link, or featured Dataverse collection that is not compatible with the destination Dataverse collection, you will be informed and given the option to force the move and remove the association. Only accessible to superusers. :: curl -H "X-Dataverse-key: $API_TOKEN" -X POST http://$SERVER/api/dataverses/$id/move/$destination-id From 5f24e32531bf5bfa0cf9486b6a029eee02c9e58e Mon Sep 17 00:00:00 2001 From: Sherry Lake Date: Tue, 18 Oct 2022 12:35:13 -0400 Subject: [PATCH 10/20] removed bullet tabular data download all formats --- doc/sphinx-guides/source/user/dataset-management.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/sphinx-guides/source/user/dataset-management.rst b/doc/sphinx-guides/source/user/dataset-management.rst index 77a760ef838..b80d580ce35 100755 --- a/doc/sphinx-guides/source/user/dataset-management.rst +++ b/doc/sphinx-guides/source/user/dataset-management.rst @@ -193,8 +193,8 @@ Additional download options available for tabular data (found in the same drop-d - The original file uploaded by the user; - Saved as R data (if the original file was not in R format); - Variable Metadata (as a `DDI Codebook `_ XML file); -- Data File Citation (currently in either RIS, EndNote XML, or BibTeX format); -- All of the above, as a zipped bundle. +- Data File Citation (currently in either RIS, EndNote XML, or BibTeX format) + Differentially Private (DP) Metadata can also be accessed for restricted tabular files if the data depositor has created a DP Metadata Release. See :ref:`dp-release-create` for more information. From 25421a6df52381aec04658070110264b7d2d0a7d Mon Sep 17 00:00:00 2001 From: Henning Timm Date: Tue, 18 Oct 2022 21:17:44 +0200 Subject: [PATCH 11/20] Extend information on indexing times --- doc/sphinx-guides/source/admin/solr-search-index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/admin/solr-search-index.rst b/doc/sphinx-guides/source/admin/solr-search-index.rst index bcf55480625..0d3db3eeac6 100644 --- a/doc/sphinx-guides/source/admin/solr-search-index.rst +++ b/doc/sphinx-guides/source/admin/solr-search-index.rst @@ -36,7 +36,7 @@ Please note that the moment you issue this command, it will appear to end users Start Async Reindex ~~~~~~~~~~~~~~~~~~~ -Please note that this operation may take hours depending on the amount of data in your system. +Please note that this operation may take hours depending on the amount of data in your system and whether or not your installation is using full-text indexing. More information on this, as well as some reference times, can be found at https://github.com/IQSS/dataverse/issues/50.
``curl http://localhost:8080/api/admin/index`` From 3ba789d4caeae8b6c0f34c5780fc1e05abea3525 Mon Sep 17 00:00:00 2001 From: Henning Timm Date: Tue, 18 Oct 2022 21:27:40 +0200 Subject: [PATCH 12/20] Fixed minor typo --- doc/sphinx-guides/source/admin/solr-search-index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/admin/solr-search-index.rst b/doc/sphinx-guides/source/admin/solr-search-index.rst index 0d3db3eeac6..769c1ee5a0d 100644 --- a/doc/sphinx-guides/source/admin/solr-search-index.rst +++ b/doc/sphinx-guides/source/admin/solr-search-index.rst @@ -1,7 +1,7 @@ Solr Search Index ================= -A Dataverse installation requires Solr to be operational at all times. If you stop Solr, you should see a error about this on the root Dataverse installation page, which is powered by the search index Solr provides. You can set up Solr by following the steps in our Installation Guide's :doc:`/installation/prerequisites` and :doc:`/installation/config` sections explaining how to configure it. This section you're reading now is about the care and feeding of the search index. PostgreSQL is the "source of truth" and the Dataverse installation will copy data from PostgreSQL into Solr. For this reason, the search index can be rebuilt at any time. Depending on the amount of data you have, this can be a slow process. You are encouraged to experiment with production data to get a sense of how long a full reindexing will take. +A Dataverse installation requires Solr to be operational at all times. If you stop Solr, you should see an error about this on the root Dataverse installation page, which is powered by the search index Solr provides. You can set up Solr by following the steps in our Installation Guide's :doc:`/installation/prerequisites` and :doc:`/installation/config` sections explaining how to configure it. This section you're reading now is about the care and feeding of the search index. PostgreSQL is the "source of truth" and the Dataverse installation will copy data from PostgreSQL into Solr. For this reason, the search index can be rebuilt at any time. Depending on the amount of data you have, this can be a slow process. You are encouraged to experiment with production data to get a sense of how long a full reindexing will take. .. contents:: Contents: :local: From 912bd89732f7f101a21c522781b5cce30e41bfc8 Mon Sep 17 00:00:00 2001 From: Henning Timm Date: Tue, 18 Oct 2022 21:28:48 +0200 Subject: [PATCH 13/20] Extended description of full vs in-place index --- doc/sphinx-guides/source/admin/solr-search-index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/admin/solr-search-index.rst b/doc/sphinx-guides/source/admin/solr-search-index.rst index 769c1ee5a0d..e6f7b588ede 100644 --- a/doc/sphinx-guides/source/admin/solr-search-index.rst +++ b/doc/sphinx-guides/source/admin/solr-search-index.rst @@ -9,7 +9,7 @@ A Dataverse installation requires Solr to be operational at all times. If you st Full Reindex ------------- -There are two ways to perform a full reindex of the Dataverse installation search index. Starting with a "clear" ensures a completely clean index but involves downtime. Reindexing in place doesn't involve downtime but does not ensure a completely clean index. +There are two ways to perform a full reindex of the Dataverse installation search index. Starting with a "clear" ensures a completely clean index but involves downtime. 
Reindexing in place doesn't involve downtime but does not ensure a completely clean index (e.g. stale entries from destroyed datasets can remain in the index). Clear and Reindex +++++++++++++++++ From 3cdfd499eb6686260400276244e57f3b17fc464f Mon Sep 17 00:00:00 2001 From: Jim Myers Date: Wed, 19 Oct 2022 10:12:57 -0400 Subject: [PATCH 14/20] file updates - use encodingFormat and always send contentUrl if allowed FWIW: There is still a dataverse.files.hide-schema-dot-org-download-urls that, if false, will stop any contentUrls from being sent --- .../java/edu/harvard/iq/dataverse/DatasetVersion.java | 8 +++----- .../iq/dataverse/export/SchemaDotOrgExporterTest.java | 2 +- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java b/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java index 30815c43381..314e06149ee 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java @@ -2012,7 +2012,7 @@ public String getJsonLd() { } fileObject.add("@type", "DataDownload"); fileObject.add("name", fileMetadata.getLabel()); - fileObject.add("fileFormat", fileMetadata.getDataFile().getContentType()); + fileObject.add("encodingFormat", fileMetadata.getDataFile().getContentType()); fileObject.add("contentSize", fileMetadata.getDataFile().getFilesize()); fileObject.add("description", fileMetadata.getDescription()); fileObject.add("@id", filePidUrlAsString); @@ -2021,10 +2021,8 @@ public String getJsonLd() { if (hideFilesBoolean != null && hideFilesBoolean.equals("true")) { // no-op } else { - if (FileUtil.isPubliclyDownloadable(fileMetadata)) { - String nullDownloadType = null; - fileObject.add("contentUrl", dataverseSiteUrl + FileUtil.getFileDownloadUrlPath(nullDownloadType, fileMetadata.getDataFile().getId(), false, fileMetadata.getId())); - } + String nullDownloadType = null; + fileObject.add("contentUrl", dataverseSiteUrl + FileUtil.getFileDownloadUrlPath(nullDownloadType, fileMetadata.getDataFile().getId(), false, fileMetadata.getId())); } fileArray.add(fileObject); } diff --git a/src/test/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporterTest.java b/src/test/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporterTest.java index b5453e75fe5..f5bc5fd97d0 100644 --- a/src/test/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporterTest.java +++ b/src/test/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporterTest.java @@ -181,7 +181,7 @@ public void testExportDataset() throws Exception { assertEquals(2, json2.getJsonArray("spatialCoverage").size()); assertEquals("DataDownload", json2.getJsonArray("distribution").getJsonObject(0).getString("@type")); assertEquals("README.md", json2.getJsonArray("distribution").getJsonObject(0).getString("name")); - assertEquals("text/plain", json2.getJsonArray("distribution").getJsonObject(0).getString("fileFormat")); + assertEquals("text/plain", json2.getJsonArray("distribution").getJsonObject(0).getString("encodingFormat")); assertEquals(1234, json2.getJsonArray("distribution").getJsonObject(0).getInt("contentSize")); assertEquals("README file.", json2.getJsonArray("distribution").getJsonObject(0).getString("description")); assertEquals("https://doi.org/10.5072/FK2/7V5MPI", json2.getJsonArray("distribution").getJsonObject(0).getString("@id")); From cbc0e52907d23d1aaa0fbdfd270d7aa8fe6a6c00 Mon Sep 17 00:00:00 2001 From: Jim Myers Date: Wed, 19 Oct 2022 13:05:36 -0400 Subject: [PATCH 15/20] Revert "file updates - 
use encodingFormat and always send contentUrl if allowed" This reverts commit 3cdfd499eb6686260400276244e57f3b17fc464f. --- .../java/edu/harvard/iq/dataverse/DatasetVersion.java | 8 +++++--- .../iq/dataverse/export/SchemaDotOrgExporterTest.java | 2 +- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java b/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java index 314e06149ee..30815c43381 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java @@ -2012,7 +2012,7 @@ public String getJsonLd() { } fileObject.add("@type", "DataDownload"); fileObject.add("name", fileMetadata.getLabel()); - fileObject.add("encodingFormat", fileMetadata.getDataFile().getContentType()); + fileObject.add("fileFormat", fileMetadata.getDataFile().getContentType()); fileObject.add("contentSize", fileMetadata.getDataFile().getFilesize()); fileObject.add("description", fileMetadata.getDescription()); fileObject.add("@id", filePidUrlAsString); @@ -2021,8 +2021,10 @@ public String getJsonLd() { if (hideFilesBoolean != null && hideFilesBoolean.equals("true")) { // no-op } else { - String nullDownloadType = null; - fileObject.add("contentUrl", dataverseSiteUrl + FileUtil.getFileDownloadUrlPath(nullDownloadType, fileMetadata.getDataFile().getId(), false, fileMetadata.getId())); + if (FileUtil.isPubliclyDownloadable(fileMetadata)) { + String nullDownloadType = null; + fileObject.add("contentUrl", dataverseSiteUrl + FileUtil.getFileDownloadUrlPath(nullDownloadType, fileMetadata.getDataFile().getId(), false, fileMetadata.getId())); + } } fileArray.add(fileObject); } diff --git a/src/test/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporterTest.java b/src/test/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporterTest.java index f5bc5fd97d0..b5453e75fe5 100644 --- a/src/test/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporterTest.java +++ b/src/test/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporterTest.java @@ -181,7 +181,7 @@ public void testExportDataset() throws Exception { assertEquals(2, json2.getJsonArray("spatialCoverage").size()); assertEquals("DataDownload", json2.getJsonArray("distribution").getJsonObject(0).getString("@type")); assertEquals("README.md", json2.getJsonArray("distribution").getJsonObject(0).getString("name")); - assertEquals("text/plain", json2.getJsonArray("distribution").getJsonObject(0).getString("encodingFormat")); + assertEquals("text/plain", json2.getJsonArray("distribution").getJsonObject(0).getString("fileFormat")); assertEquals(1234, json2.getJsonArray("distribution").getJsonObject(0).getInt("contentSize")); assertEquals("README file.", json2.getJsonArray("distribution").getJsonObject(0).getString("description")); assertEquals("https://doi.org/10.5072/FK2/7V5MPI", json2.getJsonArray("distribution").getJsonObject(0).getString("@id")); From 08fea85853591ea616a7b0117d5390e012f217b9 Mon Sep 17 00:00:00 2001 From: Sherry Lake Date: Thu, 20 Oct 2022 08:00:57 -0400 Subject: [PATCH 16/20] Added period to end of bullet list --- doc/sphinx-guides/source/user/dataset-management.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/user/dataset-management.rst b/doc/sphinx-guides/source/user/dataset-management.rst index b80d580ce35..ec3bb392ce5 100755 --- a/doc/sphinx-guides/source/user/dataset-management.rst +++ b/doc/sphinx-guides/source/user/dataset-management.rst @@ -193,7 +193,7 @@ 
Additional download options available for tabular data (found in the same drop-d - The original file uploaded by the user; - Saved as R data (if the original file was not in R format); - Variable Metadata (as a `DDI Codebook `_ XML file); -- Data File Citation (currently in either RIS, EndNote XML, or BibTeX format) +- Data File Citation (currently in either RIS, EndNote XML, or BibTeX format). Differentially Private (DP) Metadata can also be accessed for restricted tabular files if the data depositor has created a DP Metadata Release. See :ref:`dp-release-create` for more information. From cb0075e555b1399835a809c5ef160d53c86c34bc Mon Sep 17 00:00:00 2001 From: Sherry Lake Date: Fri, 21 Oct 2022 14:26:35 -0400 Subject: [PATCH 17/20] corrected BagIT API Call --- doc/sphinx-guides/source/installation/config.rst | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index f2de9d5702f..21ae85b8435 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -1240,14 +1240,19 @@ API Calls Once this configuration is complete, you, as a user with the *PublishDataset* permission, should be able to use the admin API call to manually submit a DatasetVersion for processing: -``curl -X POST -H "X-Dataverse-key: " http://localhost:8080/api/admin/submitDatasetVersionToArchive/{id}/{version}`` +``curl -X POST -H "X-Dataverse-key: " http://localhost:8080/api/admin/submitDatasetVersionToArchive/{version}/{id}`` where: -``{id}`` is the DatasetId (or ``:persistentId`` with the ``?persistentId=""`` parameter), and +``{id}`` is the DatasetId, and ``{version}`` is the friendly version number, e.g. "1.2". +or in place of the DatasetID, you can use ``:persistentId`` with the ``?persistentId=""``: + +``curl -X POST -H "X-Dataverse-key: " http://localhost:8080/api/admin/submitDatasetVersionToArchive/:persistentId/{version}?persistentId=""`` + + The submitDatasetVersionToArchive API (and the workflow discussed below) attempt to archive the dataset version via an archive specific method. For Chronopolis, a DuraCloud space named for the dataset (it's DOI with ':' and '.' replaced with '-') is created and two files are uploaded to it: a version-specific datacite.xml metadata file and a BagIt bag containing the data and an OAI-ORE map file. (The datacite.xml file, stored outside the Bag as well as inside is intended to aid in discovery while the ORE map file is 'complete', containing all user-entered metadata and is intended as an archival record.) In the Chronopolis case, since the transfer from the DuraCloud front-end to archival storage in Chronopolis can take significant time, it is currently up to the admin/curator to submit a 'snap-shot' of the space within DuraCloud and to monitor its successful transfer. Once transfer is complete the space should be deleted, at which point the Dataverse Software API call can be used to submit a Bag for other versions of the same Dataset. (The space is reused, so that archival copies of different Dataset versions correspond to different snapshots of the same DuraCloud space.). @@ -1256,9 +1261,10 @@ A batch version of this admin api call is also available: ``curl -X POST -H "X-Dataverse-key: " 'http://localhost:8080/api/admin/archiveAllUnarchivedDatasetVersions?listonly=true&limit=10&latestonly=true'`` -The archiveAllUnarchivedDatasetVersions call takes 3 optional configuration parameters. 
+The archiveAllUnarchivedDatasetVersions call takes 3 optional configuration parameters. + * listonly=true will cause the API to list dataset versions that would be archived but will not take any action. -* limit= will limit the number of dataset versions archived in one api call to <= . +* limit= will limit the number of dataset versions archived in one api call to ``<=`` . * latestonly=true will limit archiving to only the latest published versions of datasets instead of archiving all unarchived versions. Note that because archiving is done asynchronously, the calls above will return OK even if the user does not have the *PublishDataset* permission on the dataset(s) involved. Failures are indocated in the log and the archivalStatus calls in the native api can be used to check the status as well. From 648412e573726c4dc9f29c2bbbed472ec528c2f7 Mon Sep 17 00:00:00 2001 From: Philip Durbin Date: Thu, 27 Oct 2022 10:39:14 -0400 Subject: [PATCH 18/20] revert swap of `id` and `version` in submitDatasetVersionToArchive #9093 --- doc/sphinx-guides/source/installation/config.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index 866013db7b0..47fd92d8366 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -1240,7 +1240,7 @@ API Calls Once this configuration is complete, you, as a user with the *PublishDataset* permission, should be able to use the admin API call to manually submit a DatasetVersion for processing: -``curl -X POST -H "X-Dataverse-key: " http://localhost:8080/api/admin/submitDatasetVersionToArchive/{version}/{id}`` +``curl -X POST -H "X-Dataverse-key: " http://localhost:8080/api/admin/submitDatasetVersionToArchive/{id}/{version}`` where: From 2475bd8d099c2626ec166d1fe6a1abb9b802e3ca Mon Sep 17 00:00:00 2001 From: Philip Durbin Date: Thu, 27 Oct 2022 10:55:07 -0400 Subject: [PATCH 19/20] make a few more tweaks and fixed typos #9090 --- .../source/installation/config.rst | 20 +++++++++---------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index 47fd92d8366..bfcbd3d6325 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -1235,8 +1235,8 @@ The :S3ArchiverConfig setting is a JSON object that must include an "s3_bucket_n .. _Archiving API Call: -API Calls -+++++++++ +BagIt Export API Calls +++++++++++++++++++++++ Once this configuration is complete, you, as a user with the *PublishDataset* permission, should be able to use the admin API call to manually submit a DatasetVersion for processing: @@ -1244,31 +1244,29 @@ Once this configuration is complete, you, as a user with the *PublishDataset* pe where: -``{id}`` is the DatasetId, and +``{id}`` is the DatasetId (the database id of the dataset) and ``{version}`` is the friendly version number, e.g. "1.2". -or in place of the DatasetID, you can use ``:persistentId`` with the ``?persistentId=""``: +or in place of the DatasetID, you can use the string ``:persistentId`` as the ``{id}`` and add the DOI/PID as a query parameter like this: ``?persistentId=""``. 
Here is how the full command would look: ``curl -X POST -H "X-Dataverse-key: " http://localhost:8080/api/admin/submitDatasetVersionToArchive/:persistentId/{version}?persistentId=""`` +The submitDatasetVersionToArchive API (and the workflow discussed below) attempt to archive the dataset version via an archive specific method. For Chronopolis, a DuraCloud space named for the dataset (its DOI with ":" and "." replaced with "-", e.g. ``doi-10-5072-fk2-tgbhlb``) is created and two files are uploaded to it: a version-specific datacite.xml metadata file and a BagIt bag containing the data and an OAI-ORE map file. (The datacite.xml file, stored outside the Bag as well as inside, is intended to aid in discovery while the ORE map file is "complete", containing all user-entered metadata and is intended as an archival record.) -The submitDatasetVersionToArchive API (and the workflow discussed below) attempt to archive the dataset version via an archive specific method. For Chronopolis, a DuraCloud space named for the dataset (it's DOI with ':' and '.' replaced with '-') is created and two files are uploaded to it: a version-specific datacite.xml metadata file and a BagIt bag containing the data and an OAI-ORE map file. (The datacite.xml file, stored outside the Bag as well as inside is intended to aid in discovery while the ORE map file is 'complete', containing all user-entered metadata and is intended as an archival record.) +In the Chronopolis case, since the transfer from the DuraCloud front-end to archival storage in Chronopolis can take significant time, it is currently up to the admin/curator to submit a 'snap-shot' of the space within DuraCloud and to monitor its successful transfer. Once transfer is complete the space should be deleted, at which point the Dataverse Software API call can be used to submit a Bag for other versions of the same dataset. (The space is reused, so that archival copies of different dataset versions correspond to different snapshots of the same DuraCloud space.). -In the Chronopolis case, since the transfer from the DuraCloud front-end to archival storage in Chronopolis can take significant time, it is currently up to the admin/curator to submit a 'snap-shot' of the space within DuraCloud and to monitor its successful transfer. Once transfer is complete the space should be deleted, at which point the Dataverse Software API call can be used to submit a Bag for other versions of the same Dataset. (The space is reused, so that archival copies of different Dataset versions correspond to different snapshots of the same DuraCloud space.). - -A batch version of this admin api call is also available: +A batch version of this admin API call is also available: ``curl -X POST -H "X-Dataverse-key: " 'http://localhost:8080/api/admin/archiveAllUnarchivedDatasetVersions?listonly=true&limit=10&latestonly=true'`` The archiveAllUnarchivedDatasetVersions call takes 3 optional configuration parameters. * listonly=true will cause the API to list dataset versions that would be archived but will not take any action. -* limit= will limit the number of dataset versions archived in one api call to ``<=`` . +* limit= will limit the number of dataset versions archived in one API call to ``<=`` . * latestonly=true will limit archiving to only the latest published versions of datasets instead of archiving all unarchived versions. 
-Note that because archiving is done asynchronously, the calls above will return OK even if the user does not have the *PublishDataset* permission on the dataset(s) involved. Failures are indocated in the log and the archivalStatus calls in the native api can be used to check the status as well. +Note that because archiving is done asynchronously, the calls above will return OK even if the user does not have the *PublishDataset* permission on the dataset(s) involved. Failures are indicated in the log and the archivalStatus calls in the native API can be used to check the status as well. PostPublication Workflow ++++++++++++++++++++++++ From 48239b81e8ca3538a74ccc0d9efcdf550340f13c Mon Sep 17 00:00:00 2001 From: Philip Durbin Date: Thu, 27 Oct 2022 16:37:59 -0400 Subject: [PATCH 20/20] reorder client libraries, other tweaks #9070 --- .../source/api/client-libraries.rst | 44 ++++++++++--------- 1 file changed, 24 insertions(+), 20 deletions(-) diff --git a/doc/sphinx-guides/source/api/client-libraries.rst b/doc/sphinx-guides/source/api/client-libraries.rst index 9d653c549a6..bf9f658808b 100755 --- a/doc/sphinx-guides/source/api/client-libraries.rst +++ b/doc/sphinx-guides/source/api/client-libraries.rst @@ -1,48 +1,59 @@ Client Libraries ================ -Currently there are client libraries for Python, Javascript, R, Java, and Julia that can be used to develop against Dataverse Software APIs. We use the term "client library" on this page but "Dataverse Software SDK" (software development kit) is another way of describing these resources. They are designed to help developers express Dataverse Software concepts more easily in the languages listed below. For support on any of these client libraries, please consult each project's README. +Listed below are a variety of client libraries to help you interact with Dataverse APIs from Python, R, Javascript, etc. -Because a Dataverse installation is a SWORD server, additional client libraries exist for Java, Ruby, and PHP per the :doc:`/api/sword` page. +To get support for any of these client libraries, please consult each project's README. .. contents:: |toctitle| :local: C/C++ ----- https://github.com/aeonSolutions/OpenScience-Dataverse-API-C-library is the official C/C++ library for Dataverse APIs. This C/C++ library was created and is currently maintained by `Miguel T. `_ To learn how to install and use it, see the project's `wiki page `_. Java ---- https://github.com/IQSS/dataverse-client-java is the official Java library for Dataverse APIs. `Richard Adams `_ from `ResearchSpace `_ created and maintains this library. Javascript ---------- https://github.com/IQSS/dataverse-client-javascript is the official Javascript package for Dataverse APIs.
It can be found on npm at https://www.npmjs.com/package/js-dataverse +https://github.com/IQSS/dataverse-client-javascript is the official Javascript package for Dataverse APIs. It can be found on npm at https://www.npmjs.com/package/js-dataverse It was created and is maintained by `The Agile Monkeys `_. +Julia +----- + +https://github.com/gaelforget/Dataverse.jl is the official Julia package for Dataverse APIs. It can be found on JuliaHub (https://juliahub.com/ui/Packages/Dataverse/xWAqY/) and leverages pyDataverse to provide an interface to Dataverse's data access API and native API. Dataverse.jl provides a few additional functionalities with documentation (https://gaelforget.github.io/Dataverse.jl/dev/) and a demo notebook (https://gaelforget.github.io/Dataverse.jl/dev/notebook.html). + +It was created and is maintained by `Gael Forget `_. + +PHP +--- + +There is no official PHP library for Dataverse APIs (please :ref:`get in touch ` if you'd like to create one!) but there is a SWORD library written in PHP listed under :ref:`client-libraries` in the :doc:`/api/sword` documentation. + Python ------ -There are two Python modules for interacting with Dataverse Software APIs. +There are two Python modules for interacting with Dataverse APIs. `pyDataverse `_ primarily allows developers to manage Dataverse collections, datasets and datafiles. Its intention is to help with data migrations and DevOps activities such as testing and configuration management. The module is developed by `Stefan Kasberger `_ from `AUSSDA - The Austrian Social Science Data Archive `_. -`dataverse-client-python `_ had its initial release in 2015. `Robert Liebowitz `_ created this library while at the `Center for Open Science (COS) `_ and the COS uses it to integrate the `Open Science Framework (OSF) `_ with a Dataverse installation via an add-on which itself is open source and listed on the :doc:`/api/apps` page. +`dataverse-client-python `_ had its initial release in 2015. `Robert Liebowitz `_ created this library while at the `Center for Open Science (COS) `_ and the COS uses it to integrate the `Open Science Framework (OSF) `_ with Dataverse installations via an add-on which itself is open source and listed on the :doc:`/api/apps` page. R - -https://github.com/IQSS/dataverse-client-r is the official R package for Dataverse Software APIs. The latest release can be installed from `CRAN `_. +https://github.com/IQSS/dataverse-client-r is the official R package for Dataverse APIs. The latest release can be installed from `CRAN `_. The R client can search and download datasets. It is useful when automatically (instead of manually) downloading data files as part of a script. For bulk edit and upload operations, we currently recommend pyDataverse. The package is currently maintained by `Shiro Kuriwaki `_. It was originally created by `Thomas Leeper `_ and then formerly maintained by `Will Beasley `_. @@ -51,13 +62,6 @@ The package is currently maintained by `Shiro Kuriwaki `_.
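To give a concrete sense of what these client libraries enable, here is a minimal pyDataverse sketch that lists and downloads the files of a published dataset. It is an illustration only: the installation URL and DOI below are placeholders rather than values taken from the patches above, and an API token would be needed for draft or restricted data. ::

    from pyDataverse.api import DataAccessApi, NativeApi

    # Placeholder installation and dataset; substitute your own values.
    base_url = "https://demo.dataverse.org"
    doi = "doi:10.70122/FK2/EXAMPLE"

    api = NativeApi(base_url)          # pass an API token as the second argument if needed
    data_api = DataAccessApi(base_url)

    # Fetch the dataset metadata and walk the file list of its latest published version.
    dataset = api.get_dataset(doi)
    for entry in dataset.json()["data"]["latestVersion"]["files"]:
        datafile = entry["dataFile"]
        # is_pid=False because files are looked up by database id, not persistent id.
        response = data_api.get_datafile(datafile["id"], is_pid=False)
        with open(datafile["filename"], "wb") as handle:
            handle.write(response.content)

Equivalent operations are available in the other libraries listed above; pyDataverse is shown here because this page already recommends it for bulk operations.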