Dataset files API extension for file display data with pagination and sorting #9693

Merged
merged 45 commits into from
Sep 21, 2023
45 commits
73c3235
Stash: getVersionFiles API extension with pagination and order criter…
GPortas Jul 4, 2023
dc31788
Stash: IT tests for getVersionFiles API endpoint WIP
GPortas Jul 4, 2023
fff2119
Added: IT tests for getVersionFiles API endpoint
GPortas Jul 4, 2023
e8951a4
Removed: unused imports
GPortas Jul 5, 2023
6264fa4
Added: publication date field to data file payload
GPortas Jul 5, 2023
639cff8
Added: getCountGuestbookResponsesByDataFileId API endpoint
GPortas Jul 6, 2023
886a508
Added: canDownloadFile method to FileDownloadServiceBean
GPortas Jul 6, 2023
6ead834
Added: canDataFileBeDownloaded API endpoint
GPortas Jul 7, 2023
9f35bf7
Added: naming refactor and managing not found files in new files API …
GPortas Jul 9, 2023
a2bc4d4
Removed: not essential findDataFileOrDie call to avoid extra query
GPortas Jul 9, 2023
ec75534
Added: getFileThumbnailClass API endpoint and enhanced test coverage …
GPortas Jul 10, 2023
86865f5
Added: getCountGuestbookResponses PIDs support and param format and d…
GPortas Jul 10, 2023
31c306d
Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-a…
GPortas Jul 17, 2023
489b990
Added: dataTables field to DataFile JSON API payload
GPortas Jul 17, 2023
8c2781b
Added: release notes for 9692
GPortas Jul 17, 2023
867fb8a
Added: endpoint for getting file data tables and missing authenticati…
GPortas Jul 18, 2023
f2b374e
Added: new endpoint to the release notes
GPortas Jul 18, 2023
6437e5d
Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-a…
GPortas Jul 18, 2023
2cb77d4
Changed: more realistic test tab file content
GPortas Jul 25, 2023
c86a1d7
Refactor: file metadatas query formatting
GPortas Jul 25, 2023
742036a
Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-a…
GPortas Jul 25, 2023
991641b
Refactor: using TypedQuery instead of Query for getFileMetadatas
GPortas Jul 25, 2023
1ff9d90
Refactor: getVersionFiles endpoint invalid orderCriteria error handling
GPortas Jul 27, 2023
a8a367a
Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-a…
GPortas Aug 1, 2023
7a9c2b3
Removed: canFileBeDownloaded endpoint and related logic
GPortas Aug 1, 2023
b52b33d
Added: getUserPermissionsOnFile endpoint to Access API
GPortas Aug 2, 2023
d16264d
Removed: getFileThumbnailClass API endpoint
GPortas Aug 4, 2023
0952a62
Changed: release notes
GPortas Aug 4, 2023
02cecd4
Added: docs for files endpoint pagination and order criteria
GPortas Aug 4, 2023
6722777
Added: docs for file dataTables API endpoint
GPortas Aug 4, 2023
90ef2f1
Added: docs for guestbookResponses/count files API endpoint
GPortas Aug 4, 2023
a7b2485
Changed: version files endpoint docs tweak
GPortas Aug 4, 2023
cd8e229
wording #9692
pdurbin Aug 4, 2023
a64199f
Added: API docs for access file user permissions
GPortas Aug 9, 2023
24944eb
Merge branch '9692-files-api-extension-display-data' of github.com:IQ…
GPortas Aug 9, 2023
719fc67
Changed: guestbookResponses/count endpoint renamed to downloadCount
GPortas Aug 9, 2023
1224311
doc tweaks #9692
pdurbin Aug 9, 2023
0c0ddae
Merge branch 'develop' into 9692-files-api-extension-display-data #9692
pdurbin Aug 9, 2023
4b276db
Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-a…
GPortas Aug 16, 2023
6b59a5b
Merge branch '9692-files-api-extension-display-data' of github.com:IQ…
GPortas Aug 16, 2023
35e4547
Fixed: missing import removed by mistake after develop merge
GPortas Aug 16, 2023
6b45d82
Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-a…
GPortas Aug 24, 2023
0df8ca7
Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-a…
GPortas Sep 8, 2023
6b19648
Merge branch 'develop' of github.com:IQSS/dataverse into 9692-files-a…
GPortas Sep 12, 2023
2e2fb38
Added: getFileDataTables endpoint permission checks for restricted an…
GPortas Sep 20, 2023
7 changes: 7 additions & 0 deletions doc/release-notes/9692-files-api-extension.md
@@ -0,0 +1,7 @@
The following API endpoints have been added:

- /api/files/{id}/downloadCount
- /api/files/{id}/dataTables
- /access/datafile/{id}/userPermissions

The getVersionFiles endpoint (/api/datasets/{id}/versions/{versionId}/files) has been extended to support pagination and ordering.
Member

This addition should be documented in the API guide.

Contributor Author

Done.

Member

This one is not done, sorry: 😄

  • /access/datafile/{id}/userPermissions

Contributor Author

Added now, sorry.

Member

Yes, looks good, thanks!

16 changes: 16 additions & 0 deletions doc/sphinx-guides/source/api/dataaccess.rst
@@ -403,3 +403,19 @@ This method returns a list of Authenticated Users who have requested access to t
A curl example using an ``id``::

  curl -H "X-Dataverse-key:$API_TOKEN" -X GET http://$SERVER/api/access/datafile/{id}/listRequests

Get User Permissions on a File:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``/api/access/datafile/{id}/userPermissions``

This method returns the permissions that the calling user has on a particular file.

The method checks the following permissions and returns each one as a boolean:

* Can download the file
* Can edit the file owner dataset

A curl example using an ``id``::

  curl -H "X-Dataverse-key:$API_TOKEN" -X GET "http://$SERVER/api/access/datafile/{id}/userPermissions"
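
For readers calling this endpoint from code rather than curl, the request above could be built roughly as follows. This is a minimal sketch, not part of the PR: the class name `UserPermissionsRequest` and its parameters are hypothetical, and it only constructs the request with the JDK's `java.net.http` API without sending it. The response keys mentioned in the comment (`canDownloadFile`, `canEditOwnerDataset`) are taken from the `Access.java` change in this PR.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, for illustration only.
final class UserPermissionsRequest {

    // Builds (but does not send) a GET request equivalent to the curl example.
    // The endpoint is expected to answer with the booleans
    // "canDownloadFile" and "canEditOwnerDataset".
    static HttpRequest build(String serverUrl, String fileId, String apiToken) {
        return HttpRequest.newBuilder()
                .uri(URI.create(serverUrl + "/api/access/datafile/" + fileId + "/userPermissions"))
                .header("X-Dataverse-key", apiToken)
                .GET()
                .build();
    }
}
```

The request can then be passed to `HttpClient.send` with a string body handler; how the JSON is parsed is left to whichever JSON library the caller already uses.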
7 changes: 7 additions & 0 deletions doc/sphinx-guides/source/api/metrics.rst
@@ -163,3 +163,10 @@ The following table lists the available metrics endpoints (not including the Mak
/api/info/metrics/uniquefiledownloads/toMonth/{yyyy-MM},"count by id, pid","json, csv",collection subtree,published,y,cumulative up to month specified,unique download counts per file id to the specified month. PIDs are also included in output if they exist
/api/info/metrics/tree,"id, ownerId, alias, depth, name, children",json,collection subtree,published,y,"tree of dataverses starting at the root or a specified parentAlias with their id, owner id, alias, name, a computed depth, and array of children dataverses","underlying code can also include draft dataverses, this is not currently accessible via api, depth starts at 0"
/api/info/metrics/tree/toMonth/{yyyy-MM},"id, ownerId, alias, depth, name, children",json,collection subtree,published,y,"tree of dataverses in existence as of specified date starting at the root or a specified parentAlias with their id, owner id, alias, name, a computed depth, and array of children dataverses","underlying code can also include draft dataverses, this is not currently accessible via api, depth starts at 0"

Related API Endpoints
---------------------

The following endpoints are not under the metrics namespace but also return counts:

- :ref:`file-download-count`
102 changes: 102 additions & 0 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -958,6 +958,29 @@ The fully expanded example above (without environment variables) looks like this

curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files"

This endpoint supports optional pagination, through the ``limit`` and ``offset`` query params:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files?limit=10&offset=20"

Results can optionally be sorted through the ``orderCriteria`` query param, which accepts the following values:

* ``NameAZ`` (Default)
* ``NameZA``
* ``Newest``
* ``Oldest``
* ``Size``
* ``Type``

Please note that these values are case sensitive and must be correctly typed for the endpoint to recognize them.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files?orderCriteria=Newest"
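
Since all three query params are optional, a client has to append only the ones it actually sets. The sketch below shows one way to assemble the URL; the class `VersionFilesUrl` and its method are hypothetical names chosen for illustration, and only the path and the param names (`limit`, `offset`, `orderCriteria`) come from the documentation above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class VersionFilesUrl {

    // Returns the getVersionFiles URL, adding only the params that are non-null.
    static String build(String serverUrl, String datasetId, String versionId,
                        Integer limit, Integer offset, String orderCriteria) {
        List<String> params = new ArrayList<>();
        if (limit != null) {
            params.add("limit=" + limit);
        }
        if (offset != null) {
            params.add("offset=" + offset);
        }
        if (orderCriteria != null) {
            // Must match one of the documented values exactly (case sensitive).
            params.add("orderCriteria=" + orderCriteria);
        }
        String base = serverUrl + "/api/datasets/" + datasetId + "/versions/" + versionId + "/files";
        return params.isEmpty() ? base : base + "?" + String.join("&", params);
    }
}
```

To walk through a dataset page by page, a caller would keep `limit` fixed and pass `offset = pageIndex * limit` for each zero-based page.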

View Dataset Files and Folders as a Directory Index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -2702,6 +2725,85 @@ The fully expanded example above (without environment variables) looks like this

Note: The ``id`` returned in the json response is the id of the file metadata version.

Getting File Data Tables
~~~~~~~~~~~~~~~~~~~~~~~~

This endpoint is oriented toward tabular files and provides a JSON representation of the file data tables for an existing tabular file. ``ID`` is the database id of the file to get the data tables from or ``PERSISTENT_ID`` is the persistent id (DOI or Handle) of the file.

A curl example using an ``ID``

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export ID=24

  curl $SERVER_URL/api/files/$ID/dataTables

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl https://demo.dataverse.org/api/files/24/dataTables

A curl example using a ``PERSISTENT_ID``

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export PERSISTENT_ID=doi:10.5072/FK2/AAA000

  curl "$SERVER_URL/api/files/:persistentId/dataTables?persistentId=$PERSISTENT_ID"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/files/:persistentId/dataTables?persistentId=doi:10.5072/FK2/AAA000"

Note that if the requested file is not tabular, the endpoint will return an error.

.. _file-download-count:

Getting File Download Count
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Provides the download count for a particular file, where ``ID`` is the database id of the file to get the download count from or ``PERSISTENT_ID`` is the persistent id (DOI or Handle) of the file.

A curl example using an ``ID``

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export ID=24

  curl "$SERVER_URL/api/files/$ID/downloadCount"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/files/24/downloadCount"

A curl example using a ``PERSISTENT_ID``

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export PERSISTENT_ID=doi:10.5072/FK2/AAA000

  curl "$SERVER_URL/api/files/:persistentId/downloadCount?persistentId=$PERSISTENT_ID"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/files/:persistentId/downloadCount?persistentId=doi:10.5072/FK2/AAA000"

If you are interested in download counts for multiple files, see :doc:`/api/metrics`.
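
The two addressing modes documented above (database id in the path, or the literal ``:persistentId`` segment plus a ``persistentId`` query param) can be sketched in client code as follows. The class `DownloadCountUri` is a hypothetical name, not something this PR adds. Unlike the curl examples, the sketch percent-encodes the PID; that assumes the server decodes query params in the usual way.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class DownloadCountUri {

    // Addresses the file by its database id.
    static URI byDatabaseId(String serverUrl, long id) {
        return URI.create(serverUrl + "/api/files/" + id + "/downloadCount");
    }

    // Addresses the file by its persistent id (DOI or Handle).
    // The PID is percent-encoded so that ':' and '/' are safe in the query string.
    static URI byPersistentId(String serverUrl, String persistentId) {
        String encoded = URLEncoder.encode(persistentId, StandardCharsets.UTF_8);
        return URI.create(serverUrl + "/api/files/:persistentId/downloadCount?persistentId=" + encoded);
    }
}
```

The same pattern would apply to the ``dataTables`` endpoint by swapping the final path segment.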

Updating File Metadata
~~~~~~~~~~~~~~~~~~~~~~
@@ -48,7 +48,23 @@ public class DatasetVersionServiceBean implements java.io.Serializable {
private static final Logger logger = Logger.getLogger(DatasetVersionServiceBean.class.getCanonicalName());

private static final SimpleDateFormat logFormatter = new SimpleDateFormat("yyyy-MM-dd'T'HH-mm-ss");


private static final String QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_LABEL = "SELECT fm FROM FileMetadata fm"
        + " WHERE fm.datasetVersion.id=:datasetVersionId"
        + " ORDER BY fm.label";
private static final String QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_DATE = "SELECT fm FROM FileMetadata fm, DvObject dvo"
        + " WHERE fm.datasetVersion.id = :datasetVersionId"
        + " AND fm.dataFile.id = dvo.id"
        + " ORDER BY CASE WHEN dvo.publicationDate IS NOT NULL THEN dvo.publicationDate ELSE dvo.createDate END";
private static final String QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_SIZE = "SELECT fm FROM FileMetadata fm, DataFile df"
        + " WHERE fm.datasetVersion.id = :datasetVersionId"
        + " AND fm.dataFile.id = df.id"
        + " ORDER BY df.filesize";
private static final String QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_TYPE = "SELECT fm FROM FileMetadata fm, DataFile df"
        + " WHERE fm.datasetVersion.id = :datasetVersionId"
        + " AND fm.dataFile.id = df.id"
        + " ORDER BY df.contentType";

@EJB
DatasetServiceBean datasetService;

@@ -149,7 +165,19 @@ public DatasetVersion getDatasetVersion(){
return this.datasetVersionForResponse;
}
} // end RetrieveDatasetVersionResponse


/**
 * Different criteria to sort the results of FileMetadata queries used in {@link DatasetVersionServiceBean#getFileMetadatas}
 */
public enum FileMetadatasOrderCriteria {
    NameAZ,
    NameZA,
    Newest,
    Oldest,
    Size,
    Type
}

public DatasetVersion find(Object pk) {
return em.find(DatasetVersion.class, pk);
}
@@ -1224,4 +1252,50 @@ public List<DatasetVersion> getUnarchivedDatasetVersions(){
return null;
}
} // end getUnarchivedDatasetVersions

/**
 * Returns a FileMetadata list of files in the specified DatasetVersion
 *
 * @param datasetVersion the DatasetVersion to access
 * @param limit for pagination, can be null
 * @param offset for pagination, can be null
 * @param orderCriteria a FileMetadatasOrderCriteria to order the results
 * @return a FileMetadata list of the specified DatasetVersion
 */
public List<FileMetadata> getFileMetadatas(DatasetVersion datasetVersion, Integer limit, Integer offset, FileMetadatasOrderCriteria orderCriteria) {
    TypedQuery<FileMetadata> query = em.createQuery(getQueryStringFromFileMetadatasOrderCriteria(orderCriteria), FileMetadata.class)
            .setParameter("datasetVersionId", datasetVersion.getId());
    if (limit != null) {
        query.setMaxResults(limit);
    }
    if (offset != null) {
        query.setFirstResult(offset);
    }
    return query.getResultList();
}

private String getQueryStringFromFileMetadatasOrderCriteria(FileMetadatasOrderCriteria orderCriteria) {
    String queryString;
    switch (orderCriteria) {
        case NameZA:
            queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_LABEL + " DESC";
            break;
        case Newest:
            queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_DATE + " DESC";
            break;
        case Oldest:
            queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_DATE;
            break;
        case Size:
            queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_SIZE;
            break;
        case Type:
            queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_TYPE;
            break;
        default:
            queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_LABEL;
            break;
    }
    return queryString;
}
} // end class
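
To make the ordering and pagination semantics of `getFileMetadatas` easier to reason about, here is an in-memory sketch of what the queries above appear to compute. It is not the PR's implementation, which delegates sorting and paging to the database. `FileOrderingSketch` and `FileRow` are hypothetical stand-ins, dates are plain epoch-style numbers, and Java's `String` ordering may differ from the database collation.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical in-memory model, for illustration only.
final class FileOrderingSketch {

    enum OrderCriteria { NameAZ, NameZA, Newest, Oldest, Size, Type }

    // Stand-in for the fields the queries sort on.
    static final class FileRow {
        final String label;
        final Long publicationDate; // null while unpublished
        final long createDate;
        final long filesize;
        final String contentType;

        FileRow(String label, Long publicationDate, long createDate, long filesize, String contentType) {
            this.label = label;
            this.publicationDate = publicationDate;
            this.createDate = createDate;
            this.filesize = filesize;
            this.contentType = contentType;
        }

        // Mirrors the CASE expression: publication date if present, else creation date.
        long effectiveDate() {
            return publicationDate != null ? publicationDate : createDate;
        }
    }

    // Mirrors the switch in getQueryStringFromFileMetadatasOrderCriteria.
    static Comparator<FileRow> comparatorFor(OrderCriteria criteria) {
        switch (criteria) {
            case NameZA:
                return Comparator.comparing((FileRow r) -> r.label).reversed();
            case Newest:
                return Comparator.comparingLong((FileRow r) -> r.effectiveDate()).reversed();
            case Oldest:
                return Comparator.comparingLong((FileRow r) -> r.effectiveDate());
            case Size:
                return Comparator.comparingLong((FileRow r) -> r.filesize);
            case Type:
                return Comparator.comparing((FileRow r) -> r.contentType);
            default:
                return Comparator.comparing((FileRow r) -> r.label);
        }
    }

    // offset plays the role of setFirstResult, limit the role of setMaxResults;
    // either may be null, in which case it is not applied.
    static List<String> page(List<FileRow> rows, OrderCriteria criteria, Integer limit, Integer offset) {
        return rows.stream()
                .sorted(comparatorFor(criteria))
                .skip(offset != null ? offset : 0)
                .limit(limit != null ? limit : Long.MAX_VALUE)
                .map(r -> r.label)
                .collect(Collectors.toList());
    }
}
```

Note that `Size` and `Type` sort ascending only, and that an unpublished file is ordered by its creation date under `Newest` and `Oldest`.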
@@ -8,6 +8,7 @@
import edu.harvard.iq.dataverse.authorization.users.User;
import edu.harvard.iq.dataverse.dataaccess.DataAccess;
import edu.harvard.iq.dataverse.dataaccess.StorageIO;
import edu.harvard.iq.dataverse.engine.command.DataverseRequest;
import edu.harvard.iq.dataverse.engine.command.exception.CommandException;
import edu.harvard.iq.dataverse.engine.command.impl.CreateGuestbookResponseCommand;
import edu.harvard.iq.dataverse.engine.command.impl.RequestAccessCommand;
@@ -571,5 +572,15 @@ public String getDirectStorageLocatrion(String storageLocation) {

return null;
}


/**
 * Checks if the DataverseRequest, which contains IP Groups, has permission to download the file
 *
 * @param dataverseRequest the DataverseRequest
 * @param dataFile the DataFile to check permissions
 * @return boolean
 */
public boolean canDownloadFile(DataverseRequest dataverseRequest, DataFile dataFile) {
    return permissionService.requestOn(dataverseRequest, dataFile).has(Permission.DownloadFile);
}
}
19 changes: 18 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/api/Access.java
@@ -1945,5 +1945,22 @@ private URI handleCustomZipDownload(User user, String customZipServiceUrl, Strin
throw new BadRequestException();
}
return redirectUri;
}
}

@GET
@AuthRequired
@Path("/datafile/{id}/userPermissions")
public Response getUserPermissionsOnFile(@Context ContainerRequestContext crc, @PathParam("id") String dataFileId) {
    DataFile dataFile;
    try {
        dataFile = findDataFileOrDie(dataFileId);
    } catch (WrappedResponse wr) {
        return wr.getResponse();
    }
    JsonObjectBuilder jsonObjectBuilder = Json.createObjectBuilder();
    User requestUser = getRequestUser(crc);
    jsonObjectBuilder.add("canDownloadFile", fileDownloadService.canDownloadFile(createDataverseRequest(requestUser), dataFile));
    jsonObjectBuilder.add("canEditOwnerDataset", permissionService.userOn(requestUser, dataFile.getOwner()).has(Permission.EditDataset));
    return ok(jsonObjectBuilder);
}
}
14 changes: 11 additions & 3 deletions src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
@@ -488,9 +488,17 @@ public Response getVersion(@Context ContainerRequestContext crc, @PathParam("id"
 @GET
 @AuthRequired
 @Path("{id}/versions/{versionId}/files")
-public Response getVersionFiles(@Context ContainerRequestContext crc, @PathParam("id") String datasetId, @PathParam("versionId") String versionId, @Context UriInfo uriInfo, @Context HttpHeaders headers) {
-    return response( req -> ok( jsonFileMetadatas(
-            getDatasetVersionOrDie(req, versionId, findDatasetOrDie(datasetId), uriInfo, headers).getFileMetadatas())), getRequestUser(crc));
+public Response getVersionFiles(@Context ContainerRequestContext crc, @PathParam("id") String datasetId, @PathParam("versionId") String versionId, @QueryParam("limit") Integer limit, @QueryParam("offset") Integer offset, @QueryParam("orderCriteria") String orderCriteria, @Context UriInfo uriInfo, @Context HttpHeaders headers) {
+    return response( req -> {
+        DatasetVersion datasetVersion = getDatasetVersionOrDie(req, versionId, findDatasetOrDie(datasetId), uriInfo, headers);
+        DatasetVersionServiceBean.FileMetadatasOrderCriteria fileMetadatasOrderCriteria;
+        try {
+            fileMetadatasOrderCriteria = orderCriteria != null ? DatasetVersionServiceBean.FileMetadatasOrderCriteria.valueOf(orderCriteria) : DatasetVersionServiceBean.FileMetadatasOrderCriteria.NameAZ;
+        } catch (IllegalArgumentException e) {
+            return error(Response.Status.BAD_REQUEST, "Invalid order criteria: " + orderCriteria);
+        }
+        return ok(jsonFileMetadatas(datasetversionService.getFileMetadatas(datasetVersion, limit, offset, fileMetadatasOrderCriteria)));
+    }, getRequestUser(crc));
 }
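
The endpoint's handling of `orderCriteria` relies on `Enum.valueOf`, which matches constant names exactly. That is why the docs describe the values as case sensitive. The standalone sketch below isolates that behaviour. `OrderCriteriaParsing` is a hypothetical class with a local copy of the enum, not code from the PR, where an invalid value is turned into a 400 response instead.

```java
// Hypothetical standalone sketch, for illustration only.
final class OrderCriteriaParsing {

    enum OrderCriteria { NameAZ, NameZA, Newest, Oldest, Size, Type }

    // A missing param falls back to NameAZ; anything else must match a constant
    // name exactly, otherwise valueOf throws IllegalArgumentException.
    static OrderCriteria parse(String raw) {
        if (raw == null) {
            return OrderCriteria.NameAZ;
        }
        return OrderCriteria.valueOf(raw);
    }

    // Convenience check a client could run before sending a request.
    static boolean isValid(String raw) {
        try {
            parse(raw);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With this logic, `Newest` is accepted while `newest` or an empty string is rejected, and an omitted param yields the default name ordering.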

@GET