Merge pull request #9693 from IQSS/9692-files-api-extension-display-data
Dataset files API extension for file display data with pagination and sorting
kcondon authored Sep 21, 2023
2 parents 43220b4 + 2e2fb38 commit 3b7d824
Showing 15 changed files with 619 additions and 22 deletions.
7 changes: 7 additions & 0 deletions doc/release-notes/9692-files-api-extension.md
@@ -0,0 +1,7 @@
The following API endpoints have been added:

- /api/files/{id}/downloadCount
- /api/files/{id}/dataTables
- /access/datafile/{id}/userPermissions

The getVersionFiles endpoint (/api/datasets/{id}/versions/{versionId}/files) has been extended to support pagination and ordering.
16 changes: 16 additions & 0 deletions doc/sphinx-guides/source/api/dataaccess.rst
@@ -403,3 +403,19 @@ This method returns a list of Authenticated Users who have requested access to t
A curl example using an ``id``::

  curl -H "X-Dataverse-key:$API_TOKEN" -X GET http://$SERVER/api/access/datafile/{id}/listRequests

Get User Permissions on a File:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``/api/access/datafile/{id}/userPermissions``

This method returns the permissions that the calling user has on a particular file.

The response reports the following permissions, each as a boolean:

* ``canDownloadFile``: the user can download the file
* ``canEditOwnerDataset``: the user can edit the dataset that owns the file

A curl example using an ``id``::

  curl -H "X-Dataverse-key:$API_TOKEN" -X GET "http://$SERVER/api/access/datafile/{id}/userPermissions"
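
As a hedged illustration of how a Java client might prepare the same call, the sketch below builds the request without sending it. The class and method names are hypothetical, the server URL and token are placeholders, and Java 11 or later is assumed for ``java.net.http``.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of Dataverse: prepares the userPermissions request.
class UserPermissionsRequest {

    // Builds (but does not send) a GET request equivalent to the curl example.
    // serverUrl, fileId and apiToken are caller-supplied placeholders.
    static HttpRequest build(String serverUrl, long fileId, String apiToken) {
        URI uri = URI.create(serverUrl + "/api/access/datafile/" + fileId + "/userPermissions");
        return HttpRequest.newBuilder(uri)
                .header("X-Dataverse-key", apiToken) // same header as in the curl example
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("https://demo.dataverse.org", 24,
                "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with ``HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`` should yield a JSON body that carries the ``canDownloadFile`` and ``canEditOwnerDataset`` booleans; the surrounding response envelope is not shown here.
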
7 changes: 7 additions & 0 deletions doc/sphinx-guides/source/api/metrics.rst
@@ -163,3 +163,10 @@ The following table lists the available metrics endpoints (not including the Mak
/api/info/metrics/uniquefiledownloads/toMonth/{yyyy-MM},"count by id, pid","json, csv",collection subtree,published,y,cumulative up to month specified,unique download counts per file id to the specified month. PIDs are also included in output if they exist
/api/info/metrics/tree,"id, ownerId, alias, depth, name, children",json,collection subtree,published,y,"tree of dataverses starting at the root or a specified parentAlias with their id, owner id, alias, name, a computed depth, and array of children dataverses","underlying code can also include draft dataverses, this is not currently accessible via api, depth starts at 0"
/api/info/metrics/tree/toMonth/{yyyy-MM},"id, ownerId, alias, depth, name, children",json,collection subtree,published,y,"tree of dataverses in existence as of specified date starting at the root or a specified parentAlias with their id, owner id, alias, name, a computed depth, and array of children dataverses","underlying code can also include draft dataverses, this is not currently accessible via api, depth starts at 0"

Related API Endpoints
---------------------

The following endpoints are not under the metrics namespace but also return counts:

- :ref:`file-download-count`
102 changes: 102 additions & 0 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -958,6 +958,29 @@ The fully expanded example above (without environment variables) looks like this
  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files"

This endpoint supports optional pagination through the ``limit`` and ``offset`` query parameters:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files?limit=10&offset=20"

Results can optionally be sorted through the ``orderCriteria`` query parameter, which accepts the following values:

* ``NameAZ`` (Default)
* ``NameZA``
* ``Newest``
* ``Oldest``
* ``Size``
* ``Type``

These values are case sensitive. Any other value is rejected with a ``400 Bad Request`` error.

Usage example:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/datasets/24/versions/1.0/files?orderCriteria=Newest"

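The query parameters described above can be combined in a single request. The following sketch, which is only an illustration and not part of Dataverse, assembles such a URL from a zero-based page number. The class name, the ``build`` method and the page-number convention are assumptions of this example; the endpoint itself only knows ``limit`` and ``offset``.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side helper for the getVersionFiles endpoint.
class VersionFilesUrl {

    // Mirrors the documented orderCriteria values; the names are case sensitive.
    enum OrderCriteria { NameAZ, NameZA, Newest, Oldest, Size, Type }

    // Builds the URL for one page of results.
    // pageSize maps to "limit" and page * pageSize maps to "offset".
    // Passing a null order omits the parameter, so the server default applies.
    static String build(String serverUrl, String datasetId, String versionId,
                        int page, int pageSize, OrderCriteria order) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize must be > 0");
        }
        List<String> params = new ArrayList<>();
        params.add("limit=" + pageSize);
        params.add("offset=" + Math.multiplyExact(page, pageSize));
        if (order != null) {
            params.add("orderCriteria=" + order.name());
        }
        return serverUrl + "/api/datasets/" + datasetId + "/versions/" + versionId
                + "/files?" + String.join("&", params);
    }

    public static void main(String[] args) {
        System.out.println(build("https://demo.dataverse.org", "24", "1.0", 2, 10, OrderCriteria.Newest));
        try {
            // Enum lookup is case sensitive, so a lowercase name is not accepted.
            OrderCriteria.valueOf("newest");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: newest");
        }
    }
}
```

Validating the value against an enum on the client side catches misspelled criteria before a request is made.
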
View Dataset Files and Folders as a Directory Index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -2702,6 +2725,85 @@ The fully expanded example above (without environment variables) looks like this
Note: The ``id`` returned in the json response is the id of the file metadata version.

Getting File Data Tables
~~~~~~~~~~~~~~~~~~~~~~~~

This endpoint is intended for tabular files and returns a JSON representation of the data tables of an existing tabular file. The file can be identified either by ``ID``, its database id, or by ``PERSISTENT_ID``, its persistent id (DOI or Handle).

A curl example using an ``ID``:

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export ID=24

  curl "$SERVER_URL/api/files/$ID/dataTables"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/files/24/dataTables"

A curl example using a ``PERSISTENT_ID``:

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export PERSISTENT_ID=doi:10.5072/FK2/AAA000

  curl "$SERVER_URL/api/files/:persistentId/dataTables?persistentId=$PERSISTENT_ID"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/files/:persistentId/dataTables?persistentId=doi:10.5072/FK2/AAA000"

Note that if the requested file is not tabular, the endpoint will return an error.
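
The two ways of addressing a file can be sketched in Java as follows. This is a minimal illustration rather than Dataverse code: the class and method names are hypothetical, and percent-encoding the persistent id is an assumption of this example (the curl examples pass it unencoded).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds /api/files URLs for either addressing mode.
class FileEndpointUrl {

    // Addresses the file by database id, e.g. /api/files/24/dataTables.
    static String byId(String serverUrl, long fileId, String operation) {
        return serverUrl + "/api/files/" + fileId + "/" + operation;
    }

    // Addresses the file by persistent id using the literal ":persistentId"
    // path segment plus a "persistentId" query parameter.
    static String byPersistentId(String serverUrl, String persistentId, String operation) {
        // Assumption: the id is percent-encoded so that ":" and "/" are safe in a query string.
        String encoded = URLEncoder.encode(persistentId, StandardCharsets.UTF_8);
        return serverUrl + "/api/files/:persistentId/" + operation + "?persistentId=" + encoded;
    }

    public static void main(String[] args) {
        System.out.println(byId("https://demo.dataverse.org", 24, "dataTables"));
        System.out.println(byPersistentId("https://demo.dataverse.org", "doi:10.5072/FK2/AAA000", "dataTables"));
    }
}
```

The ``operation`` argument is a plain path segment, so the same helper could be reused for other file endpoints that follow this URL pattern.
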

.. _file-download-count:

Getting File Download Count
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Returns the download count for a particular file. The file can be identified either by ``ID``, its database id, or by ``PERSISTENT_ID``, its persistent id (DOI or Handle).

A curl example using an ``ID``:

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export ID=24

  curl "$SERVER_URL/api/files/$ID/downloadCount"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/files/24/downloadCount"

A curl example using a ``PERSISTENT_ID``:

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export PERSISTENT_ID=doi:10.5072/FK2/AAA000

  curl "$SERVER_URL/api/files/:persistentId/downloadCount?persistentId=$PERSISTENT_ID"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl "https://demo.dataverse.org/api/files/:persistentId/downloadCount?persistentId=doi:10.5072/FK2/AAA000"

If you are interested in download counts for multiple files, see :doc:`/api/metrics`.

Updating File Metadata
~~~~~~~~~~~~~~~~~~~~~~
@@ -48,7 +48,23 @@ public class DatasetVersionServiceBean implements java.io.Serializable {
    private static final Logger logger = Logger.getLogger(DatasetVersionServiceBean.class.getCanonicalName());

    private static final SimpleDateFormat logFormatter = new SimpleDateFormat("yyyy-MM-dd'T'HH-mm-ss");


    private static final String QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_LABEL = "SELECT fm FROM FileMetadata fm"
            + " WHERE fm.datasetVersion.id=:datasetVersionId"
            + " ORDER BY fm.label";
    private static final String QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_DATE = "SELECT fm FROM FileMetadata fm, DvObject dvo"
            + " WHERE fm.datasetVersion.id = :datasetVersionId"
            + " AND fm.dataFile.id = dvo.id"
            + " ORDER BY CASE WHEN dvo.publicationDate IS NOT NULL THEN dvo.publicationDate ELSE dvo.createDate END";
    private static final String QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_SIZE = "SELECT fm FROM FileMetadata fm, DataFile df"
            + " WHERE fm.datasetVersion.id = :datasetVersionId"
            + " AND fm.dataFile.id = df.id"
            + " ORDER BY df.filesize";
    private static final String QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_TYPE = "SELECT fm FROM FileMetadata fm, DataFile df"
            + " WHERE fm.datasetVersion.id = :datasetVersionId"
            + " AND fm.dataFile.id = df.id"
            + " ORDER BY df.contentType";

    @EJB
    DatasetServiceBean datasetService;


@@ -149,7 +165,19 @@ public DatasetVersion getDatasetVersion(){
            return this.datasetVersionForResponse;
        }
    } // end RetrieveDatasetVersionResponse


    /**
     * Different criteria to sort the results of FileMetadata queries used in {@link DatasetVersionServiceBean#getFileMetadatas}
     */
    public enum FileMetadatasOrderCriteria {
        NameAZ,
        NameZA,
        Newest,
        Oldest,
        Size,
        Type
    }

    public DatasetVersion find(Object pk) {
        return em.find(DatasetVersion.class, pk);
    }
@@ -1224,4 +1252,50 @@ public List<DatasetVersion> getUnarchivedDatasetVersions(){
            return null;
        }
    } // end getUnarchivedDatasetVersions

    /**
     * Returns a FileMetadata list of files in the specified DatasetVersion
     *
     * @param datasetVersion the DatasetVersion to access
     * @param limit for pagination, can be null
     * @param offset for pagination, can be null
     * @param orderCriteria a FileMetadatasOrderCriteria to order the results
     * @return a FileMetadata list of the specified DatasetVersion
     */
    public List<FileMetadata> getFileMetadatas(DatasetVersion datasetVersion, Integer limit, Integer offset, FileMetadatasOrderCriteria orderCriteria) {
        TypedQuery<FileMetadata> query = em.createQuery(getQueryStringFromFileMetadatasOrderCriteria(orderCriteria), FileMetadata.class)
                .setParameter("datasetVersionId", datasetVersion.getId());
        if (limit != null) {
            query.setMaxResults(limit);
        }
        if (offset != null) {
            query.setFirstResult(offset);
        }
        return query.getResultList();
    }

    private String getQueryStringFromFileMetadatasOrderCriteria(FileMetadatasOrderCriteria orderCriteria) {
        String queryString;
        switch (orderCriteria) {
            case NameZA:
                queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_LABEL + " DESC";
                break;
            case Newest:
                queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_DATE + " DESC";
                break;
            case Oldest:
                queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_DATE;
                break;
            case Size:
                queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_SIZE;
                break;
            case Type:
                queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_TYPE;
                break;
            default:
                queryString = QUERY_STR_FIND_ALL_FILE_METADATAS_ORDER_BY_LABEL;
                break;
        }
        return queryString;
    }
} // end class
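
To make the intended semantics of the ordering and pagination above easier to follow, here is a self-contained, in-memory sketch. It is not the service code: ``FileInfo`` is a hypothetical stand-in for the fields the queries sort on, ``skip`` and ``limit`` on a stream stand in for ``setFirstResult`` and ``setMaxResults``, and Java string ordering may differ from the collation a real database applies.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// In-memory illustration of the order criteria and limit/offset handling.
class FileOrderingSketch {

    enum OrderCriteria { NameAZ, NameZA, Newest, Oldest, Size, Type }

    // Hypothetical stand-in for FileMetadata and its DataFile.
    // "date" represents the publication date if present, otherwise the creation date.
    static final class FileInfo {
        final String label;
        final long date;
        final long size;
        final String contentType;

        FileInfo(String label, long date, long size, String contentType) {
            this.label = label;
            this.date = date;
            this.size = size;
            this.contentType = contentType;
        }
    }

    // Maps each criterion to a comparator, with NameAZ as the fallback,
    // in the same spirit as the switch statement in the service bean.
    static Comparator<FileInfo> comparatorFor(OrderCriteria criteria) {
        switch (criteria) {
            case NameZA:
                return Comparator.comparing((FileInfo f) -> f.label).reversed();
            case Newest:
                return Comparator.comparingLong((FileInfo f) -> f.date).reversed();
            case Oldest:
                return Comparator.comparingLong((FileInfo f) -> f.date);
            case Size:
                return Comparator.comparingLong((FileInfo f) -> f.size);
            case Type:
                return Comparator.comparing((FileInfo f) -> f.contentType);
            default:
                return Comparator.comparing((FileInfo f) -> f.label);
        }
    }

    // Sorts first, then applies offset and limit; null means "not set".
    static List<FileInfo> page(List<FileInfo> files, OrderCriteria criteria, Integer limit, Integer offset) {
        return files.stream()
                .sorted(comparatorFor(criteria))
                .skip(offset != null ? offset : 0)
                .limit(limit != null ? limit : Long.MAX_VALUE)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FileInfo> files = List.of(
                new FileInfo("a.txt", 3L, 10L, "text/plain"),
                new FileInfo("b.csv", 1L, 30L, "text/csv"),
                new FileInfo("c.png", 2L, 20L, "image/png"));
        for (FileInfo f : page(files, OrderCriteria.Newest, 2, 1)) {
            System.out.println(f.label);
        }
    }
}
```

Because sorting happens before the offset and limit are applied, consecutive pages requested with the same criterion do not overlap as long as the underlying data does not change between requests.
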
@@ -8,6 +8,7 @@
import edu.harvard.iq.dataverse.authorization.users.User;
import edu.harvard.iq.dataverse.dataaccess.DataAccess;
import edu.harvard.iq.dataverse.dataaccess.StorageIO;
import edu.harvard.iq.dataverse.engine.command.DataverseRequest;
import edu.harvard.iq.dataverse.engine.command.exception.CommandException;
import edu.harvard.iq.dataverse.engine.command.impl.CreateGuestbookResponseCommand;
import edu.harvard.iq.dataverse.engine.command.impl.RequestAccessCommand;
@@ -571,5 +572,15 @@ public String getDirectStorageLocatrion(String storageLocation) {

        return null;
    }


    /**
     * Checks if the DataverseRequest, which contains IP Groups, has permission to download the file
     *
     * @param dataverseRequest the DataverseRequest
     * @param dataFile the DataFile to check permissions
     * @return boolean
     */
    public boolean canDownloadFile(DataverseRequest dataverseRequest, DataFile dataFile) {
        return permissionService.requestOn(dataverseRequest, dataFile).has(Permission.DownloadFile);
    }
}
19 changes: 18 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/api/Access.java
@@ -1945,5 +1945,22 @@ private URI handleCustomZipDownload(User user, String customZipServiceUrl, Strin
             throw new BadRequestException();
         }
         return redirectUri;
-    }
+    }
+
+    @GET
+    @AuthRequired
+    @Path("/datafile/{id}/userPermissions")
+    public Response getUserPermissionsOnFile(@Context ContainerRequestContext crc, @PathParam("id") String dataFileId) {
+        DataFile dataFile;
+        try {
+            dataFile = findDataFileOrDie(dataFileId);
+        } catch (WrappedResponse wr) {
+            return wr.getResponse();
+        }
+        JsonObjectBuilder jsonObjectBuilder = Json.createObjectBuilder();
+        User requestUser = getRequestUser(crc);
+        jsonObjectBuilder.add("canDownloadFile", fileDownloadService.canDownloadFile(createDataverseRequest(requestUser), dataFile));
+        jsonObjectBuilder.add("canEditOwnerDataset", permissionService.userOn(requestUser, dataFile.getOwner()).has(Permission.EditDataset));
+        return ok(jsonObjectBuilder);
+    }
 }
14 changes: 11 additions & 3 deletions src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
@@ -488,9 +488,17 @@ public Response getVersion(@Context ContainerRequestContext crc, @PathParam("id"
     @GET
     @AuthRequired
     @Path("{id}/versions/{versionId}/files")
-    public Response getVersionFiles(@Context ContainerRequestContext crc, @PathParam("id") String datasetId, @PathParam("versionId") String versionId, @Context UriInfo uriInfo, @Context HttpHeaders headers) {
-        return response( req -> ok( jsonFileMetadatas(
-                getDatasetVersionOrDie(req, versionId, findDatasetOrDie(datasetId), uriInfo, headers).getFileMetadatas())), getRequestUser(crc));
+    public Response getVersionFiles(@Context ContainerRequestContext crc, @PathParam("id") String datasetId, @PathParam("versionId") String versionId, @QueryParam("limit") Integer limit, @QueryParam("offset") Integer offset, @QueryParam("orderCriteria") String orderCriteria, @Context UriInfo uriInfo, @Context HttpHeaders headers) {
+        return response( req -> {
+            DatasetVersion datasetVersion = getDatasetVersionOrDie(req, versionId, findDatasetOrDie(datasetId), uriInfo, headers);
+            DatasetVersionServiceBean.FileMetadatasOrderCriteria fileMetadatasOrderCriteria;
+            try {
+                fileMetadatasOrderCriteria = orderCriteria != null ? DatasetVersionServiceBean.FileMetadatasOrderCriteria.valueOf(orderCriteria) : DatasetVersionServiceBean.FileMetadatasOrderCriteria.NameAZ;
+            } catch (IllegalArgumentException e) {
+                return error(Response.Status.BAD_REQUEST, "Invalid order criteria: " + orderCriteria);
+            }
+            return ok(jsonFileMetadatas(datasetversionService.getFileMetadatas(datasetVersion, limit, offset, fileMetadatasOrderCriteria)));
+        }, getRequestUser(crc));
     }

     @GET