Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rest: add sharing parameters to get_workflows #545

Closed
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 27 additions & 0 deletions docs/openapi.json
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,27 @@
"name": "workflow_id_or_name",
"required": false,
"type": "string"
},
{
"description": "Optional flag to list all shared (owned and unowned) workflows.",
"in": "query",
"name": "shared",
"required": false,
"type": "boolean"
},
{
"description": "Optional argument to list workflows shared by the specified user(s).",
"in": "query",
"name": "shared_by",
"required": false,
"type": "string"
},
{
"description": "Optional argument to list workflows shared with the specified user(s).",
"in": "query",
"name": "shared_with",
"required": false,
"type": "string"
}
],
"produces": [
Expand Down Expand Up @@ -167,9 +188,15 @@
"name": {
"type": "string"
},
"owner_email": {
"type": "string"
},
"progress": {
"type": "object"
},
"shared_with": {
"type": "string"
},
"size": {
"properties": {
"human_readable": {
Expand Down
140 changes: 137 additions & 3 deletions reana_workflow_controller/rest/workflows.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,13 +13,14 @@
import logging
import re
from typing import Optional
from uuid import uuid4
from uuid import UUID, uuid4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think UUID is not needed?


from flask import Blueprint, jsonify, request
from reana_commons.config import WORKFLOW_TIME_FORMAT
from reana_db.database import Session
from reana_db.models import RunStatus, User, UserWorkflow, Workflow, WorkflowResource
from reana_db.utils import (
_get_workflow_by_uuid,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not needed?

_get_workflow_with_uuid_or_name,
build_workspace_path,
get_default_quota_resource,
Expand Down Expand Up @@ -64,6 +65,9 @@
"user": fields.String(required=True),
"verbose": fields.Bool(missing=False),
"workflow_id_or_name": fields.String(),
"shared": fields.Bool(missing=False),
"shared_by": fields.String(),
"shared_with": fields.DelimitedList(fields.String()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should multiple values of shared_with be interpreted as the workflows shared with all the given users or with any of them? This also changes a bit the semantics of anybody and nobody

},
location="query",
)
Expand Down Expand Up @@ -138,6 +142,21 @@ def get_workflows(args, paginate=None): # noqa
description: Optional analysis UUID or name to filter.
required: false
type: string
- name: shared
in: query
description: Optional flag to list all shared (owned and unowned) workflows.
required: false
type: boolean
- name: shared_by
in: query
description: Optional argument to list workflows shared by the specified user(s).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one user can be specified, so user(s) -> user

required: false
type: string
- name: shared_with
in: query
description: Optional argument to list workflows shared with the specified user(s).
required: false
type: string
responses:
200:
description: >-
Expand Down Expand Up @@ -175,6 +194,10 @@ def get_workflows(args, paginate=None): # noqa
launcher_url:
type: string
x-nullable: true
owner_email:
type: string
shared_with:
type: string
Comment on lines +199 to +200
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a list, not a string

examples:
application/json:
[
Expand Down Expand Up @@ -258,14 +281,90 @@ def get_workflows(args, paginate=None): # noqa
include_progress: bool = args.get("include_progress", verbose)
include_workspace_size: bool = args.get("include_workspace_size", verbose)
workflow_id_or_name: Optional[str] = args.get("workflow_id_or_name")
shared: bool = args.get("shared")
shared_by: Optional[str] = args.get("shared_by")
shared_with: Optional[str] = args.get("shared_with")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be List[str] and not Optional[str]


try:
if shared_by and shared_with:
return (
jsonify(
{
"message": "You cannot filter by shared_by and shared_with at the same time."
}
),
400,
)

user = User.query.filter(User.id_ == user_uuid).first()
if not user:
return jsonify({"message": "User {} does not exist".format(user_uuid)}), 404

workflows = []
query = user.workflows

if not shared and not shared_with and not shared_by:
# default case: retrieve owned workflows
query = user.workflows
else:
if shared or shared_by:
# retrieve owned workflows and unowned workflows shared by others
workflows_shared_with_user = Session.query(
user.shared_workflows.subquery().c.id_
)
query = Session.query(Workflow).filter(
or_(
Workflow.owner_id == user.id_,
Workflow.id_.in_(workflows_shared_with_user),
)
)
Comment on lines +311 to +319
Copy link
Member

@mdonadoni mdonadoni Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify the case shared = True to query = user.workflows.union(user.shared_workflows), right? (not tested, so some modifications might be needed to make it work with SQLAlchemy)

if shared_by and not shared:
if shared_by == "anybody":
# retrieve unowned workflows shared by anyone
query = query.filter(Workflow.id_.in_(workflows_shared_with_user))
Comment on lines +321 to +323
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the case shared_by = anybody simply query = user.shared_workflows?

else:
# retrieve unowned workflows shared by specific user
shared_by_user = (
Session.query(User).filter(User.email == shared_by).first()
)
if not shared_by_user:
return (
jsonify(
{
"message": f"User with email '{shared_by}' does not exist."
}
),
404,
)
query = query.filter(Workflow.owner_id == shared_by_user.id_)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can use user.shared_workflows.filter(...), no need to use the query object of before (as that makes the logic quite complicated, given that this object is modified multiple times!)

if shared_with:
# starting point: retrieve owned workflows
query = user.workflows

first_shared_with = shared_with[0]
workflows_shared_by_user = Session.query(Workflow.id_).join(
UserWorkflow,
and_(
Workflow.id_ == UserWorkflow.workflow_id,
Workflow.owner_id == user.id_,
),
)

if first_shared_with == "nobody":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do nobody and anybody only make sense if they are provided on their own, without other emails? If so, we need to check that

# retrieve owned unshared workflows
query = query.filter(Workflow.id_.notin_(workflows_shared_by_user))
elif first_shared_with == "anybody":
# retrieve exclusively owned shared workflows
query = query.filter(Workflow.id_.in_(workflows_shared_by_user))
else:
# retrieve owned workflows shared with specific users
shared_with_users = Session.query(User.id_).filter(
User.email.in_(shared_with)
)
shared_with_workflows_ids = Session.query(
UserWorkflow.workflow_id
).filter(UserWorkflow.user_id.in_(shared_with_users))
query = query.filter(Workflow.id_.in_(shared_with_workflows_ids))
Comment on lines +305 to +366
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, we can consider shared, shared_with and shared_by as mutually exclusive, right?
If that's the case, then I would suggest changing all of this code to make it more clear, as now there are lots of different cases due to the many nested ifs.

In particular, we could first check if the user specified more than one of shared, shared_with and shared_by and if so return an error. Otherwise, we just have three cases, one for each flag. Something like this (not tested):

if shared is not None and shared_by is not None:
    return error
if shared is not None and shared_with is not None:
    return error
if shared_by is not None and shared_with is not None:
    return error

if shared;
    # ...
elif shared_by:
    # ...
elif shared_with: 
    # ...
else:
    # normal case

This should avoid most (if not all) nested ifs, making the logic a bit easier to understand.

Comment on lines +360 to +366
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not tested this, but I think this might work. Assuming that we want to find workflows that are shared with at least one of the users (as it is now in the code), we can simplify the case shared_with = x,y,z with something like:

user.workflows.filter(Workflow.shared_users.any(User.email.in_(shared_with)))

We should also check that the emails are valid, though (with another query).


if search:
search = json.loads(search)
search_val = search.get("name")[0]
Expand Down Expand Up @@ -295,7 +394,38 @@ def get_workflows(args, paginate=None): # noqa
elif sort in ["asc", "desc"]:
column_sorted = getattr(Workflow.created, sort)()
pagination_dict = paginate(query.order_by(column_sorted))

owner_ids = {workflow.owner_id for workflow in pagination_dict["items"]}
owners = dict(
Session.query(User.id_, User.email).filter(User.id_.in_(owner_ids)).all()
)

for workflow in pagination_dict["items"]:
owner_email = owners.get(workflow.owner_id, "-")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can avoid calling get, we can just use [...], as every workflow must have an owner


if owner_email == user.email:
owner_email = "-"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return the correct email also in this case. If we really don't want to return the user's own email, then let's return None and not "-", but I find this bizarre. What do you think?


if shared or shared_with:
shared_with_users = (
Session.query(User.email)
.join(
UserWorkflow,
and_(
User.id_ == UserWorkflow.user_id,
UserWorkflow.workflow_id == workflow.id_,
),
)
.all()
)
else:
shared_with_users = None
Comment on lines +409 to +422
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would simply check if the workflow is owned by the user, and if so you can find the list of emails with workflow.shared_users.with_entities(User.email).all() (or similar statements). Unless there is a good reason for only doing this when either shared or shared_with are provided?


if shared_with_users and owner_email == "-":
shared_with_emails = [user[0] for user in shared_with_users]
else:
shared_with_emails = "-"

workflow_response = {
"id": workflow.id_,
"name": get_workflow_name(workflow),
Expand All @@ -306,6 +436,10 @@ def get_workflows(args, paginate=None): # noqa
"progress": get_workflow_progress(
workflow, include_progress=include_progress
),
"owner_email": owner_email,
Copy link
Member

@mdonadoni mdonadoni Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we already have user in the response, maybe we can call this user_email?

"shared_with": ",".join(shared_with_emails)
if shared_with_emails != "-"
else "-",
Comment on lines +440 to +442
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to make this a string, we can return an array of strings here (and it will be deserialized correctly). If it's not shared with anybody, just return an empty list.

}
if type_ == "interactive" or verbose:
int_session = workflow.sessions.first()
Expand Down Expand Up @@ -1106,7 +1240,7 @@ def share_workflow(
"Message is too long. Please keep it under 5000 characters."
)

workflow = _get_workflow_with_uuid_or_name(workflow_id_or_name, str(user_id))
workflow = _get_workflow_with_uuid_or_name(workflow_id_or_name, user_id)

existing_share = (
Session.query(UserWorkflow)
Expand Down
Loading