Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⚠️ chore: remove task group from task arguments #1635

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ClemDoum
Copy link
Contributor

@ClemDoum ClemDoum commented Jan 14, 2025

⚠️ might requires Redis migrations (I'm keen on learning how to do it)

Bug description

This PR aims at addressing some of the serialization issues/descrepencies between datashare.asynctasks and icij-worker.
More precisely, datashare.asynctasks systematically includes the Group inside the Task.args.

However inside icij-worker all task arguments are forwared to task functions, ending up in python TypeError because of this extra arguments.

datashare.asynctasks currently serializes the group inside task arguments, which implies that task functions (at least in Python) must be able to have a group argument as input. However task functions don't actually need the group argument to be performed. The Group is a datashare.asynctasks / icij-worker "internal". Task implementers shouldn't need to include the group argument inside their task function to make them work.

Proposed fix

I chose to align datashare.asynctasks behavior on the icij-worker one. Basically, when we create a task for the first time, we can assign a group to this task, hence, the TM has knowledge of the task group. The TM can keep track of task group rather than adding it to the task arguments.

Discarded fixes

Another solution I thought about was to fix this issue in Python by systematically removing the user and group arguments from the args received from the broker.

However I quickly discarded this options because it basically "reserves" the group argument name, and it's not available any longer as a task arg name.

An alternative solution whould have been to explore the Python task arguments and filter out all misnamed arguments, but I don't really like this approach as it can lead to silent failures.

Changes

⚠️ datashare-tasks

Changed

  • updated the TaskManager API to keep track of the Group. Rather than always calling save to create and then update tasks, we now use saveMetadata to persist a task and it group (and potentially other persistent metadata) for the first time, then we call persisUpdate(Task<V> task) to update the task itself, other persistent metadata remain untouched (basically once a task is saved with a group information, there's no reason to update it). This behavior is similar to the icij-worker behavior, where the group information is not persisted in task themselve, be it's the TM's responsibility to keep track of it.

@ClemDoum ClemDoum marked this pull request as ready for review January 14, 2025 16:54
@ClemDoum ClemDoum self-assigned this Jan 14, 2025
@ClemDoum ClemDoum changed the title chore: remove task group from task arguments ⚠️ chore: remove task group from task arguments Jan 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant