⚠️ chore: remove task group from task arguments #1635
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Bug description
This PR aims at addressing some of the serialization issues/descrepencies between
datashare.asynctasks
andicij-worker
.More precisely,
datashare.asynctasks
systematically includes theGroup
inside theTask.args
.However inside
icij-worker
all task arguments are forwared to task functions, ending up in pythonTypeError
because of this extra arguments.datashare.asynctasks
currently serializes the group inside task arguments, which implies that task functions (at least in Python) must be able to have agroup
argument as input. However task functions don't actually need thegroup
argument to be performed. The Group is adatashare.asynctasks
/icij-worker
"internal". Task implementers shouldn't need to include thegroup
argument inside their task function to make them work.Proposed fix
I chose to align
datashare.asynctasks
behavior on theicij-worker
one. Basically, when we create a task for the first time, we can assign a group to this task, hence, the TM has knowledge of the task group. The TM can keep track of task group rather than adding it to the task arguments.Discarded fixes
Another solution I thought about was to fix this issue in Python by systematically removing the
user
andgroup
arguments from the args received from the broker.However I quickly discarded this options because it basically "reserves" the
group
argument name, and it's not available any longer as a task arg name.An alternative solution whould have been to explore the Python task arguments and filter out all misnamed arguments, but I don't really like this approach as it can lead to silent failures.
Changes
datashare-tasks
Changed
TaskManager
API to keep track of theGroup
. Rather than always callingsave
to create and then update tasks, we now usesaveMetadata
to persist a task and it group (and potentially other persistent metadata) for the first time, then we callpersisUpdate(Task<V> task)
to update the task itself, other persistent metadata remain untouched (basically once a task is saved with a group information, there's no reason to update it). This behavior is similar to theicij-worker
behavior, where the group information is not persisted in task themselve, be it's the TM's responsibility to keep track of it.